Kubernetes Installation

NVIDIA Mission Control uses a dual-cluster Kubernetes architecture to separate administrative services from user workloads.

Administrative Cluster (k8s-admin)

The k8s-admin control plane is created during initial deployment. As the primary management layer, this cluster hosts core infrastructure services and operational components, including:

  • Autonomous Job Recovery
  • Autonomous Hardware Recovery
  • Domain Power Service
  • Observability stack

User Cluster (k8s-user)

The k8s-user control plane is provisioned later in the installation lifecycle. It is deployed when you run the Run:ai integration procedure and is dedicated to managing end-user GPU resources and AI training workloads.
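The split between the two clusters can be summarized as a small lookup model. The following Python sketch is illustrative only: the `Cluster` class, the `CLUSTERS` tuple, and the `cluster_for` helper are hypothetical names invented for this example and are not part of NVIDIA Mission Control, BCM, or Run:ai. The cluster names, provisioning stages, and hosted components are taken from the descriptions above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record describing one cluster in the dual-cluster layout."""
    name: str
    role: str
    provisioned_during: str
    hosts: Tuple[str, ...]


# Administrative cluster: created during initial deployment.
K8S_ADMIN = Cluster(
    name="k8s-admin",
    role="administrative services",
    provisioned_during="initial deployment",
    hosts=(
        "Autonomous Job Recovery",
        "Autonomous Hardware Recovery",
        "Domain Power Service",
        "Observability stack",
    ),
)

# User cluster: deployed later, during the Run:ai integration procedure.
K8S_USER = Cluster(
    name="k8s-user",
    role="user workloads",
    provisioned_during="Run:ai integration procedure",
    hosts=(
        "End-user GPU resources",
        "AI training workloads",
    ),
)

CLUSTERS = (K8S_ADMIN, K8S_USER)


def cluster_for(component: str) -> str:
    """Return the name of the cluster that hosts the given component."""
    for cluster in CLUSTERS:
        if component in cluster.hosts:
            return cluster.name
    raise KeyError(f"No cluster hosts component: {component!r}")


if __name__ == "__main__":
    for cluster in CLUSTERS:
        print(f"{cluster.name}: {cluster.role} (provisioned during {cluster.provisioned_during})")
```

For example, `cluster_for("Domain Power Service")` resolves to the administrative cluster, while `cluster_for("AI training workloads")` resolves to the user cluster. A component that appears in neither list raises `KeyError`.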

To deploy the Administrative Cluster in an air-gapped environment, refer to Air-Gapped BCM Kubernetes Installation in the BCM 11 Containerization Manual.