Kubernetes Installation

NVIDIA Mission Control (NMC) uses a dual-cluster Kubernetes architecture to separate administrative services from user workloads.

Administrative Cluster (k8s-admin)

The k8s-admin control plane is created during initial deployment. This cluster is the primary management layer. It hosts core infrastructure services and operational components, including:

  • Autonomous Job Recovery

  • Autonomous Hardware Recovery

  • Domain Power Service

  • Observability stack

User Cluster (k8s-user)

The k8s-user control plane is provisioned later in the installation lifecycle. This cluster is deployed during the Run:ai integration procedure and is dedicated to managing end-user GPU resources and AI training workloads.
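The split between the two clusters can be summarized as a small lookup table. The following is only an illustrative sketch in Python, not part of NMC: the `Cluster` class, its field names, and the `cluster_for` helper are hypothetical, and the values are copied from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record describing one NMC Kubernetes cluster."""
    name: str
    role: str
    provisioned_during: str
    hosts: tuple


# Values are taken from the text above; the structure itself is an assumption.
CLUSTERS = (
    Cluster(
        name="k8s-admin",
        role="administrative services",
        provisioned_during="initial deployment",
        hosts=(
            "Autonomous Job Recovery",
            "Autonomous Hardware Recovery",
            "Domain Power Service",
            "Observability stack",
        ),
    ),
    Cluster(
        name="k8s-user",
        role="user workloads",
        provisioned_during="Run:ai integration procedure",
        hosts=("end-user GPU resources", "AI training workloads"),
    ),
)


def cluster_for(component: str) -> str:
    """Return the name of the cluster that hosts the given component."""
    for cluster in CLUSTERS:
        if component in cluster.hosts:
            return cluster.name
    raise KeyError(f"unknown component: {component}")


print(cluster_for("Domain Power Service"))   # prints "k8s-admin"
print(cluster_for("AI training workloads"))  # prints "k8s-user"
```

A table like this makes the separation explicit: anything operational belongs to k8s-admin, while anything tied to end-user GPU work belongs to k8s-user.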

To deploy the Administrative Cluster in an air-gapped environment, first complete the preparations described in the following reference documentation: https://docs.nvidia.com/base-command-manager/manuals/11/containerization-manual.pdf#containerizationmanual-kubernetesairgapinstallation.airgapped-bcm-kubernetes-installation