Base Command Manager 10

About this Document

This document provides important release-specific considerations for Base Command Manager 10 and Base Command Manager Essentials 10.

Introduction to Base Command Manager

NVIDIA Base Command Manager provides cluster management software for streamlining cluster provisioning, workload management, and infrastructure monitoring. It provides all the tools for deploying and managing an AI data center.

Version 10 is a major release that merges the Bright Cluster Manager software into Base Command Manager. It integrates the same functionality provided by Bright Cluster Manager 9.2 (except where stated explicitly).

Base Command Manager is now also included in NVIDIA AI Enterprise. NVIDIA Base Command Manager Essentials comprises the features of NVIDIA Base Command Manager that are certified for use with NVIDIA AI Enterprise.

Note

Base Command Manager 10 is licensed on a per-GPU basis. This differs from the node-based licensing model of Bright Cluster Manager. Customers with active support subscriptions who use Bright Cluster Manager 9.2 or earlier can upgrade to Base Command Manager 10 by exchanging their current licenses for GPU-based Base Command Manager 10 licenses at no cost.

Contact sw-bright-sales-ops@NVIDIA.onmicrosoft.com for more information about licensing.

For additional documentation and information about Base Command Manager, refer to

Base Command Manager 10 Features

The following are the key features of Base Command Manager 10:

Per-GPU licensing model

  • Base Command Manager 10 has adopted a GPU-based licensing model.

  • The GPU-based licensing model is consistent with other NVIDIA products, such as NVIDIA AI Enterprise.
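
To illustrate the difference between the two counting models, the following is a minimal sketch in Python. The `Node` class, the host names, and the GPU counts are hypothetical, and the sketch assumes one license unit per node or per GPU. It does not cover how nodes without GPUs are treated, so the licensing contact remains the authority for real entitlements.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str       # hypothetical host name
    gpu_count: int  # number of GPUs installed in this node


def node_based_units(nodes):
    """Count one unit per node (assumed node-based model)."""
    return len(nodes)


def gpu_based_units(nodes):
    """Count one unit per GPU (assumed per-GPU model)."""
    return sum(n.gpu_count for n in nodes)


# Hypothetical inventory: two 8-GPU nodes and one 4-GPU node.
inventory = [Node("dgx01", 8), Node("dgx02", 8), Node("gpu03", 4)]

print("node-based units:", node_based_units(inventory))  # 3
print("GPU-based units:", gpu_based_units(inventory))    # 20
```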

Improvements for deploying and managing Kubernetes

  • The Kubernetes installation wizard now uses the standard upstream kubeadm tool for Kubernetes deployments.

  • The Kubernetes deployment now also uses the default upstream packages from https://kubernetes.io instead of repackaged versions.

  • Upgrading Kubernetes now follows the standard upgrade procedures, which are described in "Upgrading kubeadm clusters" in the upstream Kubernetes documentation.
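
As a rough orientation, the upstream kubeadm upgrade flow can be sketched as an ordered list of commands. The Python helper below only builds that list and does not execute anything. The function name, the node names, and the version string are hypothetical, and the list is abridged: the distribution-specific package upgrades for kubeadm, kubelet, and kubectl, and the kubelet restart, are left out. The upstream documentation remains the reference for the full procedure.

```python
def kubeadm_upgrade_steps(target_version, first_control_plane, other_nodes):
    """Return an abridged, ordered list of upgrade commands (not executed).

    A "[node]" prefix marks a command that is run on that node itself.
    """
    steps = [
        # First control-plane node: review the plan, then apply the upgrade.
        f"[{first_control_plane}] kubeadm upgrade plan",
        f"[{first_control_plane}] kubeadm upgrade apply {target_version}",
    ]
    for node in other_nodes:
        steps += [
            f"kubectl drain {node} --ignore-daemonsets",  # evict workloads
            f"[{node}] kubeadm upgrade node",             # upgrade node config
            f"kubectl uncordon {node}",                   # make schedulable again
        ]
    return steps


# Hypothetical cluster: one control-plane node and two workers.
for step in kubeadm_upgrade_steps("v1.27.4", "cp01", ["node001", "node002"]):
    print(step)
```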

New Cluster API (CAPI) support

  • The Cluster API (CAPI) provides declarative APIs and tooling for simplifying provisioning and managing multiple Kubernetes clusters.

  • The new cm-kubernetes-capi-setup wizard helps administrators to configure and manage CAPI clusters.

  • The CAPI implementation is a derivative of the Bring Your Own Host (BYOH) CAPI provider.
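
To show what "declarative" means in this context, the sketch below builds a Cluster API `Cluster` object as a plain Python dictionary and prints it as JSON. The API groups and kinds (`KubeadmControlPlane`, `ByoCluster`) are taken from the upstream Cluster API project and the upstream BYOH provider. Whether the Base Command Manager derivative uses the same kinds is an assumption, and the cluster name and pod CIDR are hypothetical.

```python
import json


def capi_cluster_manifest(name, pod_cidr):
    """Build an upstream-style Cluster API `Cluster` object as a dictionary."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [pod_cidr]}},
            # Reference to the object that describes the control plane.
            "controlPlaneRef": {
                "apiVersion": "controlplane.cluster.x-k8s.io/v1beta1",
                "kind": "KubeadmControlPlane",
                "name": f"{name}-control-plane",
            },
            # Reference to the infrastructure object (upstream BYOH kind;
            # assumed here, may differ in the derivative provider).
            "infrastructureRef": {
                "apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
                "kind": "ByoCluster",
                "name": name,
            },
        },
    }


manifest = capi_cluster_manifest("demo", "192.168.0.0/16")
print(json.dumps(manifest, indent=2))
```

The administrator describes the desired cluster in such an object, and the CAPI controllers work to make the actual state match it. In practice, the cm-kubernetes-capi-setup wizard is the supported entry point rather than hand-written manifests.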

Cluster On-Demand (COD) Improvements

  • Base Command Manager now includes support for Oracle Cloud Infrastructure (OCI) with the cm-cod-oci wizard.

  • Improvements for Amazon Web Services (AWS) include the creation of clusters that span multiple regions and support for the AWS FSx file system on Ubuntu.

  • Cluster On-Demand can now automatically detect memory and GPU instances for cloud nodes.

SLURM improvements

  • Base Command Manager 10 includes data and cache sharing options for Pyxis and Enroot.

  • The gres.conf configuration file for SLURM is now generated automatically, based on MIG autodetection.
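
The idea of deriving gres.conf entries from detected devices can be sketched as follows. This is not the generator that Base Command Manager uses. The function name, the node names, and the GPU type strings are hypothetical, and the output is simplified to `Count=` entries. A production gres.conf typically also maps each entry to its device files.

```python
from collections import Counter


def render_gres_conf(detected):
    """Render simplified gres.conf lines from (node, gpu_type) pairs."""
    counts = Counter(detected)
    lines = []
    # Sort so that the output is stable across runs.
    for (node, gpu_type), count in sorted(counts.items()):
        lines.append(
            f"NodeName={node} Name=gpu Type={gpu_type} Count={count}"
        )
    return "\n".join(lines)


# Hypothetical detection result: node001 exposes MIG instances,
# node002 exposes one full GPU.
detected = [
    ("node001", "1g.5gb"),
    ("node001", "1g.5gb"),
    ("node001", "3g.20gb"),
    ("node002", "a100"),
]
print(render_gres_conf(detected))
```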

NVIDIA Spectrum Switch Support

  • Base Command Manager can now deploy and manage Cumulus Linux on NVIDIA Spectrum Switches.

  • Cumulus Linux supports Zero-touch Provisioning and provides a robust API through the “NVIDIA User Experience (NVUE)”.

  • Base Command Manager deploys the cm-litedaemon, a version of CMDaemon for Cumulus Linux that enables switch management and monitoring.

NVIDIA BlueField DPU Support

  • Base Command Manager can now manage the firmware, network configuration, health checks, and monitoring of BlueField-2 and BlueField-3 DPUs.

NVIDIA DGX A100 and H100 support

  • Base Command Manager includes system images for DGX A100 and DGX H100, which are based on DGX OS 6.

  • The system image includes all the necessary software for providing an optimized software stack for AI applications.

NVIDIA SuperPOD and NVIDIA BasePOD Support

  • Base Command Manager 10 makes it easy to deploy on DGX SuperPOD and DGX BasePOD clusters.

  • Two additional tools, cm-pod-setup and bcm-netautogen, assist administrators in configuring a SuperPOD cluster and managing network configurations.

Base Command Manager is now included in the NVIDIA AI Enterprise software platform

  • Base Command Manager 10 is now included in NVIDIA AI Enterprise.

  • NVIDIA AI Enterprise is an end-to-end, cloud-native software platform for streamlining development and deployment of production-grade AI applications.

  • The list of supported and recommended components and versions can be found in the Base Command Manager Feature Matrix.

Improvements to the Jupyter Kubernetes Operators Manager for Jupyter Notebooks

  • The Jupyter Kubernetes Operator for Jupyter Notebooks includes many improvements. The operator can now also:

    • Manage Kubernetes Pods,

    • Create and delete PostgreSQL instances, which can be attached to running notebooks,

    • Run Apache Spark jobs for large-scale data analytics applications,

    • Create and manage Persistent Volume Claims (PVCs), which can be attached to running notebooks,

    • Support moving data between different PVCs, and

    • Support multi-factor authentication (MFA).
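
For readers unfamiliar with Persistent Volume Claims, the sketch below builds a generic Kubernetes `PersistentVolumeClaim` object as a Python dictionary. It uses only the standard Kubernetes `v1` schema and says nothing about how the Jupyter operator creates claims internally. The claim name and size are hypothetical.

```python
import json


def pvc_manifest(name, size, access_mode="ReadWriteOnce"):
    """Build a standard Kubernetes PersistentVolumeClaim as a dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            # Requested capacity, for example "10Gi".
            "resources": {"requests": {"storage": size}},
        },
    }


claim = pvc_manifest("notebook-data", "10Gi")
print(json.dumps(claim, indent=2))
```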