Base Command Manager 10
About this Document
This document provides important release-specific considerations for Base Command Manager 10 and Base Command Manager Essentials 10.
Introduction to Base Command Manager
NVIDIA Base Command Manager provides cluster management software for streamlining cluster provisioning, workload management, and infrastructure monitoring. It provides all the tools for deploying and managing an AI data center.
Version 10 is a major release that merges the Bright Cluster Manager software into Base Command Manager. It integrates the same functionality provided by Bright Cluster Manager 9.2 (except where stated explicitly).
Base Command Manager is now also included in NVIDIA AI Enterprise. NVIDIA Base Command Manager Essentials comprises the features of NVIDIA Base Command Manager that are certified for use with NVIDIA AI Enterprise.
Note
Base Command Manager 10 is licensed on a per-GPU basis. This differs from the node-based licensing model of Bright Cluster Manager. Customers with active support subscriptions using Bright Cluster Manager 9.2 and earlier can upgrade to Base Command Manager 10 by exchanging their current licenses for GPU-based Base Command Manager 10 licenses at no cost.
Contact sw-bright-sales-ops@NVIDIA.onmicrosoft.com for more information about licensing.
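To illustrate how the two licensing models count differently, here is a minimal sketch. The inventory, the node names, and the GPU counts are hypothetical, and the rule of one license per GPU is an assumption made for illustration only. Actual terms, such as how nodes without GPUs are treated, should be confirmed with the licensing contact.

```python
# Hypothetical inventory: node name -> number of GPUs installed in that node.
inventory = {"dgx-01": 8, "dgx-02": 8, "cpu-01": 0}


def node_based_count(nodes):
    """Node-based model (assumed): one license per managed node."""
    return len(nodes)


def gpu_based_count(nodes):
    """GPU-based model (assumed): one license per GPU across all nodes."""
    return sum(nodes.values())


print("node-based licenses:", node_based_count(inventory))
print("GPU-based licenses:", gpu_based_count(inventory))
```

The two counts can diverge considerably on GPU-dense systems, so it is worth recomputing the license requirement before exchanging licenses.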
For additional documentation and information about Base Command Manager, refer to
Base Command Manager 10 Features
The following are the key features of Base Command Manager 10:
Per-GPU licensing model
Base Command Manager 10 has adopted a GPU-based licensing model.
The GPU-based licensing model is consistent with other NVIDIA products, such as NVIDIA AI Enterprise.
Improvements for deploying and managing Kubernetes
The Kubernetes installation wizard now uses the standard upstream kubeadm tool for Kubernetes deployments.
The Kubernetes deployment now also uses the default upstream packages from https://kubernetes.io instead of repackaged versions.
Upgrading Kubernetes now follows the standard upgrade procedure, which is described in “Upgrading kubeadm clusters” in the Kubernetes documentation.
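The upstream kubeadm upgrade procedure follows a fixed order: plan and apply on the first control plane node, then drain, upgrade, and uncordon each remaining node. The sketch below only builds that ordered list of commands and does not execute anything. The node names and target version are hypothetical, the helper function is illustrative and not part of Base Command Manager, and the distribution-specific steps (upgrading the kubeadm, kubelet, and kubectl packages and restarting the kubelet) are deliberately left out.

```python
def kubeadm_upgrade_steps(target_version, first_control_plane, other_nodes):
    """Return (where, command) pairs in the order the upstream procedure uses.

    'where' is the host on which the command is run; "admin" stands for any
    host with kubectl access to the cluster (an assumption of this sketch).
    """
    steps = [
        # Check which versions the cluster can be upgraded to.
        (first_control_plane, "kubeadm upgrade plan"),
        # Upgrade the control plane on the first control plane node.
        (first_control_plane, f"kubeadm upgrade apply {target_version}"),
    ]
    for node in other_nodes:
        # Evict workloads, upgrade the node configuration, re-enable scheduling.
        steps.append(("admin", f"kubectl drain {node} --ignore-daemonsets"))
        steps.append((node, "kubeadm upgrade node"))
        steps.append(("admin", f"kubectl uncordon {node}"))
    return steps


# Hypothetical cluster layout and version, for illustration only.
steps = kubeadm_upgrade_steps("v1.29.0", "head-01", ["node-01", "node-02"])
for where, command in steps:
    print(f"[{where}] {command}")
```

Upgrading one node at a time keeps the rest of the cluster available for workloads while each node is drained.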
New Cluster API (CAPI) support
The Cluster API (CAPI) provides declarative APIs and tooling that simplify the provisioning and management of multiple Kubernetes clusters.
The new cm-kubernetes-capi-setup wizard helps administrators configure and manage CAPI clusters.
The CAPI implementation is a derivative of the Bring Your Own Host (BYOH) CAPI provider.
Cluster On-Demand (COD) Improvements
Base Command Manager now includes support for Oracle Cloud Infrastructure (OCI) with the cm-cod-oci wizard.
Improvements for Amazon Web Services (AWS) now include the creation of clusters that span multiple regions and support for the AWS FSx file system on Ubuntu.
Cluster On-Demand can now automatically detect memory and GPU instances for cloud nodes.
SLURM improvements
NVIDIA Spectrum Switch Support
Base Command Manager can now deploy and manage Cumulus Linux on NVIDIA Spectrum Switches.
Cumulus Linux supports Zero-touch Provisioning and provides a robust API through the “NVIDIA User Experience (NVUE)”.
Base Command Manager deploys cm-litedaemon, a version of CMDaemon for Cumulus Linux that enables switch management and monitoring.
NVIDIA BlueField DPU Support
Base Command Manager can now manage the firmware, network configurations, and health checks and monitoring of BlueField-2 and BlueField-3 DPUs.
NVIDIA DGX A100 and H100 support
Base Command Manager includes system images for DGX A100 and DGX H100, which are based on DGX OS 6.
The system image includes all the necessary software for providing an optimized software stack for AI applications.
NVIDIA SuperPOD and NVIDIA BasePOD Support
Base Command Manager 10 makes it easy to deploy on DGX SuperPOD and DGX BasePOD clusters.
Two additional tools, cm-pod-setup and bcm-netautogen, assist administrators in configuring a SuperPOD cluster and managing network configurations.
Base Command Manager is now included in the NVIDIA AI Enterprise software platform
Base Command Manager 10 is now included in NVIDIA AI Enterprise.
NVIDIA AI Enterprise is an end-to-end, cloud-native software platform for streamlining development and deployment of production-grade AI applications.
The list of supported and recommended components and versions can be found in the Base Command Manager Feature Matrix.
Improvements to the Jupyter Kubernetes Operators Manager for Jupyter Notebooks
The Jupyter Kubernetes Operator for Jupyter Notebooks includes many improvements. The operator can now also:
Manage Kubernetes Pods,
Create and delete PostgreSQL instances, which can be attached to running notebooks,
Run Apache Spark jobs for large-scale data analytics applications,
Create and manage Persistent Volume Claims (PVCs), which can be attached to running notebooks,
Support moving data between different PVCs, and
Support multi-factor authentication (MFA).
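For readers unfamiliar with Persistent Volume Claims, the following is a minimal sketch of a generic PVC manifest, built as a Python dictionary and printed as JSON. It uses only the standard Kubernetes PersistentVolumeClaim fields and does not represent the operator's own interface, which these release notes do not describe. The claim name, namespace, and size are hypothetical.

```python
import json


def make_pvc(name, namespace, size, access_mode="ReadWriteOnce"):
    """Build a standard Kubernetes PersistentVolumeClaim manifest as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # How the volume may be mounted, e.g. read-write by a single node.
            "accessModes": [access_mode],
            # Amount of storage requested from the cluster.
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical values, for illustration only.
pvc = make_pvc("notebook-data", "jupyter", "10Gi")
print(json.dumps(pvc, indent=2))
```

A manifest of this shape is what a notebook would ultimately mount as a volume, regardless of which tool creates it.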