Scope
This document lists the recommended software package updates for DGX BasePOD and SuperPOD and the steps to apply them. It does not cover a cluster upgrade (that is, a BCM major version change).
This document (June 2025 version) provides update recommendations for DGX H100/H200 BasePOD and SuperPOD customers using BCM version 10. The document outlines a minor update process, specifically for environments running BCM version 10.23.12, and describes the steps required to safely update to BCM version 10.25.03.
DGX SuperPOD customers: before you update any installed software stack, always consult the DGX SuperPOD release notes for the latest information about available components and their versions.
Note
Storage solutions and storage recommendations are not in scope for this document. Air-gapped deployments are also not in scope.
The table below shows the update path that was followed when this document was created. Before you update any installed software stack, always consult the respective documentation for the latest information about available components.
| Component | Old Version | Updated Version | Update Source |
|---|---|---|---|
| BCM CMD | 10.23.12 | 10.25.03 | Customer Portal ISO BCM/Ubuntu Repo |
| BCM OS | Ubuntu 22.04.2 LTS | Ubuntu 22.04.5 LTS | Customer Portal ISO BCM/Ubuntu Repo |
| DGX Release | 6.1.0 | 6.3.2 | BCM/DGX/CUDA/Ubuntu Repository |
| DGX OS | Ubuntu 22.04.2 LTS | Ubuntu 22.04.5 LTS | BCM/DGX/CUDA/Ubuntu Repository |
| DGX Kernel | 5.15.0-1042-nvidia | 5.15.0-1078-nvidia | Ubuntu Repository |
| DGX GPU Driver | 535.129.03 | 550.163.01 | Ubuntu Repository |
| DGX Firmware | 1.1.1 | 24.09.1 | Enterprise Support |
| Enroot | 3.4.1 | 3.5.0 | BCM Repository |
| CUDA toolkit | 12.2 | 12.4.1 | CUDA Repository |
| DCGM | 3.1.8 | 4.2.3 | CUDA Repository |
| Cumulus OS (TOR/IPMI) | 5.5.0 | 5.11.0 | CUMULUS Repository |
| Slurm | 23.02.6 | 23.02.8 | BCM Repository |
| Mellanox OFED / DOCA | MLNX_OFED_LINUX-23.10-0.5.5.0 (OFED-internal-24.10-3.2.5) | MLNX_OFED_LINUX-23.10-4.0.9.1 DOCA_OFED 2.9.3 | Mellanox/DOCA Page |
| UFM Appliance | v1.8.2.1 | v1.10.1 | NVIDIA Licensing |
| IB Switch | v3.11.3002 | v3.12.2002 | NVIDIA Licensing |
| Kubernetes | 1.28 | 1.31 | Kubernetes Repo |
| Helm Based Operators | GPU Operator: 24.9.0; Network Operator: 24.4.0; Metallb: 0.14.9; Prometheus Stack: 35.6.0; Kube State Metrics: 5.31.0 | GPU Operator: 24.9.1; Network Operator: 24.7.0; Metallb: 0.14.9; Prometheus Stack: 72.9.1; Kube State Metrics: 5.36.0 | Helm Repositories |
| Kubernetes Components | Kubeflow MPI: v1.5.0; MPI Operator: 0.4.0; Calico: v3.29.2 | Kubeflow MPI: v1.7.0; MPI Operator: 0.6.0; Calico: v3.29.4 | GitHub/Docker Repository |
| Run:ai | Control Plane: 2.19.58; Cluster: 2.19.58 | Control Plane: 2.21.25; Cluster: 2.21.25 | Run:ai Repository |
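When planning an update, it can help to see where a node's installed versions sit relative to the old and updated versions in the table above. The following is a minimal sketch of such a comparison, not part of any BCM or DGX tooling. The names `UPDATE_PATH`, `version_key`, and `classify` are illustrative, only a subset of the table is included, and the installed versions in the usage section are hypothetical values that you would replace with what your own nodes report.

```python
import re

# Old and updated versions copied from a subset of the table above.
UPDATE_PATH = {
    "BCM CMD": ("10.23.12", "10.25.03"),
    "DGX Kernel": ("5.15.0-1042-nvidia", "5.15.0-1078-nvidia"),
    "DGX GPU Driver": ("535.129.03", "550.163.01"),
    "DCGM": ("3.1.8", "4.2.3"),
    "Slurm": ("23.02.6", "23.02.8"),
    "Enroot": ("3.4.1", "3.5.0"),
}


def version_key(version):
    """Return the numeric fields of a version string as a tuple of ints.

    Non-numeric parts such as "-nvidia" or a leading "v" are ignored, so
    "5.15.0-1042-nvidia" becomes (5, 15, 0, 1042).
    """
    return tuple(int(field) for field in re.findall(r"\d+", version))


def classify(component, installed):
    """Describe where an installed version sits on the documented update path."""
    old, new = UPDATE_PATH[component]
    key = version_key(installed)
    if key < version_key(old):
        return "older than documented baseline"
    if key == version_key(old):
        return "at documented baseline"
    if key < version_key(new):
        return "between baseline and target"
    if key == version_key(new):
        return "at documented target"
    return "newer than documented target"


if __name__ == "__main__":
    # Hypothetical installed versions, for illustration only.
    installed_versions = {
        "DGX Kernel": "5.15.0-1042-nvidia",
        "DGX GPU Driver": "550.163.01",
        "Slurm": "23.02.7",
    }
    for component, installed in installed_versions.items():
        print(f"{component}: {installed} is {classify(component, installed)}")
```

A component reported as older than the documented baseline or newer than the documented target is outside the path covered here, so check the respective release notes before proceeding.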