Update and Upgrade Best Practices

Software updates are cumulative, which means that updating a system always brings all installed software components to their latest available versions. The packages in the repositories can also be newer than those shipped with the current BCM release.

Before updating, read and evaluate the information and advisories from all relevant releases and later updates. Packages are managed in a software image, and the image is then deployed to the nodes.

Note

As a best practice, back up (clone) known good images before making any changes.

NVIDIA recommends completing all updates to a software image through the chroot environment on the BCM headnode before deploying the image to the nodes.

For step-by-step guidelines, refer to the ‘Managing A Package in A Software Image and Running It on Nodes’ section of the ‘Post-installation Software Management’ chapter of the BCM administrator manual. The chapter provides instructions for Ubuntu, which is used for DGX BasePOD and DGX SuperPOD systems.
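The backup-then-update workflow described above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the procedure from the BCM administrator manual: the image names, the /cm/images/<name> image location, and the cmsh clone syntax are assumptions to verify against the manual, and the helper function names are hypothetical. The script defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: clone a known good image, then update a package inside the image chroot.
# Assumptions (not from this document): the image names, the /cm/images/<name> path,
# and the cmsh clone syntax. Verify each against the BCM administrator manual.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them (requires root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clone an existing software image so a known good copy is kept.
# $1 = existing image name, $2 = name for the backup clone
backup_image() {
    run cmsh -c "softwareimage; clone $1 $2; commit"
}

# Refresh package metadata and upgrade one package inside the image chroot.
# $1 = image name, $2 = package name
update_image_package() {
    image_dir="/cm/images/$1"
    run chroot "$image_dir" apt update
    run chroot "$image_dir" apt install --only-upgrade -y "$2"
}
```

For example, `backup_image default-image default-image-backup` followed by `update_image_package default-image openssl` prints the three commands in order. Setting `DRY_RUN=0` would execute them on the headnode, which should only be done after the commands have been checked against the manual for the installed BCM version.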

Note

The apt update command does not update any packages. It only refreshes the package repository metadata. The apt install <package> command is what actually updates or upgrades a package.

This separation lets the user control which packages are updated. Performing minor updates on systems that use the apt package manager is normally a straightforward process. However, full system updates can become complex because of package dependencies, compatibility considerations, and system-specific configurations. It is therefore essential to understand these interdependencies to ensure a smooth update path.
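The difference between refreshing metadata and upgrading packages can be illustrated with a short sketch. The function name `upgrade_selected` and the dry-run switch are hypothetical helpers introduced for this example. The `--only-upgrade` option restricts `apt install` to packages that are already installed, so nothing new is pulled in by name.

```shell
#!/bin/sh
# Sketch only: refresh metadata, review candidates, then upgrade chosen packages.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them (requires root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upgrade only the packages named as arguments.
upgrade_selected() {
    run apt update              # refreshes repository metadata; changes no packages
    run apt list --upgradable   # shows which installed packages have newer versions
    for pkg in "$@"; do
        # Upgrades the package only if it is already installed.
        run apt install --only-upgrade -y "$pkg"
    done
}
```

Calling `upgrade_selected openssl curl` in dry-run mode prints the metadata refresh, the review step, and one install command per package, which makes the scope of the change visible before anything is modified.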

The suggested update path is as follows:

  1. Update BCM Headnodes

  2. Update DGX OS

  3. Update any additional packages that are part of the DGX OS image, such as MOFED/DOCA drivers

The following updates can be done at any time:

  • Workload Managers (WLM)

  • DGX Firmware

  • InfiniBand Components

  • NVIDIA Spectrum Switches