Software Update

  1. Clone the category in BCM

    cmsh
    category
    clone <dgx-gb200> <new-dgx-gb200>
    commit
    
  2. Clone the OS image

    cmsh
    softwareimage
    clone <dgx-image> <new-dgx-image>
    commit
    

    Set the new category to use the new image:

    cmsh
    category
    set <new-dgx-gb200> softwareimage <new-dgx-image>
    commit
    

    See also

    For additional details on cloning a Config in BCM, see Re-provision a New Image.
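
The interactive sessions in steps 1 and 2 can also be scripted. The following is a minimal sketch, assuming `cmsh -c` is given a semicolon-separated command string on the head node. The helper names `clone_cmd` and `set_image_cmd` and the object names passed to them are hypothetical. By default the script only prints the commands for review.

```shell
#!/bin/sh
# Hypothetical helpers that build the cmsh command strings used in steps 1 and 2.

# clone_cmd MODE SOURCE TARGET: clone an object in the given cmsh mode.
clone_cmd() {
    printf '%s; clone %s %s; commit' "$1" "$2" "$3"
}

# set_image_cmd CATEGORY IMAGE: point a category at a software image.
set_image_cmd() {
    printf 'category; set %s softwareimage %s; commit' "$1" "$2"
}

# Example names (assumptions); substitute the ones used on your cluster.
OLD_CAT=dgx-gb200
NEW_CAT=new-dgx-gb200
OLD_IMG=dgx-image
NEW_IMG=new-dgx-image

for cmd in \
    "$(clone_cmd category "$OLD_CAT" "$NEW_CAT")" \
    "$(clone_cmd softwareimage "$OLD_IMG" "$NEW_IMG")" \
    "$(set_image_cmd "$NEW_CAT" "$NEW_IMG")"
do
    if [ "${APPLY:-0}" = "1" ]; then
        cmsh -c "$cmd"            # run for real only when APPLY=1 is set
    else
        echo "cmsh -c \"$cmd\""   # dry run: print the command for review
    fi
done
```

Running each step as a separate `cmsh -c` call mirrors the three interactive sessions above, so a failure in one step is easy to attribute.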

  3. Enter the Image to make changes

    cm-chroot /cm/images/<new-dgx-image>/
    
  4. Create DOCA Repo based on Architecture

    x86_64:

    dd status=none of=/etc/apt/sources.list.d/doca.sources << EOF
    Types: deb
    URIs: https://linux.mellanox.com/public/repo/doca/baseos8-latest/ubuntu24.04/x86_64/
    Suites: /
    Signed-By: /usr/share/keyrings/GPG-KEY-Mellanox.gpg
    EOF
    

    arm64:

    dd status=none of=/etc/apt/sources.list.d/doca.sources << EOF
    Types: deb
    URIs: https://linux.mellanox.com/public/repo/doca/baseos8-latest/ubuntu24.04/arm64-sbsa/
    Suites: /
    Signed-By: /usr/share/keyrings/GPG-KEY-Mellanox.gpg
    EOF
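
The two variants above differ only in the last path component of the repository URI. As a sketch, the component could be derived from the machine architecture instead of being chosen by hand. The function name `doca_arch_dir` is hypothetical, and the mapping covers only the two architectures listed in this step. The script prints the URI and does not write the sources file.

```shell
#!/bin/sh
# doca_arch_dir MACHINE: map a machine name (as printed by `uname -m`) to the
# architecture directory used in the DOCA repository URI.
doca_arch_dir() {
    case "$1" in
        x86_64|amd64)  echo "x86_64" ;;
        aarch64|arm64) echo "arm64-sbsa" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Print the URI for the machine this script runs on.
if arch_dir=$(doca_arch_dir "$(uname -m)"); then
    echo "https://linux.mellanox.com/public/repo/doca/baseos8-latest/ubuntu24.04/${arch_dir}/"
fi
```

Note that `uname -m` reports the architecture of the machine running the chroot, so this only applies when the head node and the image share an architecture.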
    
  5. Install the latest DGX OS packages

    Before proceeding, check the software bill of materials (SBOM) to verify the expected component versions for your release.

    Note

    If updating to SW 1.3.6, update the DOCA GPG key before proceeding:

    sudo wget https://linux.mellanox.com/public/repo/doca/public_keys/nvidia-doca-debian-gpg-public-key.gpg -O /usr/share/keyrings/nvidia-doca-debian-gpg-public-key.gpg
    sudo sed -i 's/GPG-KEY-Mellanox.gpg/nvidia-doca-debian-gpg-public-key.gpg/g' /etc/apt/sources.list.d/doca-bos8-latest.sources
    sudo apt update
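
`apt update` cannot verify the repository if the keyring named in the `Signed-By` field is missing or empty. The following sketch checks this before the index is refreshed. The function names `signed_by_path` and `check_keyring` are hypothetical, and the sketch assumes a deb822 `.sources` file with a single `Signed-By` line that holds a file path.

```shell
#!/bin/sh
# signed_by_path FILE: print the keyring path from the Signed-By field
# of a deb822-style .sources file.
signed_by_path() {
    awk '$1 == "Signed-By:" { print $2; exit }' "$1"
}

# check_keyring FILE: succeed only if the referenced keyring exists
# and is not empty.
check_keyring() {
    key=$(signed_by_path "$1")
    [ -n "$key" ] && [ -s "$key" ]
}

# Usage (path is a placeholder):
#   check_keyring <path-to-doca-sources-file> && apt update
```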
    
    # Update the package index
    apt update
    
    # Review packages to be upgraded
    apt full-upgrade -s
    
    # Upgrade to the latest version
    apt full-upgrade
    
    # Re-run the DKMS build against the newly installed kernel
    sudo dkms autoinstall --force -k <newly-installed-kernel>
    
    # Re-configure broken packages
    sudo apt -f install -y
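
Inside the chroot, `uname -r` reports the kernel running on the host, not the kernel just installed into the image, so the version passed to `dkms -k` has to be found another way. One possible approach, sketched below, is to take the highest version directory under `/lib/modules`. The function name `latest_kernel` is hypothetical, and the sketch assumes the newly installed kernel is the highest version present and that GNU `sort -V` is available.

```shell
#!/bin/sh
# latest_kernel [DIR]: print the highest kernel version that has a module
# directory under DIR (default: /lib/modules), using version-aware sorting.
latest_kernel() {
    modules_dir=${1:-/lib/modules}
    ls -1 "$modules_dir" | sort -V | tail -n 1
}

# Usage inside the image chroot:
#   dkms autoinstall --force -k "$(latest_kernel)"
```

Check the printed version against the output of `apt full-upgrade` before using it, since an older kernel with a higher version string would be picked instead.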
    

    Note

    This does not update the BCM Kernel in use.

    See also

    To update the BCM kernel, see Updating the Kernel Version.

    Install the MFT, DOCA, and NVIDIA driver packages:

    These steps are required only if you need to install a specific driver version. You must use a driver provided in the firmware tarball for your release. Replace the .deb filename in the commands below with the one from the tarball that corresponds to the driver version you are installing.

    # Make sure the external repo is pointed to for DOCA packages
    cat /etc/apt/sources.list.d/doca.sources
    
    # Expected output:
    # Types: deb
    # URIs: https://linux.mellanox.com/public/repo/doca/DGX_GBxx_latest_DOCA/ubuntu24.04/arm64-sbsa/
    # Suites: /
    # Signed-By: /usr/share/keyrings/GPG-KEY-Mellanox.gpg
    
    # Install DOCA package
    sudo apt-get update
    sudo apt install doca-all
    
    # Install driver package
    sudo dpkg -i nvidia-driver-local-repo-ubuntu2404-570.158.01_1.0-1_arm64.deb
    sudo cp /var/nvidia-driver-local-repo-ubuntu2404-570.158.01/nvidia-driver-local-5778B6CA-keyring.gpg /usr/share/keyrings/
    sudo mv /etc/apt/sources.list.d/cuda-compute-repo.sources /etc/apt/sources.list.d/cuda-compute-repo.sources.disabled
    sudo apt update
    sudo apt install nvidia-driver-570-open
    sudo apt-get install nvidia-imex-570
    sudo apt-get install nvidia-fabricmanager-570
    sudo apt-get install libnvidia-nscq-570
    

    Verify installations:

    # Check DOCA packages
    sudo dpkg -l | grep <Expected DOCA Ver>
    
    # Check driver package
    sudo dpkg -l | grep <Expected Driver ver>
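
The `grep` checks above show packages that match the expected version but do not flag packages left at another version. As a sketch, the filter below reads `package<TAB>version` lines, which is the default output of `dpkg-query -W`, and lists every package whose version does not start with the expected string. The function name `version_mismatches` and the package pattern in the usage comment are hypothetical.

```shell
#!/bin/sh
# version_mismatches EXPECTED: read "package<TAB>version" lines on stdin and
# print each package whose version does not begin with EXPECTED.
# Exits with status 1 if any mismatch was printed, 0 otherwise.
version_mismatches() {
    awk -F '\t' -v want="$1" '
        index($2, want) != 1 { print $1 " " $2; bad = 1 }
        END { exit bad + 0 }
    '
}

# Usage (pattern and version are placeholders):
#   dpkg-query -W 'nvidia-driver-*' | version_mismatches <expected-driver-version>
```

An empty result with exit status 0 means every listed package is at the expected version.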
    

  6. Exit the chroot to save the changes into the image

    exit
    
  7. Assign the compute nodes to the new DGX category

    cmsh
    device
    foreach -n dgx-nodes[XX-XX] (set category <new-dgx-gb200>)
    commit
    
  8. Reboot compute nodes

    reboot -c <new-dgx-gb200>
    
  9. Verify all components have been upgraded