Updating the ConnectX-7 Firmware

After replacing or installing the ConnectX-7 cards, make sure the firmware on the cards is up to date.

Refer to the NVIDIA DGX H100 Firmware Update Guide to find the most recent firmware version.
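To compare the version listed in the guide with what is currently installed, the sketch below prints each mlx5 device with its PCI address and running firmware version. It is a minimal illustration, not an NVIDIA tool: the function name `list_cx7_firmware` and its optional directory argument are hypothetical, and it assumes the standard Linux sysfs layout in which each `/sys/class/infiniband/mlx5_*` entry contains a `fw_ver` file and a `device` link to its PCI device.

```shell
# list_cx7_firmware: hypothetical helper (not part of mstflint or DGX OS).
# Prints "<device> <PCI address> <firmware version>" for each mlx5 device.
# The optional argument overrides the sysfs directory, mainly for testing.
list_cx7_firmware() {
    base="${1:-/sys/class/infiniband}"
    for dev in "$base"/mlx5_*; do
        # An unmatched glob stays literal, so skip entries without fw_ver.
        [ -r "$dev/fw_ver" ] || continue
        # The "device" link resolves to the PCI device directory,
        # whose name is the PCI address (for example 0000:5e:00.0).
        pci=$(basename "$(readlink -f "$dev/device")")
        printf '%s %s %s\n' "$(basename "$dev")" "$pci" "$(cat "$dev/fw_ver")"
    done
}
```

Calling `list_cx7_firmware` with no arguments reads from `/sys/class/infiniband`. The PCI addresses it prints can be matched against the addresses used in the update commands.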

  1. Download the firmware from https://network.nvidia.com/support/firmware/connectx7ib/.

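    Download the firmware for both OPN (ordering part number) options used in this procedure: MCX750500B-0D00 for the cluster cards and MCX755206AS-NEA for the storage cards.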

  2. Transfer the firmware ZIP file to the DGX system and extract the archive.

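  3. Update the firmware on the cards that are used for cluster communication. The image file names in steps 3 and 4 are for firmware 28.39.1002; if you downloaded a different version, substitute its file names: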

    sudo mstflint -d /sys/bus/pci/devices/0000:5e:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:dc:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:c0:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:18:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:40:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:4f:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:ce:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:9a:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX750500B-0D00_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    
  4. Update the firmware on the cards that are used for storage communication:

    sudo mstflint -d /sys/bus/pci/devices/0000:aa:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX755206AS-NEA_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    sudo mstflint -d /sys/bus/pci/devices/0000:29:00.0/config -i fw-ConnectX7-rel-28_39_1002-MCX755206AS-NEA_Ax-UEFI-14.32.12-FlexBoot-3.7.201.signed.bin b
    
  5. Perform an AC power cycle on the system for the firmware update to take effect.

    Wait for the operating system to boot.

  6. After the system starts, log in and confirm the firmware versions are all the same:

    cat /sys/class/infiniband/mlx5_*/fw_ver
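
The check in step 6 can also be scripted so that a mismatch is flagged rather than spotted by eye. The following is a minimal sketch under stated assumptions, not an NVIDIA-provided tool: `check_fw_versions` is a hypothetical function name, its optional directory argument exists only to make the function testable, and it assumes every ConnectX-7 card appears as an `mlx5_*` entry with a readable `fw_ver` file.

```shell
# check_fw_versions: hypothetical helper (not part of mstflint or DGX OS).
# Returns 0 if every mlx5_*/fw_ver file holds the same version,
# 1 if the versions differ, and 2 if no devices are found.
check_fw_versions() {
    base="${1:-/sys/class/infiniband}"
    versions=""
    for f in "$base"/mlx5_*/fw_ver; do
        # An unmatched glob stays literal, so skip anything unreadable.
        [ -r "$f" ] || continue
        v=$(cat "$f")
        echo "$f: $v"
        # Collect one version per line for the uniqueness count below.
        versions="$versions$v
"
    done
    if [ -z "$versions" ]; then
        echo "no mlx5 devices found under $base" >&2
        return 2
    fi
    # Count distinct versions; tr strips padding that some wc builds add.
    distinct=$(printf '%s' "$versions" | sort -u | wc -l | tr -d '[:space:]')
    if [ "$distinct" -eq 1 ]; then
        echo "OK: all devices report the same firmware version"
        return 0
    fi
    echo "MISMATCH: $distinct different firmware versions found" >&2
    return 1
}
```

Run `check_fw_versions` with no arguments after the system boots. A nonzero exit status means that at least one card still needs attention or that no devices were detected.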