Updating the ConnectX-7 Firmware
After replacing or installing the ConnectX-7 cards, make sure the firmware on the cards is up to date.
Download the v28.36.2024 firmware from https://network.nvidia.com/support/firmware/connectx7ib/.
Download the firmware for both OPNs (ordering part numbers): MCX750500B-0D00 for the cluster cards and MCX755206AS-NEA for the storage cards.
Transfer the firmware ZIP file to the DGX system and extract the archive.
Update the firmware on the cards that are used for cluster communication:
sudo mstflint -d /sys/bus/pci/devices/0000:5e:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:dc:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:c0:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:18:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:40:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:4f:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:ce:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:9a:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
Update the firmware on the cards that are used for storage communication:
sudo mstflint -d /sys/bus/pci/devices/0000:aa:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX755206AS-NEA_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
sudo mstflint -d /sys/bus/pci/devices/0000:29:00.0/config -i fw-ConnectX7-rel-28_36_2024-MCX755206AS-NEA_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin b
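The burn commands above differ only in the PCI bus number and the image file, so they can be sketched as a loop. This is a minimal sketch, not part of the official procedure: the function name `burn_cards` and the `DRY_RUN` variable are hypothetical, and the sketch assumes the PCI addresses and file names listed above match your system. By default it only prints the commands; set `DRY_RUN=0` to run them.

```shell
#!/bin/sh
# Sketch: burn one firmware image to a list of cards.
# burn_cards and DRY_RUN are hypothetical names used only in this example.
# Usage: burn_cards <image-file> <pci-bus> [<pci-bus> ...]
burn_cards() {
    bc_image="$1"
    shift
    for bc_bus in "$@"; do
        bc_dev="/sys/bus/pci/devices/0000:${bc_bus}:00.0/config"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): print the command instead of running it.
            echo "sudo mstflint -d ${bc_dev} -i ${bc_image} b"
        else
            # "b" is the mstflint burn command; stop at the first failure.
            sudo mstflint -d "${bc_dev}" -i "${bc_image}" b || return 1
        fi
    done
}

# Cluster-communication cards (PCI bus numbers taken from the commands above).
burn_cards "fw-ConnectX7-rel-28_36_2024-MCX750500B-0D00_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin" \
    5e dc c0 18 40 4f ce 9a

# Storage-communication cards.
burn_cards "fw-ConnectX7-rel-28_36_2024-MCX755206AS-NEA_Ax-UEFI-14.29.14-FlexBoot-3.6.901.signed.bin" \
    aa 29
```

Review the printed commands against the lists above before running with `DRY_RUN=0`, because burning the wrong image to a card is not something the loop can detect.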
Reboot the system for the firmware update to take effect:
sudo reboot
After the system starts, log in and confirm that every card reports the same firmware version (28.36.2024 is expected after this update):
cat /sys/class/infiniband/mlx5_*/fw_ver
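Comparing ten version strings by eye is error-prone, so the check can be sketched as a small helper that counts the distinct versions. The function name `check_fw_versions` is hypothetical; it reads one version string per line on standard input and makes no assumption about which version is expected.

```shell
#!/bin/sh
# Sketch: report whether every adapter shows the same firmware version.
# check_fw_versions is a hypothetical helper name used only in this example.
# Input: one version string per line on stdin.
check_fw_versions() {
    # sort -u collapses duplicates; wc -l then counts the distinct versions.
    cfv_distinct="$(sort -u | wc -l | tr -d ' ')"
    if [ "$cfv_distinct" -eq 1 ]; then
        echo "OK: all adapters report the same firmware version"
    else
        echo "MISMATCH: ${cfv_distinct} distinct firmware versions found"
        return 1
    fi
}

# Usage on the DGX system:
#   cat /sys/class/infiniband/mlx5_*/fw_ver | check_fw_versions
```

The helper returns a nonzero exit status on a mismatch, which makes it usable in a post-maintenance script.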