DGX A100 System Firmware Update Container Version 20.05.12
The DGX Firmware Update container version 20.05.12 is available.
Package name:
nvfw-dgxa100_20.05.12_200603.tar.gz
Run file name:
nvfw-dgxa100_20.05.12_200603.run
Image name:
nvfw-dgxa100:20.05.12
Highlights and Changes in this Release
This release is supported with the following DGX OS software:
DGX OS 4.99.8 or later
Enabled BMC Secure Flash
Enabled PCI-Compliant DPC and AER error propagation
Implemented critical VBIOS fixes
Contents of the DGX A100 System Firmware Container
This container includes the firmware binaries and update utilities for the firmware listed in the following table.
Component | Version | Key Changes | Update Time |
---|---|---|---|
BMC (via CEC) | 00.12.05 | Added to container | 31 minutes |
SBIOS | 0.23 | Added to container | 7 minutes |
Broadcom 88096 PCIe switch board | 1.3 | Added to container | 8 minutes |
BMC CEC SPI | v3.05 | Added to container | 8 minutes |
PEX88064 Retimer | 0.13.0 | Updated | 7 minutes |
PEX88080 Retimer | 0.13.0 | Updated | 7 minutes |
NvSwitch BIOS | 92.10.12.00.01 | No change | 8 minutes |
VBIOS | 92.00.19.00.01 | Updated | 7 minutes |
Updating Components with Secondary Images
Some firmware components provide a secondary image as backup. The following is the policy when updating those components:
SBIOS: The two images are referred to as active and inactive, where the active is the currently running image and the inactive is the backup image. The update container can only update the inactive image. After reboot, the updated image becomes the active image. You can perform the update again to update the current inactive image so that both images are updated.
BMC: The two images are referred to as active and inactive, where the active is the currently running image and the inactive is the backup image. The update container can only update the inactive image. After the update is completed, the updated image becomes the active image. You can perform the update again to update the current inactive image so that both images are updated.
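The two-pass policy described above can be pictured with a small toy model. The following is a minimal sketch only: the function name `update_inactive` and the version labels `old` and `new` are hypothetical, nothing here reads or writes real firmware, and the model ignores the difference in when the swap happens (after reboot for the SBIOS, after the update completes for the BMC).

```shell
#!/usr/bin/env bash
# Toy model of the active/inactive image policy. It only manipulates strings;
# it does not call the firmware update container or touch hardware.

# update_inactive ACTIVE_SLOT SLOT0_VERSION SLOT1_VERSION NEW_VERSION
# Writes NEW_VERSION to whichever slot is currently inactive, then makes that
# slot the active one. Prints the new state as "ACTIVE_SLOT SLOT0 SLOT1".
update_inactive() {
    local active=$1 v0=$2 v1=$3 new=$4
    if [ "$active" -eq 0 ]; then
        v1=$new      # slot 1 was inactive, so it receives the update
        active=1     # the updated image becomes the active image
    else
        v0=$new      # slot 0 was inactive, so it receives the update
        active=0
    fi
    echo "$active $v0 $v1"
}

# Hypothetical starting point: slot 0 active, both slots on the old version.
state="0 old old"
state=$(update_inactive $state new)   # first update pass
echo "after pass 1: $state"
state=$(update_inactive $state new)   # second pass updates the other slot
echo "after pass 2: $state"
```

In this model, the first pass leaves one slot on the old version, which is why a second update is needed before both images match.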
Instructions for Updating Firmware
This section provides a simple way to update the firmware on the system using the firmware update container. It includes instructions for performing a transitional update for systems that require it. The commands use the .run file, but you can also use the container image directly.
1. Perform a transitional update if needed.
   Depending on the BMC and MB_CEC versions on the system, you may need to perform a transitional update before updating the BMC and SBIOS to the latest versions.
   a. Check if the transitional update is needed.
      $ sudo nvfw-dgxa100_20.05.12_200603.run run_script --command "fw_transition.py show_version"
      The following message appears if a transitional update is needed.
      BMC/MB_CEC firmware needs update to Active/Inactive, secure boot mode
      This is a one-time update required for DGXA100. All future updates require BMC in this mode
      - If the one-time update is required, continue with the next step to perform the transitional update.
      - If the one-time update is not required, then skip to step 2.
   b. Perform the transitional update.
      $ sudo nvfw-dgxa100_20.05.12_200603.run run_script --command "fw_transition.py update_fw"
      $ sudo reboot
   c. Verify that the BMC (both images) and the MB_CEC are up to date.
      $ sudo nvfw-dgxa100_20.05.12_200603.run run_script --command "fw_transition.py show_version"
2. Check if other updates are needed.
   $ sudo nvfw-dgxa100_20.05.12_200603.run show_version
   - If there is “no” in any up-to-date column for updatable firmware, then continue with the next step.
   - If all up-to-date column entries are “yes”, then no updates are needed and no further action is necessary.
3. Perform the final update for all firmware supported by the container and reboot the system.
   $ sudo nvfw-dgxa100_20.05.12_200603.run update_fw all
   $ sudo reboot
Note: The update_fw all command updates the inactive BMC and SBIOS images only. After rebooting the system, the updated images become “active”. You can then update the inactive images using nvfw-dgxa100_20.05.12_200603.run update_fw [BMC] [SBIOS] --inactive as needed.
You can verify the update by issuing the following.
$ sudo nvfw-dgxa100_20.05.12_200603.run show_version
Expected output.
BMC DGX
=========
Image Id Status Location Onboard Version Manifest up_to_date
0:Active Boot Online Local 00.12.05 00.12.05 yes
1:Inactive Updatable Local 00.12.05 00.12.05 yes
CEC
============
Onboard Version Manifest up-to-date
MB_CEC(enabled) 3.05 3.05 yes
SBIOS
=======
Image Id Method Onboard Version Manifest up_to_date
0:Inactive Updatabl afulnx 0.24 0.24 yes
1:Active Boot 0.24 0.24 yes
Video BIOS
============
Bus Model Onboard Version Manifest up-to-date
0000:07:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:0f:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:47:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:4e:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:87:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:90:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:b7:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
0000:bd:00.0 A100-SXM4-40GB 92.00.19.00.01 92.00.19.00.01 yes
Switches
============
PCI Bus# Model Onboard Version Manifest up-to-date
DGX - 0000:91:00.0(U261) 88064_Retimer 0.13.0 0.13.0 yes
DGX - 0000:88:00.0(U260) 88064_Retimer 0.13.0 0.13.0 yes
DGX - 0000:4f:00.0(U262) 88064_Retimer 0.13.0 0.13.0 yes
DGX - 0000:48:00.0(U225) 88080_Retimer 0.13.0 0.13.0 yes
DGX - 0000:c4:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:c5:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:c2:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:c6:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:c3:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:c7:00.0 LR10 92.10.12.00.01 92.10.12.00.01 yes
DGX - 0000:01:00.0(U1) PEX88096 1.3 1.3 yes
DGX - 0000:81:00.0(U3) PEX88096 1.3 1.3 yes
DGX - 0000:41:00.0(U2) PEX88096 1.3 1.3 yes
DGX - 0000:b1:00.0(U4) PEX88096 1.3 1.3 yes
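The two checks in the procedure above (looking for the transitional-update message and looking for “no” in an up-to-date column) could be scripted along the following lines. This is a sketch under stated assumptions, not part of the container: the helper names `outdated_entries` and `needs_transition` are hypothetical, the helpers only filter text supplied on standard input, and they assume the tool prints its report in the layout shown above, with the up-to-date value as the last field of each row. The sample rows, including the 00.12.04 version, are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical text filters for saved show_version output.
# They never invoke the firmware update container themselves.

# outdated_entries: print every row whose last whitespace-separated field
# is exactly "no". Header and separator rows do not match.
outdated_entries() {
    awk '$NF == "no"'
}

# needs_transition: exit status 0 if the text on stdin contains the
# one-time transitional-update message, non-zero otherwise.
needs_transition() {
    grep -q 'firmware needs update to Active/Inactive'
}

# Illustration with invented sample rows (not real system output).
sample='Onboard Version Manifest up-to-date
MB_CEC(enabled) 3.05 3.05 yes
0:Active Boot Online Local 00.12.04 00.12.05 no'

printf '%s\n' "$sample" | outdated_entries
```

If the tool writes its report to standard output (an assumption, not confirmed by these notes), the output of the show_version command could be piped into `outdated_entries`; an empty result would correspond to the case where no further updates are needed.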