DGX H100/H200 System Firmware Update Guide Version 25.04.3#
Highlights#
Improved handling for firmware update of PSUs.
Improved reporting on ERoT status.
Provided multiple stability fixes.
Improved user management in Redfish.
Enhanced stability in communication between the BMC and the HMC.
BMC Fixes#
Corrected power supply sensors where no readings were shown.
Fixed the inability to query GPU firmware version via Redfish.
Addressed an issue where some telemetry information from the GPU tray was unavailable.
Resolved an issue related to chassis power cycling using Redfish.
Corrected fan readings so that a fan reports 0 if it is not present or has stopped spinning.
Fixed an issue where NIC information was allocated to the incorrect device in Redfish.
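Several of the fixes above concern data read over Redfish, such as the GPU firmware version. As a rough illustration, the sketch below walks the standard DMTF firmware inventory collection and collects each member's reported version. The helper names (`fetch_json`, `member_uris`, `firmware_versions`) are hypothetical, and the resource IDs a DGX BMC actually exposes are not assumed here. Certificate verification is disabled only to keep the sketch short.

```python
import base64
import json
import ssl
import urllib.request

# Standard Redfish location of the firmware inventory collection (DMTF schema).
FIRMWARE_INVENTORY_PATH = "/redfish/v1/UpdateService/FirmwareInventory"


def fetch_json(bmc_host, path, user, password):
    """GET a Redfish resource with HTTP basic auth and return the parsed JSON."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{bmc_host}{path}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    # Assumption: the BMC uses a self-signed certificate, so verification is
    # skipped. Use a trusted CA bundle instead where one is available.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return json.load(response)


def member_uris(collection):
    """Return the @odata.id of every member in a Redfish collection payload."""
    return [member["@odata.id"] for member in collection.get("Members", [])]


def firmware_versions(bmc_host, user, password):
    """Map each firmware inventory item's Id to its reported Version."""
    collection = fetch_json(bmc_host, FIRMWARE_INVENTORY_PATH, user, password)
    versions = {}
    for uri in member_uris(collection):
        item = fetch_json(bmc_host, uri, user, password)
        versions[item.get("Id", uri)] = item.get("Version")
    return versions
```

Calling `firmware_versions("<BMC address>", "<user>", "<password>")` would return a dictionary of inventory IDs and versions. An IPv6 literal address must be wrapped in square brackets when passed as `bmc_host`.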
SBIOS Fixes#
Fixed an issue where the total memory size reported by the free -m command was smaller than the installed memory.
Updated the Redfish BootOptions page to display the full device path for HDD devices.
Fixed an issue so that an administrator password is mandatory after version 24.09.1 if the SED drive is encrypted.
The nvfwupd Command Updates#
Added support for parallel firmware updates through the YAML configuration file.
Added the --json option to the update_fw, show_update_progress, and force_update commands.
Added IPv6 support.
Deprecated the targets sub-option for multi-target input. Use config.yaml input instead.
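The --json option makes it practical to drive nvfwupd from a script. The following is a minimal sketch, not a documented interface. The argument layout (`-t ip=... user=... password=...` followed by `update_fw -p <package>`), the position of `--json`, and the expectation that standard output holds a single JSON document are all assumptions to verify against the nvfwupd help output. The helper names are hypothetical, and no particular output schema is assumed.

```python
import json
import subprocess


def build_update_command(bmc_ip, user, password, package_path):
    """Assemble an nvfwupd update_fw invocation that requests JSON output.

    Assumption: the target is given as `-t ip=... user=... password=...`
    and `--json` is accepted after the update_fw arguments.
    """
    return [
        "nvfwupd",
        "-t", f"ip={bmc_ip}", f"user={user}", f"password={password}",
        "update_fw", "-p", package_path,
        "--json",
    ]


def run_json_command(argv):
    """Run a command and parse its standard output as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    json.JSONDecodeError if the output is not valid JSON.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

A caller would pass the result of `build_update_command(...)` to `run_json_command(...)` and inspect the returned object. Passing the arguments as a list avoids shell quoting problems with passwords and IPv6 addresses.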
Firmware Package Details#
This firmware release supports the following systems:
NVIDIA DGX H100
NVIDIA DGX H200
This firmware release supports the following operating systems:
NVIDIA DGX OS 7.0.2, 6.2.1, 6.1, 6.0.11, and higher
NVIDIA DGX Software EL9-24.12, EL9-24.06, EL9-23.12, and EL9-23.08
NVIDIA DGX Software EL8-24.07, EL8-24.01, and EL8-23.08
For more information about the operating systems, refer to the NVIDIA Base OS documentation.
You can download firmware packages from the NVIDIA Enterprise Support Portal.
The following table shows the firmware package files:
Components | Sample File Name |
---|---|
Combined archive | The combined archive includes the firmware for the system components and the firmware for the GPU tray. |
If you are updating from version 24.09.1, the total update time is approximately:
91 minutes for the CPU tray using sequential updating.
34 minutes for the CPU tray using parallel updating.
15 minutes for the GPU tray using parallel updating.
The following table shows the information about component firmware versions and update time breakdown.
Component | Version | Update Time from 24.09.1 (Minutes) |
---|---|---|
Host BMC | 25.02.12. Refer to BMC Changes for DGX H100/H200 Systems for the list of changes. | 28 |
Host BMC ERoT | 04.0058 | 2.75 |
SBIOS ERoT | 04.0058 | 2.75 |
SBIOS | 1.6.7. Refer to SBIOS Changes for DGX H100/H200 Systems for the list of changes. | 8.75 |
Motherboard CPLD | 0.2.1.9 | 18 |
Midplane CPLD | 0.2.1.3 | 13.25 |
PSU (Delta ECD16020137) | Primary 0204; Secondary 0201; Community 0204 | PSU_0: 2; PSU_1: 2; PSU_2: 1.87; PSU_3: 1.87; PSU_4: 1.87; PSU_5: 1.87 |
Broadcom Gen5 PCIe Switch (PEX89072-B01) | Switch 0: 0.0.7; Switch 1: 1.0.7 | Switch 0: 0.75; Switch 1: 0.75 |
Astera Labs Gen5 PCIe Retimer (PT5161L) | 2.07.19 | Retimer 0: 2.5; Retimer 1: 2 |
Network (Cluster) Card - ConnectX-7 | 28.43.2026 | |
Network (Storage) Card - ConnectX-7 | 28.43.2026 | |
Network Card - BlueField-3 | 32.43.2024 | |
VBIOS | | GPU Tray (total): 15 |
NVSwitch (GPU Tray) | 96.10.6D.00.01 | |
ERoT (GPU Tray) | 02.0192 | |
HMC (GPU Tray) | HGX-22.10-1-rc79 | |
FPGA (GPU Tray) | 2.53 | |
PCIe Switch (GPU Tray) | 1.9.5F | |
Astera Labs Gen5 PCIe Retimer (GPU Tray) (PT5161L) | 2.7.20 | |
Intel 10G Ethernet | v3.60 | |
Intel Ethernet Network Adapter (E810-C-Q2) | v4.50 | |
M.2 NVMe (Samsung PM9A3) | GDC7502Q | |
M.2 NVMe (Micron 7450) | E2MU200 | |
U.2 Kioxia Gen5 CM7 | B08 | |
U.2 Samsung (EVT2 PM1733) | MPK95B5Q | |
U.2 Samsung (Gen5 PM1743) | OPPA7B5Q | |
FRU | | |
TPM | v15.21/15.23 | |
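As a quick check on the figures above, the per-component update times listed for the CPU tray can be added up and compared with the stated sequential total of approximately 91 minutes. The sketch below copies the times from the table. Which rows count toward the CPU tray total is an inference from the table, not something the guide states.

```python
# Update times in minutes, copied from the component table above.
# Assumption: these are the rows that make up the sequential CPU tray total.
cpu_tray_minutes = {
    "Host BMC": 28,
    "Host BMC ERoT": 2.75,
    "SBIOS ERoT": 2.75,
    "SBIOS": 8.75,
    "Motherboard CPLD": 18,
    "Midplane CPLD": 13.25,
    "PSU_0": 2,
    "PSU_1": 2,
    "PSU_2": 1.87,
    "PSU_3": 1.87,
    "PSU_4": 1.87,
    "PSU_5": 1.87,
    "PCIe Switch 0": 0.75,
    "PCIe Switch 1": 0.75,
    "PCIe Retimer 0": 2.5,
    "PCIe Retimer 1": 2,
}

# Sequential updating runs one component after another, so the times add up.
sequential_total = sum(cpu_tray_minutes.values())
print(f"Sequential CPU tray total: {sequential_total:.2f} minutes")  # 90.98
```

The sum comes to 90.98 minutes, which rounds to the 91 minutes quoted for sequential updating. The 34-minute parallel figure cannot be derived this way, because it depends on which components update concurrently.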
Firmware Update Procedure#
Refer to Firmware Update Steps.