Version 1.1.3
Highlights
Added support for Gen5 NVMe drives.
U.2 drive temperature sensor fix.
Updated power supply firmware.
Included the latest GPU tray firmware.
Included the latest network (cluster and storage) card firmware.
Added support for securing KCS.
The nvfwupd command is updated with the following enhancements:
Support for abbreviated firmware update package names.
Enhanced the show_update_progress output to provide a full status report for Redfish.
Support for a custom log file path.
The command exits with error code 1 for any update failure or tool failure.
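Because nvfwupd exits with error code 1 on any update or tool failure, automation can branch on the exit status instead of parsing log text. The following is a minimal sketch, not part of the release: `run_firmware_tool` is a hypothetical helper name, and the caller supplies the complete command line (for example, the nvfwupd invocation from the Firmware Update Steps), which is deliberately not reproduced here.

```python
import subprocess
import sys
from typing import Sequence


def run_firmware_tool(command: Sequence[str]) -> bool:
    """Run a command and return True only if it exits with status 0.

    Hypothetical helper: the command (for example, an nvfwupd
    invocation) is supplied by the caller; nothing here assumes
    specific nvfwupd options.
    """
    result = subprocess.run(list(command), capture_output=True, text=True)
    if result.returncode != 0:
        # The release notes state that failures produce exit code 1;
        # any non-zero status is treated as a failure here.
        print(
            f"command failed with exit code {result.returncode}",
            file=sys.stderr,
        )
        if result.stderr:
            print(result.stderr.strip(), file=sys.stderr)
        return False
    return True
```

A script could call `run_firmware_tool([...])` with the real command and stop a rollout as soon as it returns `False`.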
BMC Fixes
Fixed an issue where SEL logs might fill up for NVMe drives.
Fixed a rare issue where the HMC might not be visible in the BMC after a BMC reboot.
Added the ability to control IPMI visibility for the host (Allow All, Limited Command, Hide).
Higher resolution for CPU and GPU energy telemetry via Redfish.
Improved reliability of Redfish inventory.
Improved overall stability of telemetry collection and handling invalid/missing values.
General improvements to WebUI.
Firmware Package Details
This firmware release supports the following hardware:
NVIDIA DGX H100
This firmware release supports the following operating systems:
NVIDIA DGX OS 6.1, 6.0.11, and higher
NVIDIA DGX Software for EL9.2, 23.12 and 23.08
NVIDIA DGX Software for EL8 23.08
Refer to the NVIDIA Base OS documentation for more information about the operating systems.
You can download firmware packages from the NVIDIA Enterprise Support Portal at https://enterprise-support.nvidia.com/s/.
Download two firmware package files:
Components | Sample File Name |
---|---|
Combined Archive | The combined archive includes the firmware for the system components, firmware for the GPU tray, and the nvfwupd executable. |
Motherboard Tray | |
GPU Tray | |
If you are updating from 1.1.1, the total update time is approximately:
88 minutes for the CPU tray using sequential updates.
33 minutes for the CPU tray using parallel updates.
11 minutes for the GPU tray using parallel updates.
The following table shows the information about component firmware versions and update time breakdown.
Component | Version | Update time from 1.1.1 (minutes) |
---|---|---|
Host BMC | 24.01.05 (Refer to DGX H100 System BMC Changes for the list of changes.) | 25 |
Host BMC EROT | 04.0026 | 2 |
SBIOS EROT | 04.0026 | 0 |
SBIOS | v1.01.03 (Refer to DGX H100 System SBIOS Changes for the list of changes.) | 7 |
Motherboard CPLD | 0.2.1.8 | 18 |
Midplane CPLD | 0.2.1.1 | 14 |
PSU (Delta ECD16020137) | Primary 2.4, Secondary 2.1, Community 2.2 | PSU_0: 2, PSU_1: 2, PSU_2: 2, PSU_3: 2, PSU_4: 2, PSU_5: 2 |
Broadcom Gen5 PCIe Switch (PEX89072-B01) | Switch 0: v0.0.7, Switch 1: v1.0.7 | Switch 0: 1, Switch 1: 1 |
Astera Labs Gen5 PCIe Retimer (PT5161L) | v2.07.19 | Retimer 0: 3, Retimer 1: 3 |
Network (Cluster) Card - ConnectX-7 | v28.39.1002 | |
Network (Storage) Card - ConnectX-7 | v28.39.1002 | |
VBIOS (H100 80GB) | 96.00.89.00.01 | GPU Tray (total): 11 |
NVSwitch (GPU Tray) | 96.10.4A.00.01 | |
EROT (GPU Tray) | 02.0150 | |
HMC (GPU Tray) | HGX-22.10-1-rc57 | |
FPGA (GPU Tray) | 2.37 | |
PCIe Switch (GPU Tray) | 1.7.5F | |
Astera Labs Gen5 PCIe Retimer (GPU Tray) (PT5161L) | 2.07.19 | |
Intel 10G Ethernet | v3.60 | |
Intel 50G Ethernet | v2.5 | |
M.2 NVMe (Samsung PM9A3) | GDC7502Q | |
M.2 NVMe (Micron 7450) | E2MU200 | |
U.2 Kioxia CM6 | 1.0.7 | |
U.2 Samsung (EVT2 PM1733) | MPK95B5Q | |
U.2 Samsung (Gen5 PM1743) | OPPA3B5Q | |
FRU | 0.6 | |
TPM | v15.21 | |
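After an update, the versions in the table above can serve as a checklist. The sketch below compares a mapping of installed versions against a subset of the expected values. It makes several assumptions: `find_mismatches` and `EXPECTED_VERSIONS` are hypothetical names, the keys reuse the table's component labels (the labels reported by nvfwupd or Redfish may differ), and collecting the installed versions is left to the reader.

```python
from typing import Dict, Optional, Tuple

# Expected versions copied from the component table for this release.
# Only a subset is listed; extend the mapping as needed.
EXPECTED_VERSIONS: Dict[str, str] = {
    "Host BMC": "24.01.05",
    "Host BMC EROT": "04.0026",
    "SBIOS EROT": "04.0026",
    "SBIOS": "v1.01.03",
    "Motherboard CPLD": "0.2.1.8",
    "Midplane CPLD": "0.2.1.1",
}


def find_mismatches(
    installed: Dict[str, str],
    expected: Dict[str, str] = EXPECTED_VERSIONS,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {component: (expected, installed)} for every difference.

    A component missing from `installed` is reported with None as
    its installed version. Comparison is an exact string match, so
    formatting differences (such as a leading "v") count as mismatches.
    """
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for component, want in expected.items():
        have = installed.get(component)
        if have != want:
            mismatches[component] = (want, have)
    return mismatches
```

An empty result means every listed component reports the version expected for this release.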
Firmware Update Procedure
Refer to Firmware Update Steps.