Firmware Changes for NVIDIA DGX H100/H200 Systems#
BMC Changes for DGX H100/H200 Systems#
Changes in 26.03.03#
Fixed occasional IPMI hang with NVSM.
Fixed debug log collection.
Improved BMC performance.
Enhanced SEL logging performance and minimized thermal and fan speed monitoring messages.
Improved BMC stability.
Changes in 25.12.11#
Updated OpenSSH to version openssh-10.0p1-2.
Incorporated security fixes in bootloader to address squashfs vulnerabilities.
Resolved an issue where fans were running at maximum speed.
Resolved an issue where differing time zone settings between the BMC and HMC caused continuous communication recovery attempts.
Enhanced BMC stability.
Changes in 25.06.27#
Resolved the issue of high fan speed in zone 1 during system idle.
Fixed an issue that intermittently fetched an invalid HMC configuration path.
Resolved intermittent BMC Redfish unresponsiveness.
Changes in 25.02.12#
Improved handling for firmware update of PSUs.
Improved reporting on ERoT status.
Provided multiple stability fixes.
Improved user management in Redfish.
Enhanced stability in communication between the BMC and the HMC.
Corrected power supply sensors where no readings were shown.
Fixed the inability to query GPU firmware version via Redfish.
Addressed an issue where some telemetry information from the GPU tray was unavailable.
Resolved an issue related to chassis power cycling using Redfish.
Corrected fan readings so the fan will reflect 0 if it is not present or stops spinning.
Fixed an issue where NIC information was allocated to the incorrect device in Redfish.
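The GPU firmware version query mentioned in this release can be sketched against the standard DMTF Redfish firmware inventory collection. This is only an illustration: these notes do not give the BMC address, credentials, or inventory member names, so the sample resources and their Id and Version values below are hypothetical placeholders. The code works on payloads that have already been fetched and parsed.

```python
# Standard DMTF Redfish collection that lists firmware components.
FIRMWARE_INVENTORY_PATH = "/redfish/v1/UpdateService/FirmwareInventory"


def firmware_versions(members):
    """Map each SoftwareInventory resource's Id to its Version (None if absent)."""
    return {m["Id"]: m.get("Version") for m in members if "Id" in m}


def missing_versions(members):
    """Return the sorted Ids of resources that report no usable Version."""
    versions = firmware_versions(members)
    return sorted(name for name, version in versions.items() if not version)


# Hypothetical, already-parsed member resources (names and values are made up).
sample_members = [
    {"Id": "EXAMPLE_GPU_1", "Version": "example-version"},
    {"Id": "EXAMPLE_GPU_2"},                      # Version property missing
    {"Id": "EXAMPLE_GPU_3", "Version": ""},       # Version present but empty
]
```

Calling missing_versions(sample_members) on the placeholder data lists the two resources without a usable version, which is the symptom the fix above addresses.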
Changes in 24.09.17#
Fixed an issue where the BMC configuration might reset after upgrading.
Added Redfish API support for creating, modifying, and deleting power policies.
Added support for deploying firmware updates using the Web UI.
Redfish Disable Host Interface: keeps Redfish functional from BIOS to BMC but prevents the direct path from OS to BMC.
Added ability to specify intermediate certificate authorities in a provisioned certificate chain.
Included additional Redfish metrics reports.
Fixed SNMP, syslog, and rsyslog issues.
Added a per-BMC AES key for encrypting user/password files during the configuration save and restore process.
Fixed invalid domain issues in the LDAP/AD settings.
Enhanced Redfish diagnostics.
General performance improvements in Redfish APIs and IPMI.
Added support for ConnectX-7 temperature sensors.
Improved resolution for energy counters.
Enhanced Remote Media with support for port numbers and domain names.
General improvements to the Web UI.
Changes in 24.01.05#
Fixed an issue where SEL logs might fill up for NVMe drives
Fixed a low-occurrence issue where the HMC might not be visible in the BMC after a BMC reboot
Added the ability to control IPMI visibility for the host (Allow All, Limited Command, Hide)
Higher resolution for CPU and GPU energy telemetry via Redfish
Improved reliability of Redfish inventory
Improved overall stability of telemetry collection and handling invalid/missing values
General improvements to the Web UI
Changes in 23.09.20#
Web UI enhancements
Enabled GPU Info in the Web UI
Enabled NVRAM clear via Redfish
Disabled RMCP / MD5 Auth Support after factory reset
Enabled ERoT background copy
Enabled default SNMPv3 MIB
The BMC update includes software security enhancements. Refer to the NVIDIA DGX H100 - August 2023 Security Bulletin for details.
SBIOS Changes for DGX H100/H200 Systems#
Changes in v1.6.7#
Fixed an issue where the total memory size from the free -m command output is smaller than installed.
Updated to display full device path for HDD devices under the Redfish BootOptions page.
Fixed an issue so that an administrator password is mandated after version 24.09.1 if the SED drive is encrypted.
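To check the memory total that this release corrects, note that on Linux the free -m figure is derived from the MemTotal field of /proc/meminfo. The sketch below parses that field and compares it with the installed capacity. The installed capacity is supplied by the caller; these notes do not state one. A small shortfall is normal because firmware and the kernel reserve some memory, so any alert threshold is a site-specific assumption.

```python
import re


def mem_total_mib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in MiB (the unit of free -m)."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in meminfo text")
    # /proc/meminfo labels the value "kB", but the unit is KiB (1024 bytes).
    return int(match.group(1)) // 1024


def memory_shortfall_mib(meminfo_text, installed_gib):
    """Return how many MiB the OS-visible total falls short of installed memory."""
    return installed_gib * 1024 - mem_total_mib(meminfo_text)


def read_meminfo(path="/proc/meminfo"):
    """Read the live meminfo file (Linux only)."""
    with open(path, encoding="ascii") as handle:
        return handle.read()
```

On a live system, memory_shortfall_mib(read_meminfo(), installed_gib) gives the gap to compare against whatever tolerance the site considers acceptable.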
Changes in v1.05.03#
DIMMs that experience uncorrectable errors at runtime are mapped out on the next boot.
Exposed the C1AutoDemotion, C1AutoUnDemotion, and C6Enable setup options.
Moved the CPU setup options page to under the Advanced page in the setup UI.
Added a setup option to restrict host access via IPMI.
Provided the NvramVarsProtectionInOs setup option to prevent the OS from changing the NVRAM at runtime.
Implemented uncorrectable error rate limiting, disabled CSMI (correctable system management interrupts) on error flooding and on the core that reported MLC (middle-level cache) yellow state, and SEL logging when ANF (advisory non-fatal error) threshold was crossed.
Changed the SncEn default setting to disable.
Changes in v1.01.03#
Added support for securing KCS
Changes in v1.01.01#
Fixed Boot options labeling for NIC ports
Fix for U.2 bay slot numbering
Set RestoreROWritePerf option to expert mode only
Expose TDX and IFS options in expert user mode only
GPU Tray Firmware Changes#
Issues Fixed in 1.9.0#
The HGX_GPU_SXM_1~8_Power_0, HGX_GPU_SXM_1~8_DRAM_0, and HGX_GPU_SXM_1~8_Energy_0 sensors sometimes show no reading.
The GPU power and energy sensor telemetry becomes intermittently unavailable after several consecutive power cycles. Under these conditions, the sensors might report NaN or null readings on Redfish.
Workaround
Due to the intermittent nature of the issue, no reliable workaround is currently known. Unloading and loading the driver with persistence mode might help recover the telemetry under certain conditions.
This issue has been fixed.
Set_declare SMBPBI master caps return success.
The undocumented SMBPBI opcode 0x8 does not return ERR_OPCODE in conformance with the SMBPBI specification. This opcode is not supported on the Hopper HGX 8-GPU baseboard and should not be used.
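For the power and energy telemetry issue described in this section, a monitoring client running against older firmware may want to treat NaN and null readings as missing data rather than as zero. The following is a minimal sketch, assuming the sensor resource follows the standard Redfish Sensor schema and exposes its value in the Reading property. The sample payloads are hypothetical, and Python's json module is relied on to accept the non-standard NaN literal, which it does by default.

```python
import json
import math


def usable_reading(sensor):
    """Return the sensor's Reading as a float, or None if it is null, NaN, or not numeric."""
    value = sensor.get("Reading")
    if value is None:
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    if math.isnan(value):
        return None
    return float(value)


def parse_sensor(raw_json):
    """Parse a raw sensor payload; json.loads accepts a bare NaN token by default."""
    return json.loads(raw_json)


# Hypothetical payloads illustrating the three cases.
good = parse_sensor('{"Id": "EXAMPLE_Power_0", "Reading": 312.5}')
null_reading = parse_sensor('{"Id": "EXAMPLE_Power_0", "Reading": null}')
nan_reading = parse_sensor('{"Id": "EXAMPLE_Power_0", "Reading": NaN}')
```

Returning None instead of 0 keeps a telemetry gap distinguishable from a genuine zero reading, so downstream averages and energy totals are not skewed.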
Issues Fixed in 1.8.1#
The ECC counters are enabled with CC/PPCIe mode.
Issues Fixed in 1.8.0#
Partner Security Bulletin: HGX/DGX vBIOS and LS10: July 2025
Details
NVIDIA became aware of two new vulnerabilities in the FSP code that is used in the NVIDIA vBIOS and LS10 firmware for Data Center DGX and HGX products (CVE-2025-23302 and CVE-2025-23301). These vulnerabilities can be exploited if a privileged user gains unintended access to the FSP in the GPU. NVIDIA has provided a patched firmware version to handle these vulnerabilities.
Impact
An attacker with root access to the FSP is capable of modifying register data that can lead to exploitations and cause denial of service.
Mitigation
Update the following firmware to version 1.8.1 for the NVIDIA HGX and DGX Products. If you have questions regarding this bulletin, contact your NVIDIA Program Manager. Visit the NVIDIA Product Security page to learn more about the vulnerability management process followed by the NVIDIA Product Security Incident Response Team (PSIRT). Refer to NVOnline: 1138058 for more information.