Firmware Changes for NVIDIA DGX H100/H200 Systems#

BMC Changes for DGX H100/H200 Systems#

Changes in 26.03.03#

  • Fixed occasional IPMI hang with NVSM.

  • Fixed debug log collection.

  • Improved BMC performance.

  • Enhanced SEL logging performance and minimized thermal and fan speed monitoring messages.

  • Improved BMC stability.

Changes in 25.12.11#

  • Updated OpenSSH to version openssh-10.0p1-2.

  • Incorporated security fixes in bootloader to address squashfs vulnerabilities.

  • Resolved an issue where fans were running at maximum speed.

  • Resolved an issue where differing time zone settings between the BMC and HMC caused continuous communication recovery attempts.

  • Enhanced BMC stability.

Changes in 25.06.27#

  • Resolved an issue where fans in zone 1 ran at high speed while the system was idle.

  • Fixed an issue that intermittently fetched an invalid HMC configuration path.

  • Resolved intermittent BMC Redfish unresponsiveness.

Changes in 25.02.12#

  • Improved handling of PSU firmware updates.

  • Improved reporting on ERoT status.

  • Provided multiple stability fixes.

  • Improved user management in Redfish.

  • Enhanced stability in communication between the BMC and the HMC.

  • Fixed an issue where power supply sensors showed no readings.

  • Fixed the inability to query GPU firmware version via Redfish.

  • Addressed an issue where some telemetry information from the GPU tray was unavailable.

  • Resolved an issue related to chassis power cycling using Redfish.

  • Corrected fan readings so that a fan reports 0 if it is not present or has stopped spinning.

  • Fixed an issue where NIC information was assigned to the wrong device in Redfish.

Changes in 24.09.17#

  • Fixed an issue where the BMC configuration might reset after upgrading.

  • Added Redfish API support for creating, modifying, and deleting power policies.

  • Added support for deploying firmware updates using the Web UI.

  • Added the Redfish Disable Host Interface option, which keeps Redfish functional from the BIOS to the BMC but blocks the direct path from the OS to the BMC.

  • Added ability to specify intermediate certificate authorities in a provisioned certificate chain.

  • Included additional Redfish metrics reports.

  • Fixed SNMP, syslog, and rsyslog issues.

  • Added a per-BMC AES key for encrypting user and password files during the configuration save and restore process.

  • Fixed invalid domain issues in the LDAP/AD settings.

  • Enhanced Redfish diagnostics.

  • General performance improvements in Redfish APIs and IPMI.

  • Added support for ConnectX-7 temperature sensors.

  • Improved resolution for energy counters.

  • Enhanced Remote Media with support for port numbers and domain names.

  • General improvements to the Web UI.

Changes in 24.01.05#

  • Fixed an issue where SEL logs might fill up for NVMe drives.

  • Fixed a rare issue where the HMC might not be visible in the BMC after a BMC reboot.

  • Added the ability to control IPMI visibility for the host (Allow All, Limited Command, Hide).

  • Increased the resolution of CPU and GPU energy telemetry via Redfish.

  • Improved reliability of Redfish inventory.

  • Improved overall stability of telemetry collection and handling of invalid or missing values.

  • General improvements to the Web UI.

Changes in 23.09.20#

  • Web UI enhancements.

  • Enabled GPU information in the Web UI.

  • Enabled NVRAM clear via Redfish.

  • Disabled RMCP / MD5 authentication support after factory reset.

  • Enabled ERoT background copy.

  • Enabled the default SNMPv3 MIB.

  • The BMC update includes software security enhancements. Refer to the NVIDIA DGX H100 - August 2023 Security Bulletin for details.

SBIOS Changes for DGX H100/H200 Systems#

Changes in v1.6.7#

  • Fixed an issue where the total memory size reported by the free -m command was smaller than the installed memory.

  • Updated the Redfish BootOptions page to display the full device path for HDD devices.

  • Fixed an issue so that an administrator password is required after version 24.09.1 if a SED drive is encrypted.
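One way to check for the memory-size discrepancy described in the first item above is to compare the total reported by free -m with the expected installed capacity. The following is a minimal sketch, not an NVIDIA tool. The function names, the sample output, the 2048 GiB installed size, and the 5 percent tolerance are all illustrative assumptions; the kernel always reserves some memory, so the reported total is normally slightly below the installed size.

```python
def parse_free_total_mib(free_output: str) -> int:
    """Return the 'total' column of the 'Mem:' row from `free -m` output."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            return int(fields[1])
    raise ValueError("no 'Mem:' row found in free output")


def memory_shortfall(reported_mib: int, installed_gib: int,
                     tolerance: float = 0.05) -> bool:
    """True if the reported total is more than `tolerance` below installed."""
    installed_mib = installed_gib * 1024
    return reported_mib < installed_mib * (1 - tolerance)


# Illustrative sample only; real text comes from running `free -m` on the host.
SAMPLE = """\
               total        used        free      shared  buff/cache   available
Mem:         2063937       21534     2030112          45       12290     2031870
Swap:              0           0           0
"""

if __name__ == "__main__":
    total = parse_free_total_mib(SAMPLE)
    # 2048 GiB is an assumed installed capacity for this example.
    print(total, memory_shortfall(total, installed_gib=2048))
```

The tolerance is a judgment call: set it too tight and normal kernel reservations are flagged, too loose and a mapped-out DIMM could go unnoticed.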

Changes in v1.05.03#

  • A DIMM that experiences uncorrectable errors at runtime is now mapped out on the next boot.

  • Exposed the C1AutoDemotion, C1AutoUnDemotion, and C6Enable setup options.

  • Moved the CPU setup options page under the Advanced page in the setup UI.

  • Added a setup option to restrict host access via IPMI.

  • Provided the NvramVarsProtectionInOs setup option to prevent the OS from changing the NVRAM at runtime.

  • Implemented uncorrectable error rate limiting, disabling of CSMI (correctable system management interrupts) on error flooding and on the core that reported an MLC (middle-level cache) yellow state, and SEL logging when the ANF (advisory non-fatal error) threshold is crossed.

  • Changed the SncEn default setting to disable.

Changes in v1.01.03#

  • Added support for securing KCS.

Changes in v1.01.01#

  • Fixed boot option labeling for NIC ports.

  • Fixed U.2 bay slot numbering.

  • Restricted the RestoreROWritePerf option to expert mode only.

  • Exposed the TDX and IFS options in expert user mode only.

GPU Tray Firmware Changes#

Issues Fixed in 1.9.0#

  • The HGX_GPU_SXM_1~8_Power_0, HGX_GPU_SXM_1~8_DRAM_0, and HGX_GPU_SXM_1~8_Energy_0 sensors sometimes show no reading.

    The GPU power and energy sensor telemetry becomes intermittently unavailable after several consecutive power cycles. Under these conditions, the sensors might report NaN or null readings on Redfish.

    Workaround

    Due to the intermittent nature of the issue, no reliable workaround is currently known. Unloading and loading the driver with persistence mode might help recover the telemetry under certain conditions.

    This issue has been fixed.

  • Set_declare SMBPBI master caps return success.

    The undocumented SMBPBI opcode 0x8 does not return ERR_OPCODE in conformance with the SMBPBI specification. This opcode is not supported on the Hopper HGX 8-GPU baseboard and should not be used.
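For the sensor issue in the first item above, a monitoring script could treat a null or NaN reading as missing telemetry rather than as a valid value. The sketch below assumes the sensor resource has already been fetched from Redfish and decoded into a Python dict, and that the value is carried in a Reading property as in the DMTF Redfish Sensor schema. The function name and the sample payloads are hypothetical.

```python
import json
import math


def reading_is_missing(sensor: dict) -> bool:
    """Return True when a sensor payload carries no usable numeric reading."""
    reading = sensor.get("Reading")
    if reading is None:  # absent property or JSON null
        return True
    try:
        return math.isnan(float(reading))  # NaN as a number or as a string
    except (TypeError, ValueError):
        return True  # non-numeric content is treated as missing


# Hypothetical payloads; Python's json module accepts the bare NaN literal.
ok = json.loads('{"Id": "HGX_GPU_SXM_1_Power_0", "Reading": 71.5}')
null = json.loads('{"Id": "HGX_GPU_SXM_1_Power_0", "Reading": null}')
nan = json.loads('{"Id": "HGX_GPU_SXM_1_Power_0", "Reading": NaN}')

if __name__ == "__main__":
    for payload in (ok, null, nan):
        print(payload["Id"], reading_is_missing(payload))
```

Treating missing readings explicitly keeps them out of averages and threshold checks, where a NaN would otherwise propagate silently.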

Issues Fixed in 1.8.1#

  • The ECC counters are enabled with CC/PPCIe mode.

Issues Fixed in 1.8.0#

  • Partner Security Bulletin: HGX/DGX vBIOS and LS10: July 2025

    Details

    NVIDIA became aware of two new vulnerabilities in the FSP code that is used in the NVIDIA vBIOS and LS10 firmware for Data Center DGX and HGX products (CVE-2025-23302 and CVE-2025-23301). These vulnerabilities can be exploited if a privileged user gains unintended access to the FSP in the GPU. NVIDIA has provided a patched firmware version to handle these vulnerabilities.

    Impact

    An attacker with root access to the FSP can modify register data, which can lead to exploitation and cause a denial of service.

    Mitigation

    Update the following firmware to version 1.8.1 for the NVIDIA HGX and DGX Products. If you have questions regarding this bulletin, contact your NVIDIA Program Manager. Visit the NVIDIA Product Security page to learn more about the vulnerability management process followed by the NVIDIA Product Security Incident Response Team (PSIRT). Refer to NVOnline: 1138058 for more information.
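When auditing systems against the mitigation above, a dotted version string should be compared numerically rather than as text. The following is a minimal sketch, assuming the installed firmware version is available as a plain dotted-numeric string such as 1.8.1. How the version is obtained is outside the scope of this example, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Split a dotted-numeric version such as '1.8.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(current: str, minimum: str = "1.8.1") -> bool:
    """True if `current` is older than `minimum` when compared numerically."""
    return parse_version(current) < parse_version(minimum)


if __name__ == "__main__":
    for version in ("1.8.0", "1.8.1", "1.9.0", "1.10.0"):
        print(version, needs_update(version))
```

Numeric comparison matters for versions such as 1.10.0, which sorts before 1.8.1 when compared as plain strings.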