Version 19.10.7

The DGX-1 Firmware Update container version 19.10.7 is available.

  • Package name: nvfw-dgx1_19.10.7.tar.gz

  • Image name: nvfw-dgx1:19.10.7

  • Run file name: nvfw-dgx1_19.10.7.run

Obtain the files from the NVIDIA Enterprise Support announcement "System Firmware Upgrade 19.10.7 for all NVIDIA DGX-1 Server" (requires login).
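As a minimal sketch, assuming the package has been downloaded to the current directory and Docker is installed on the DGX-1, the image can be loaded from the package and then queried for the installed firmware versions. The `show_version` command and the `docker run` options are the ones used elsewhere in these notes; the `DRY_RUN` variable and the `run` helper are hypothetical conveniences for previewing the commands before executing them.

```shell
#!/usr/bin/env bash
# Sketch: load the 19.10.7 firmware container image and list firmware versions.
# Assumes the package sits in the current directory and the user may run docker.
set -eu

PACKAGE="nvfw-dgx1_19.10.7.tar.gz"   # package name from the release notes
IMAGE="nvfw-dgx1:19.10.7"            # image name from the release notes

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Load the container image from the downloaded package.
load_image() { run docker load -i "$PACKAGE"; }

# Ask the container to report the firmware versions on the system.
show_version() { run docker run --privileged -ti -v /:/hostfs "$IMAGE" show_version; }

load_image
show_version
```

By default the script only prints the two commands; setting `DRY_RUN=0` in the environment executes them instead.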

Contents of the DGX-1 Firmware Update Container

This container includes the firmware binaries and update utilities for the firmware listed in the following table.

Component                        Version          Key Changes

BMC                              3.36.30          • Added HTML5 support for the Remote Console
                                                  • Removed Java-based Remote Console
                                                  Note: Be sure to clear your browser cache to see the new Remote Console.

SBIOS                            S2S_3A10         • Incorporated Intel microcode to mitigate new side channel attacks (Zombieload v1)

SSD (Samsung SM863A)             GXM1103Q         No change from previous release.

VBIOS (DGX-1 with V100, 16 GB)   88.00.18.00.01   No change from previous release.

VBIOS (DGX-1 with V100, 32 GB)   88.00.80.00.04   No change from previous release.

VBIOS (DGX-1 with P100)          86.00.41.00.05   No change from previous release.

PSU                              00.03.07         No change from previous release.

Changes in the Container in this Release

  • Fixed an unexpected error that appeared when exiting the container after a successful PSU update.

  • Fixed BMC update failing with an unexpected error.

  • Fixed show_version command reporting "???" for the VBIOS version.

  • Fixed firmware update errors on EL7-19.07.

  • Fixed update output showing only the last VBIOS updated instead of listing every VBIOS that was updated.

Special Notes

Note

If updating the BMC from any version earlier than 3.27.30, the update can take from 30 to 50 minutes to complete.
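To illustrate the version threshold in this note, the following sketch compares a BMC version string against 3.27.30. It assumes GNU `sort` with the `-V` (version sort) option is available. The helper names `version_lt` and `bmc_update_hint` are hypothetical, and the default version `3.20.30` is only an example value; how the current BMC version is obtained is left to the reader.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a BMC version predates 3.27.30, in which case the
# release notes warn that the update can take 30 to 50 minutes.
# Assumes GNU sort (for -V version ordering) is available.

THRESHOLD="3.27.30"   # threshold version from the release notes

# Hypothetical helper: succeeds when version $1 is strictly earlier than $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: print a short planning hint for the given BMC version.
bmc_update_hint() {
  if version_lt "$1" "$THRESHOLD"; then
    echo "BMC $1 is earlier than $THRESHOLD: allow 30 to 50 minutes"
  else
    echo "BMC $1 is $THRESHOLD or later"
  fi
}

# Example value only; pass the actual BMC version as the first argument.
bmc_update_hint "${1:-3.20.30}"
```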

  • When an update to the BMC or PSU is initiated:

    • The BMC is cold reset to put it in a known good state before the update.

    • Additional logs are gathered for troubleshooting and written to /var/log/comp_fw_log.txt.

      The logs are gathered before the update and again when the update completes or fails.

  • To prevent NVSM services from interfering with BMC and PSU updates, the container stops the following services before applying the update:

    • nvsm-apis-gpumonitor

    • nvsm-apis-plugin-storage

    • nvsm-apis-selwatcher

    • nvsm-apis-plugin-memory

    • nvsm-apis-plugin-environment

    • nvsm-sys-dshm

    • nvsm-env-dshm

    • nvsm-storage-dshm

    The system health monitor is not available until the firmware update completes.

  • For the PSU update, the container implements a protective check that requires the system to be fully redundant (all four power supplies installed and healthy) before the update can proceed.

    If you are using only three of the four PSUs, you can override the full power redundancy requirement by setting the DGX_MAX_PSU environment variable in the docker run command, as follows.

    docker run -e DGX_MAX_PSU=3 --privileged -ti -v /:/hostfs <container_name> update_fw
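The NVSM services and the log file named in the notes above can be inspected with a read-only sketch such as the following, for example before and after an update. It assumes a systemd-based host where `systemctl` reports on these services; the function names are hypothetical, and the service names are taken from the list in these notes.

```shell
#!/usr/bin/env bash
# Sketch: report the state of the NVSM services that the container stops,
# then show the end of the firmware update log.
# Assumes a systemd-based host; this script only reads, it changes nothing.

LOG_FILE="/var/log/comp_fw_log.txt"   # log path from the release notes

NVSM_SERVICES=(
  nvsm-apis-gpumonitor
  nvsm-apis-plugin-storage
  nvsm-apis-selwatcher
  nvsm-apis-plugin-memory
  nvsm-apis-plugin-environment
  nvsm-sys-dshm
  nvsm-env-dshm
  nvsm-storage-dshm
)

# Print "<service>: <state>" for each service; "unknown" if no state is available.
report_nvsm_services() {
  local svc state
  for svc in "${NVSM_SERVICES[@]}"; do
    if command -v systemctl >/dev/null 2>&1; then
      state="$(systemctl is-active "$svc" 2>/dev/null || true)"
    else
      state="unknown"
    fi
    echo "$svc: ${state:-unknown}"
  done
}

# Print the last lines of the update log (default 20) if it is readable.
show_update_log() {
  if [ -r "$LOG_FILE" ]; then
    tail -n "${1:-20}" "$LOG_FILE"
  else
    echo "No readable log at $LOG_FILE"
  fi
}

report_nvsm_services
show_update_log
```

Services reported as inactive while an update is running are expected, since the container stops them before applying the update.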