Using the DGX Station A100 Firmware Update ISO

This section describes how to use the DGX Station A100 firmware update ISO to efficiently update the firmware in a large fleet of DGX Station A100 systems.

About the Firmware Update Menu

After the system boots from the firmware update ISO, it sets up the environment and launches a firmware update menu. The menu can be used in the following three modes:

  • Interactive

    This mode displays a text-based UI that offers the following actions:

    • Start the firmware update container

      This runs the firmware update container using the update_fw all option.

    • Start the firmware update container with custom options

      This runs the firmware update container using custom arguments that you enter into a text box. Separate multiple arguments with spaces, as in the following example:

      update_fw BMC -f
      

      See Command and Argument Summary for available arguments.

    • Set up connection for automation and Exit

      This sets up an SSH connection (the default user name is fwui and the default password is fw_update) so that you can run automation scripts, such as Ansible playbooks, from a different system.

    • Exit

  • Non-interactive

    This mode reads the arguments from the kernel parameters (/proc/cmdline) and then runs the firmware update container automatically.

  • Automation

    This mode sets up an SSH connection. The default user name is fwui and the default password is fw_update. From there, you can use automation scripts (for example, Ansible) to perform the firmware update.
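
As a rough illustration of the non-interactive mode, the following sketch shows one way a script could read the fwuc-* settings from a kernel command line. The helper name fwuc_param is hypothetical, and this is not the parsing code used by the firmware update menu. Treating the commas in fwuc-update_args as argument separators is an assumption based on the examples in this document.

```shell
#!/bin/sh
# Hypothetical helper (not part of the firmware update ISO):
# print the value of KEY from a kernel command line.
# Usage: fwuc_param KEY [CMDLINE]   (CMDLINE defaults to /proc/cmdline)
fwuc_param() {
    key=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case $word in
            "$key"=*)
                # Strip the "KEY=" prefix and print the remainder.
                printf '%s\n' "${word#"$key"=}"
                return 0
                ;;
        esac
    done
    return 1   # KEY not present on the command line
}

# Sample command line modeled on the non-interactive example in this document.
sample_cmdline='boot=live fwuc-mode=noninteractive fwuc-update_args=update_fw,SBIOS,-f fwuc-extra_args=reboot-after-update'

fwuc_param fwuc-mode "$sample_cmdline"                       # prints: noninteractive
fwuc_param fwuc-update_args "$sample_cmdline" | tr ',' ' '   # prints: update_fw SBIOS -f
```

On a system that has booted from the ISO, omitting the second argument makes the helper read /proc/cmdline directly.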

Booting to the Firmware Update ISO from a USB Flash Drive

This section describes how to boot to the DGX Station A100 firmware update ISO from a USB flash drive.

Basic Process

Download the ISO image and create a bootable USB drive that contains the ISO image.

Important

Do not use the virtual media from the BMC. If you use virtual media, the BMC will be reset during the update.

Updating the Firmware Automatically

To update the firmware automatically when the system boots:

  1. Edit the GRUB menu parameters in the ISO at BOOT/GRUB/GRUB.CFG as follows.

    Set fwuc-mode=noninteractive.

    Set the following parameters as needed.

    • fwuc-update_args=<arg1>,<arg2> ...
      
    • fwuc-extra_args=<extra-arg1> ...
      

    See Command and Argument Summary for available arguments.

    The following example boots the firmware update ISO in non-interactive mode, updates the SBIOS without first checking the installed version, and reboots the system after the update.

    menuentry "Start Firmware Update Environment (Non-interactive)" {
        linux /vmlinuz boot=live console=tty0 apparmor=0 elevator=noop nvme-core.multipath=n nouveau.modeset=0 boot-live-env start-systemd-networkd fwuc-mode=noninteractive fwuc-update_args=update_fw,SBIOS,-f fwuc-extra_args=reboot-after-update
        initrd /initrd
    }
    
  2. Create a bootable USB drive that contains the updated ISO.

  3. Boot to the USB drive.

  4. If the FPGA firmware was updated, complete a DC power cycle by issuing the following command.

    $ sudo ipmitool -I lanplus -H ${BMC_IP} -U ${BMC_USER} -P ${BMC_PW} chassis power cycle
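
The parameters from step 1 can be assembled with a small helper, sketched below. The function name fwuc_kernel_args is hypothetical. The sketch assumes that fwuc-update_args is the interactive command with its spaces replaced by commas, which is inferred from the example above rather than from a formal specification.

```shell
#!/bin/sh
# Hypothetical helper: build the fwuc-* kernel parameters for non-interactive mode.
# $1: update command as typed in interactive mode, e.g. "update_fw SBIOS -f"
# $2: optional extra argument, e.g. "reboot-after-update"
fwuc_kernel_args() {
    # Replace each run of spaces with a single comma (assumed separator).
    update_args=$(printf '%s' "$1" | tr -s ' ' ',')
    out="fwuc-mode=noninteractive fwuc-update_args=$update_args"
    if [ -n "${2:-}" ]; then
        out="$out fwuc-extra_args=$2"
    fi
    printf '%s\n' "$out"
}

fwuc_kernel_args "update_fw SBIOS -f" "reboot-after-update"
# prints: fwuc-mode=noninteractive fwuc-update_args=update_fw,SBIOS,-f fwuc-extra_args=reboot-after-update
```

The printed string can then be appended to the linux line of the menu entry, alongside the other boot parameters.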
    

Booting to the Firmware Update ISO by PXE Boot

This section describes how to PXE boot to the DGX Station A100 firmware update ISO.

  1. See PXE Boot Setup for more information about setting up the DGX Station A100 to PXE boot.

  2. Download the ISO image and mount it.

    $ sudo mount -o loop ~/DGXSTATIONA100_FWUI-24.1.1-2024-01-16-12-06-01.iso /mnt
    
  3. Copy the filesystem.squashfs, initrd, and vmlinuz files to the HTTP directory.

    $ sudo mkdir -p /local/http/firmware-update/
    $ sudo cp /mnt/live/filesystem.squashfs /local/http/firmware-update/
    $ sudo cp /mnt/{initrd,vmlinuz} /local/http/firmware-update/
    $ sudo umount /mnt
    

    The new /local/http folder structure should look like this:

    /local/http/
    ├── dgxbaseos-5.x.y
    │   ├── base_os_5.x.y.iso
    │   ├── initrd
    │   └── vmlinuz
    └── firmware-update
        ├── filesystem.squashfs
        ├── initrd
        └── vmlinuz
    
  4. Edit the /local/syslinux/efi64/pxelinux.cfg/default file to add the following menu option content for the Firmware Update OS.

    label Firmware Update Container
        menu label Firmware Update Container
        kernel http://${SERVER_IP}/firmware-update/vmlinuz
        initrd http://${SERVER_IP}/firmware-update/initrd
        append vga=788 initrd=initrd boot=live console=tty0 console=ttyS1,115200n8 apparmor=0 elevator=noop nvme-core.multipath=n nouveau.modeset=0 boot-live-env start-systemd-networkd fetch=http://${SERVER_IP}/firmware-update/filesystem.squashfs
    

    Important

    If the system is booting from the LAN port connection (eno1), and the connections are not on the same domain, add live-netdev=eno1 to the append line.

    Example:

    append vga=788 initrd=initrd boot=live console=tty0 apparmor=0 live-netdev=eno1 elevator=noop nvme-core.multipath=n nouveau.modeset=0 boot-live-env start-systemd-networkd fetch=http://${SERVER_IP}/firmware-update/filesystem.squashfs
    
  5. (Optional) To run the container automatically at boot, edit the following parameters in pxelinux.cfg/default:

    Set fwuc-mode=noninteractive.

    Set the following parameters as needed.

    • fwuc-update_args=<arg1>,<arg2> ...
      
    • fwuc-extra_args=<extra-arg1> ...
      

    See Command and Argument Summary for available arguments.

    The following example boots the firmware update ISO in non-interactive mode, updates the SBIOS without first checking the installed version, and reboots the system after the update.

    append vga=788 initrd=initrd boot=live console=tty0 apparmor=0 elevator=noop nvme-core.multipath=n nouveau.modeset=0 fwuc-mode=noninteractive fwuc-update_args=update_fw,SBIOS,-f fwuc-extra_args=reboot-after-update boot-live-env start-systemd-networkd fetch=http://${SERVER_IP}/firmware-update/filesystem.squashfs
    
  6. Change permissions on /local.

    $ sudo chmod 755 -R /local
    
  7. PXE boot by restarting the system using ipmitool.

    $ ipmitool -I lanplus -H <DGX-BMC-IP> -U <username> -P <password> chassis bootdev pxe options=efiboot
    $ ipmitool -I lanplus -H <DGX-BMC-IP> -U <username> -P <password> chassis power reset
    

    When the system PXE menu appears, select the Firmware Update Container option. If the boot configuration is set to non-interactive mode, the firmware is updated automatically after the system has booted. Otherwise, use the firmware update menu to start the update.

  8. If the FPGA firmware was updated, perform a DC power cycle by issuing the following command.

    $ sudo ipmitool -I lanplus -H ${BMC_IP} -U ${BMC_USER} -P ${BMC_PW} chassis power cycle
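
Before restarting the system, it can help to confirm that the files from step 3 are in place and to generate the menu entry from step 4 for a specific server address. The following is a minimal sketch. The function names check_fw_pxe_tree and print_fw_pxe_entry are hypothetical, and the sketch assumes that the HTTP server serves /local/http as its document root, as the URLs in step 4 suggest.

```shell
#!/bin/sh
# Hypothetical helper: confirm that the files copied in step 3 are present.
# $1: HTTP root directory (defaults to /local/http, as used in this document)
check_fw_pxe_tree() {
    root=${1:-/local/http}
    status=0
    for f in filesystem.squashfs initrd vmlinuz; do
        if [ ! -r "$root/firmware-update/$f" ]; then
            echo "missing or unreadable: $root/firmware-update/$f" >&2
            status=1
        fi
    done
    return $status
}

# Hypothetical helper: print the menu entry from step 4.
# $1: address of the HTTP server (SERVER_IP in this document)
print_fw_pxe_entry() {
    server=$1
    cat <<EOF
label Firmware Update Container
    menu label Firmware Update Container
    kernel http://$server/firmware-update/vmlinuz
    initrd http://$server/firmware-update/initrd
    append vga=788 initrd=initrd boot=live console=tty0 console=ttyS1,115200n8 apparmor=0 elevator=noop nvme-core.multipath=n nouveau.modeset=0 boot-live-env start-systemd-networkd fetch=http://$server/firmware-update/filesystem.squashfs
EOF
}
```

For example, running check_fw_pxe_tree /local/http followed by print_fw_pxe_entry with the server address prints an entry that can be reviewed before it is added to pxelinux.cfg/default.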