GH200 Update Examples

The firmware update mechanism for GH200 is different from the update mechanism for MGX in the following ways:

  • GH200 uses two fwpkg bundles.

    • The P3809 packages are for the BMC tray.

    • The P4764 packages are for the GPU (HGX) tray.

  • The update target file is passed with the -s option, which is always required to specify the update target (refer to the sample outputs in the following sections).

  • To downgrade the GH200 firmware, set the ForceUpdate flag in the update target JSON file that is passed with the -s option.

Updating the GH200 BMC Tray

To update the GH200 BMC tray:

  1. Use the package with the P3809 substring in the package name.

  2. Create a JSON file like the bmc_full.json file in the example.

  3. Pass the file to the update_fw command as input.

Here is a full example:

$ cat bmc_full.json

{
    "Targets":[]
}

$ nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 update_fw -s bmc_full.json -p nvfw_P3809_0001_240405.1.0_dbg-signed.fwpkg

Updating ip address: ip=XXXX
FW package: ['nvfw_P3809_0001_240405.1.0_dbg-signed.fwpkg']
Ok to proceed with firmware update? <Y/N>
y
{"@odata.id": "/redfish/v1/TaskService/Tasks/3", "@odata.type": "#Task.v1_4_3.Task", "Id": "3", "TaskState": "Running", "TaskStatus": "OK"}
FW update started, Task Id: 3
Wait for Firmware Update to Start...
TaskState: Running
PercentComplete: 20
TaskStatus: OK
TaskState: Running
PercentComplete: 40
TaskStatus: OK
TaskState: Completed
PercentComplete: 100
TaskStatus: OK
Firmware update successful!
Overall Time Taken: 0:09:58
Refer to 'NVIDIA Firmware Update Document' on activation steps for new firmware to take effect.

------------------------------------------------------------------------------------------

Error Code: 0
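
The package selection in step 1 can be scripted. The sketch below is illustrative only: find_pkg is a hypothetical helper, not an nvfwupd feature, and it assumes that all fwpkg files sit in a single directory and that the P3809 substring appears in the file name as it does in the example.

```shell
# Hypothetical helper, not part of nvfwupd.
# Print the first .fwpkg file in directory $1 whose name contains substring $2.
# Returns non-zero if no such file exists.
find_pkg() {
    local dir=$1 tag=$2 f
    for f in "$dir"/*"$tag"*.fwpkg; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Usage for the BMC tray (the update itself is the command shown above):
#   pkg=$(find_pkg . P3809) || { echo "no P3809 package found" >&2; exit 1; }
#   printf '{\n    "Targets": []\n}\n' > bmc_full.json
#   nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 update_fw -s bmc_full.json -p "$pkg"
```

Selecting the package by substring keeps the script independent of the version and signing suffix in the file name.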

Updating the GH200 HGX Tray

To update the GH200 HGX tray:

  1. Use the package with the P4764 substring in the package name.

  2. Create a JSON file like the hgx_full.json file in the example.

  3. Pass the file to the update_fw command as input, as shown in the example.

Here is an example:

$ cat hgx_full.json

{
    "Targets": ["/redfish/v1/Chassis/HGX_Chassis_0"]
}

$ nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 update_fw -s hgx_full.json -p nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg

Updating ip address: ip=XXXX
FW package: ['nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg']
Ok to proceed with firmware update? <Y/N>
y
{"@odata.id": "/redfish/v1/TaskService/Tasks/HGX_3", "@odata.type": "#Task.v1_4_3.Task", "Id": "HGX_3", "TaskState": "Running", "TaskStatus": "OK"}
FW update started, Task Id: HGX_3
Wait for Firmware Update to Start...
TaskState: Running
PercentComplete: 20
TaskStatus: OK
TaskState: Running
PercentComplete: 40
TaskStatus: OK
TaskState: Completed
PercentComplete: 100
TaskStatus: OK
Firmware update successful!
Overall Time Taken: 0:09:50
Refer to 'NVIDIA Firmware Update Document' on activation steps for new firmware to take effect.
------------------------------------------------------------------------------------------
Error Code: 0
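
Each sample run ends with a "Firmware update successful!" line followed by "Error Code: 0". When the update is driven from a script, those two lines can be checked in the captured console output. The following sketch assumes the output was saved to a file (for example, with tee). update_succeeded is a hypothetical helper, and the sketch deliberately does not rely on the exit status of nvfwupd, which this section does not describe.

```shell
# Hypothetical helper, not part of nvfwupd.
# Return 0 only if the saved console output in $1 contains a
# "Firmware update successful!" line and its last "Error Code:" line reports 0.
update_succeeded() {
    local log=$1 code
    grep -q '^Firmware update successful!' "$log" || return 1
    # Take the last "Error Code:" line and extract its value (third field).
    code=$(grep '^Error Code:' "$log" | tail -n 1 | awk '{print $3}')
    [ "$code" = "0" ]
}

# Usage, assuming the console output was saved to update.log:
#   update_succeeded update.log && echo "update OK" || echo "update failed" >&2
```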

Updating the Firmware for Selected Components

To update the firmware of a single component, first identify the inventory name of the component by running the show_version command.

$ nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 show_version -p nvfw_P3809_0001_240405.1.0_dbg-signed.fwpkg nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg

System Model: P4764-A01
Part number: 699-24764-0000-TS2
Serial number: 1580624630119
Packages: ['P3809_0001_240405.1.0', 'P4764_0001_240405.1.0']
Connection Status: Successful
Firmware Devices:
AP Name             Sys Version                 Pkg Version         Up-To-Date
-------             -----------                 -----------         ---------

FW_BMC_0            GH200-24.05-1               OberonGH-24.03-C    No
FW_CPLD_0           0.00                        N/A                 No
HGX_FW_FPGA_0       0.39                        0.3A                No
UEFI                buildbrain-gcid-36102704    N/A                 No
HGX_FW_BMC_0        GH200-24.05-1               24.03.C             Yes
HGX_FW_CPLD_0       0.15                        0.17                No
HGX_FW_CPU_0        02.00.02                    01.02.04            Yes
HGX_FW_CPU_1        02.00.02                    01.02.04            Yes
HGX_FW_ERoT_BMC_0   01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_ERoT_CPU_0   01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_ERoT_CPU_1   01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_ERoT_FPGA_0  01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_ERoT_GPU_0   01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_ERoT_GPU_1   01.03.0136.0000_n01         01.03.0136.0000_n01 Yes
HGX_FW_GPU_0        96.00.91.00.02              96.00.9F.00.03      No
HGX_FW_GPU_1        96.00.91.00.02              96.00.9F.00.03      No
HGX_InfoROM_GPU_0   G530.0225.00.02             N/A                 No
HGX_InfoROM_GPU_1   G530.0225.00.02             N/A                 No
-------------------------------------------------------------------------------------------
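
Picking out the components that need an update can be automated by filtering this table. The sketch below is a hypothetical helper, not an nvfwupd feature. It assumes the column layout shown above, that version strings contain no spaces, and that rows with a Pkg Version of N/A can be skipped because the supplied packages report no version for them.

```shell
# Hypothetical helper, not part of nvfwupd.
# Read show_version output on stdin and print the AP Name of every component
# whose Up-To-Date column is "No" and whose Pkg Version is not "N/A".
outdated_components() {
    awk '
        /^AP Name/ { in_table = 1; next }   # the header row starts the table
        in_table && NF == 4 && $4 == "No" && $3 != "N/A" { print $1 }
    '
}

# Usage, assuming the show_version output was saved to versions.txt:
#   outdated_components < versions.txt
```

The names it prints are the inventory names used to build the Redfish target URI for each component.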

After identifying the inventory name, create a JSON file that contains the Redfish inventory URI of that component (/redfish/v1/UpdateService/FirmwareInventory/<component name>). The following example shows a sample JSON file for updating the HGX_FW_CPU_0 component:

$ cat updparams.json
{
    "Targets":["/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_CPU_0"]
}

$ nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 update_fw -s updparams.json -p nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg
Updating ip address: ip=XXXX
FW package: ['nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg']
Ok to proceed with firmware update? <Y/N>
y
{"@odata.id": "/redfish/v1/TaskService/Tasks/HGX_3", "@odata.type": "#Task.v1_4_3.Task", "Id": "HGX_3", "TaskState": "Running", "TaskStatus": "OK"}
FW update started, Task Id: HGX_3
Wait for Firmware Update to Start...
TaskState: Running
PercentComplete: 20
TaskStatus: OK
TaskState: Running
PercentComplete: 40
TaskStatus: OK
TaskState: Completed
PercentComplete: 100
TaskStatus: OK
Firmware update successful!
Overall Time Taken: 0:09:50

Refer to 'NVIDIA Firmware Update Document' on activation steps for new firmware to take effect.
------------------------------------------------------------------------------------------

Error Code: 0
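
Building the update target file from an inventory name is mechanical, so it can be generated instead of written by hand. The following is a minimal sketch. component_targets_json is a hypothetical helper, not part of nvfwupd, and it assumes that inventory names contain no characters that need JSON escaping. It accepts several names because "Targets" is a JSON list, but whether a platform accepts more than one target in a single request is not stated in this section.

```shell
# Hypothetical helper, not part of nvfwupd.
# Print an update target JSON document for the inventory names given as
# arguments, for example HGX_FW_CPU_0 from the AP Name column of show_version.
component_targets_json() {
    local base=/redfish/v1/UpdateService/FirmwareInventory
    local name list="" sep=""
    for name in "$@"; do
        list="${list}${sep}\"${base}/${name}\""
        sep=", "
    done
    printf '{\n    "Targets": [%s]\n}\n' "$list"
}

# Usage (writes the file that is passed to update_fw with -s):
#   component_targets_json HGX_FW_CPU_0 > updparams.json
```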

Downgrading the Firmware in GH200 Using a Force Update

To downgrade GH200 firmware, you must set the force update multipart option. Set this option in the update parameters JSON file, together with the targets, and pass the file with the -s option. If you attempt a firmware update as described in the previous sections and the firmware update log shows the following error message:

Component comparison stamp is lower than the firmware component comparison stamp in the FD.

Retry with a force firmware update as shown in the following example.

Note

MGX platforms do not need a force update because downgrades are allowed by default.

Here is an example:

$ cat updparams.json

{
    "ForceUpdate":true,
    "Targets":["/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_CPU_0"]
}

$ nvfwupd --target ip=<BMC IP> user=*** password=*** servertype=GH200 update_fw -s updparams.json -p nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg

Updating ip address: ip=XXXX
FW package: ['nvfw_P4764_0001_240405.1.0_dbg-signed.fwpkg']
Ok to proceed with firmware update? <Y/N>
y
{"@odata.id": "/redfish/v1/TaskService/Tasks/HGX_3", "@odata.type": "#Task.v1_4_3.Task", "Id": "HGX_3", "TaskState": "Running", "TaskStatus": "OK"}
FW update started, Task Id: HGX_3
Wait for Firmware Update to Start...
TaskState: Running
PercentComplete: 20
TaskStatus: OK
TaskState: Running
PercentComplete: 40
TaskStatus: OK
TaskState: Completed
PercentComplete: 100
TaskStatus: OK
Firmware update successful!
Overall Time Taken: 0:09:50
Refer to 'NVIDIA Firmware Update Document' on activation steps for new firmware to take effect.
------------------------------------------------------------------------------------------
Error Code: 0
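
The retry decision described in this section can be sketched in a script. Both helpers below are hypothetical and not part of nvfwupd. The sketch assumes that the firmware update log text has been saved to a file and that target URIs contain no characters that need JSON escaping.

```shell
# Hypothetical helpers, not part of nvfwupd.

# Return 0 if the saved firmware update log in $1 contains the downgrade
# rejection message quoted in this section.
needs_force_update() {
    grep -q 'Component comparison stamp is lower than the firmware component comparison stamp' "$1"
}

# Print an update target JSON document with ForceUpdate set to true for the
# Redfish target URIs given as arguments.
force_update_json() {
    local uri list="" sep=""
    for uri in "$@"; do
        list="${list}${sep}\"${uri}\""
        sep=", "
    done
    printf '{\n    "ForceUpdate": true,\n    "Targets": [%s]\n}\n' "$list"
}

# Usage, assuming the log was saved to fw_update.log:
#   if needs_force_update fw_update.log; then
#       force_update_json "<target URI>" > updparams.json
#   fi
```

Setting ForceUpdate only after the comparison stamp error appears keeps ordinary updates on the default path and reserves the forced path for intentional downgrades.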

Activating the Firmware

After updating the firmware of a component or a full bundle, complete an AC power cycle to activate the new firmware. It can take up to five minutes for the BMC and the Redfish service to come up after the power cycle is complete. The activate_fw command returns immediately after issuing the command and does not wait for the BMC to come back up. To check the new system versions after the BMC Redfish service is back, run the show_version command.
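
Because the BMC can take up to five minutes to return, a script can poll for the Redfish service before checking versions. The following is a minimal sketch. wait_until_up is a hypothetical helper, not part of nvfwupd. The usage comment assumes that curl is installed and that the BMC serves the standard Redfish service root at /redfish/v1/.

```shell
# Hypothetical helper, not part of nvfwupd.
# Run a probe command repeatedly until it succeeds or the timeout expires.
#   $1 = timeout in seconds
#   $2 = seconds to sleep between attempts (use a value of 1 or more)
#   remaining arguments = probe command and its arguments
wait_until_up() {
    local timeout=$1 interval=$2 waited=0
    shift 2
    while ! "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Usage: wait up to 300 seconds (five minutes), probing every 10 seconds.
# -k skips TLS certificate verification, -s silences progress output,
# -f makes curl fail on HTTP errors, -m 5 limits each attempt to 5 seconds.
#   wait_until_up 300 10 curl -ksf -m 5 "https://<BMC IP>/redfish/v1/" ||
#       echo "Redfish service did not come back in time" >&2
```

Once the probe succeeds, run the show_version command to confirm the new system versions.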