Redfish APIs Support

The DGX System firmware supports Redfish APIs. Redfish is DMTF’s standard set of APIs for managing and monitoring a platform. By default, Redfish support is enabled in the DGX H100 BMC and the BIOS. By using the Redfish interface, administrator-privileged users can browse physical resources at the chassis and system level through the REST API interface. Redfish categorizes information under specific resource endpoints, and Redfish clients can access the endpoints by using the following HTTP methods:

  • GET

  • POST

  • PATCH

  • PUT

  • DELETE

Not all endpoints support all of these operations. Refer to the Redfish JSON Schema for more information about the operations. The Redfish server follows the DSP0266 1.7.0 Specification and Redfish Schema 2019.1 documentation. Redfish URIs are accessed by using basic authentication, so IPMI users with the required privilege can access the Redfish URIs.
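
With HTTP basic authentication, every request carries the user name and password, base64-encoded, in an Authorization header. The -u option of curl builds this header for you. The following minimal sketch shows what is sent, assuming the base64 utility is installed; basic_auth_header is a hypothetical helper name used only for illustration.

```shell
# Print the Authorization header that HTTP basic authentication sends for a
# user name and password (the header that `curl -u user:pass` generates).
# basic_auth_header is a hypothetical helper, not part of any NVIDIA tool.
basic_auth_header() {
  printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Example with made-up credentials:
basic_auth_header admin secret
```

The credentials are only encoded, not encrypted, so their protection on the wire comes from HTTPS.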

Supported Redfish Features

The following Redfish features are supported in DGX H100:

  • Manage user accounts, privileges, and roles

  • Manager Sessions

  • BMC configuration

  • BIOS configuration

  • BIOS boot order management

  • Get PCIe device and functions inventory

  • Get storage Inventory

  • Get system component information and health (PSU, FAN, CPU, DIMM, and so on)

  • Get sensor information (Thermal/Power/Cooling)

  • BMC configuration change/BMC reset

  • System/Chassis power operations

  • Get health event log/advanced system event log

  • Logging Service, which provides critical/informational severity events

  • Event Services (SSE)

Connectivity Between the Host and BMC

You can configure internal network connectivity between the host and the BMC rather than using external network connectivity and routing traffic outside the host.

To configure internal network connectivity, you must configure an interface on the 169.254.0.0/255.255.0.0 network. The interface can then send and receive Redfish API traffic between the host and the BMC. The BMC is preconfigured to use the 169.254.0.17 IP address.

Run an ifconfig command like the following example to configure connectivity:

sudo ifconfig enx9638a3b292ec 169.254.0.18 netmask 255.255.0.0

Replace the network interface name and IP address in the preceding example according to your needs.
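
On hosts where ifconfig (net-tools) is not installed, the same configuration can be sketched with the ip command from iproute2. The /16 prefix is the equivalent of the 255.255.0.0 netmask. The helper name and the RUN variable are assumptions made for this example; by default the helper only prints the commands so that you can review them.

```shell
# Configure a link-local address for host-to-BMC traffic with iproute2.
# Usage: configure_bmc_link <interface> <host-ip>
# RUN defaults to "echo" (dry run); set RUN=sudo to apply the changes.
configure_bmc_link() {
  ${RUN:-echo} ip addr add "$2/16" dev "$1"
  ${RUN:-echo} ip link set dev "$1" up
}

# Dry run with the interface name and address from the ifconfig example:
configure_bmc_link enx9638a3b292ec 169.254.0.18
```

To apply the settings instead of printing them, run RUN=sudo configure_bmc_link with your own interface name and address.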

After you configure the network interface, you can use commands such as curl and nvfwupd with the 169.254.0.17 IP address to connect to the BMC and use the Redfish API.

The following example command shows the firmware versions:

nvfwupd -t ip=169.254.0.17 username=<bmc-user> password=<password> show_version

Redfish Examples

BMC Manager

  • Accounts

    The following curl command changes the password for the admin user.

    curl -k -u <bmc-user>:<password> --request PATCH 'https://<bmc-ip-address>/redfish/v1/AccountService/Accounts/1' --header 'If-Match: *'  --header 'Content-Type: application/json' --data-raw '{"Enabled" : true, "Password" : "DGXuser12345678!" , "UserName" : "admin" , "RoleId" : "Administrator" , "Locked" : false}'
    
  • Reset BMC

    The following curl command forces a reset of the DGX H100 BMC.

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/Actions/Manager.Reset'  --header 'Content-Type: application/json'  --data '{"ResetType":  "ForceRestart"}'
    
  • Reset BMC to factory defaults

    The following curl command resets the BMC to factory defaults.

    curl -k -u <bmc-user>:<password>  --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/Actions/Manager.ResetToDefaults'  --header 'Content-Type: application/json'  --data '{"ResetType":"ResetAll"}'
    

Firmware Update

  • Firmware inventory

    curl -k -u <bmc-user>:<password> --request GET  'https://<bmc-ip-address>/redfish/v1/UpdateService/FirmwareInventory'
    

    Example Output

    {
        "@odata.context": "/redfish/v1/$metadata#SoftwareInventoryCollection.SoftwareInventoryCollection",
        "@odata.etag": "\"1683226281\"",
        "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory",
        "@odata.type": "#SoftwareInventoryCollection.SoftwareInventoryCollection",
        "Description": "Collection of Firmware Inventory resources available to the UpdateService",
        "Members": [
            {
                "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/CPLDMB_0"
            },
            {
                "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/CPLDMID_0"
            },
            // ...
        ],
        "Members@odata.count": 66,
        "Name": "Firmware Inventory Collection",
        "Oem": {
            "Ami": {
                "FirmwareInventory": [
                    {
                        "DataSourceUri": "/redfish/v1/UpdateService/FirmwareInventory/CPLDMB_0",
                        "Name": "CPLDMB_0",
                        "Version": "0.2.1.6"
                    },
                    {
                        "DataSourceUri": "/redfish/v1/UpdateService/FirmwareInventory/CPLDMID_0",
                        "Name": "CPLDMID_0",
                        "Version": "0.2.0.7"
                    },
                    // ...
                ]
            }
        }
    }
    
  • Update GPU tray components

    To update the GPU tray components in your DGX H100 system, you need to specify HGX_0 as the target regardless of the GPU tray component that you want to update.

    echo "{\"Targets\":[\"/redfish/v1/UpdateService/FirmwareInventory/HGX_0\"]}" > parameters.json
    curl -k -u <bmc-user>:<password> -H 'Expect:' --location --request POST https://<bmc-ip-address>/redfish/v1/UpdateService/upload -F 'UpdateParameters=@parameters.json;type=application/json' -F UpdateFile=@<fw_bundle>
    

    Make sure to specify the nvfw_DGX-HGX-H100x8_0002_xxxxxx.x.x_prod-signed.fwpkg firmware file.

  • Update motherboard tray components

    To update the motherboard tray components, you need to specify the component name as a target in a JSON file. The following example updates the host BMC:

    echo "{\"Targets\":[\"/redfish/v1/UpdateService/FirmwareInventory/HostBMC_0\"]}" > parameters.json
    curl -k -u <bmc-user>:<password> -H 'Expect:' --location --request POST https://<bmc-ip-address>/redfish/v1/UpdateService/upload -F 'UpdateParameters=@parameters.json;type=application/json' -F UpdateFile=@<fw_bundle>
    

    The following targets are available:

    • HostBMC_0 — This is the DGX H100 BMC.

    • HostBIOS_0 — This is the DGX H100 BIOS.

    • EROT_BMC_0 — This is the external root of trust for the host BMC.

    • EROT_BIOS_0 — This is the external root of trust for the host BIOS.

    • CPLDMID_0 — This is the midplane CPLD.

    • CPLDMB_0 — This is the CPU tray CPLD.

    • PSU_0 to PSU_5 — These are the PSUs.

    • PCIeSwitch_0 and PCIeSwitch_1 — These are the Gen5 PCIe switches on the CPU tray.

    • PCIeRetimer_0 and PCIeRetimer_1 — These are the PCIe retimers on the CPU tray.

    To update a target, change the path /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0 in the preceding example. For example, for CPU tray CPLD, specify /redfish/v1/UpdateService/FirmwareInventory/CPLDMB_0.

    Make sure to specify the nvfw_DGX-HGX-H100x8_0002_xxxxxx.x.x_prod-signed.fwpkg firmware file.

  • Forced Update

    By default, the firmware of a DGX H100 system component is updated only if the incoming firmware version is newer than the installed version. To override this behavior and flash the component anyway, specify the ForceUpdate field and set it to true.

    curl -k -u <bmc-user>:<password> --request PATCH 'https://<bmc-ip-address>/redfish/v1/UpdateService' --header 'If-Match: *'  --header 'Content-Type: application/json' --data-raw '{"HttpPushUriOptions" : {"ForceUpdate": true}}'
    

    On success, the command returns a 204 HTTP status code. If you attempt to set the flag to the currently set value, the command returns a 400 HTTP status code.

    To get the value of the ForceUpdate parameter:

    curl -k -u <bmc-user>:<password> --request GET 'https://<bmc-ip-address>/redfish/v1/UpdateService'
    
  • Firmware Update Activation

    To activate the firmware update, refer to Firmware Update Activation in the NVIDIA DGX H100 Firmware Update Guide.
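
The parameters.json file in the update examples differs only in the target name, so it can be generated per component. The following sketch assumes the target names listed in this section; the function name and the validation step are illustrative, not part of the BMC or of nvfwupd.

```shell
# Print the UpdateParameters JSON for one firmware inventory target.
# Usage: make_update_parameters <target> > parameters.json
# Rejects names that are not in the target list from this section.
make_update_parameters() {
  case "$1" in
    HGX_0|HostBMC_0|HostBIOS_0|EROT_BMC_0|EROT_BIOS_0|CPLDMID_0|CPLDMB_0) ;;
    PSU_[0-5]|PCIeSwitch_[01]|PCIeRetimer_[01]) ;;
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
  printf '{"Targets":["/redfish/v1/UpdateService/FirmwareInventory/%s"]}\n' "$1"
}

# Example: parameters for the CPU tray CPLD.
make_update_parameters CPLDMB_0
```

Redirect the output to parameters.json and pass the file to the upload command with -F 'UpdateParameters=@parameters.json;type=application/json', as in the examples in this section.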

BIOS Settings

  • Supported BIOS attributes

    1. Get a list of all the attributes your particular BIOS supports:

      curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Registries'
      

      One of the Registries in the list is your BIOS attribute registry. The format is BiosAttributeRegistry<version><version>. For example, for BIOS 1.0.6, the registry is BiosAttributeRegistry106.1.0.6.

    2. Get the URI of the BIOS registry:

      curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Registries/BiosAttributeRegistry106.1.0.6/'
      

      The response includes the location of the JSON file that describes all the BIOS attributes. The Uri is specified under Location. For example, "Uri":"/redfish/v1/Registries/BiosAttributeRegistry106.1.0.6".

    3. Get the JSON file with the registry of all your BIOS attributes:

      curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Registries/BiosAttributeRegistry106.en-US.1.0.6.json' --output BiosAttributeRegistry106.en-US.1.0.6.json
      

      Each attribute name has a default value, display name, help text, a read-only indicator, and an indicator of whether a reset is required to take effect.
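
The registry names in the steps above appear to follow a pattern based on the BIOS version. The following sketch derives both names from a dotted version string. The pattern is inferred from the example names BiosAttributeRegistry106.1.0.6 and BiosAttributeRegistry106.en-US.1.0.6.json, so treat it as an assumption and confirm the result against the Registries list from step 1. The function names are hypothetical.

```shell
# Derive the BIOS attribute registry ID from a dotted BIOS version.
# Assumed pattern: "BiosAttributeRegistry" + version without dots + "." + version.
bios_registry_id() {
  printf 'BiosAttributeRegistry%s.%s\n' "$(printf '%s' "$1" | tr -d '.')" "$1"
}

# Derive the name of the en-US registry JSON file for the same version.
bios_registry_file() {
  printf 'BiosAttributeRegistry%s.en-US.%s.json\n' "$(printf '%s' "$1" | tr -d '.')" "$1"
}

bios_registry_id 1.0.6
bios_registry_file 1.0.6
```

Versions with multi-digit fields might not follow this pattern, which is another reason to check the names that the BMC actually reports.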

To get the current value of all your attributes from the BIOS:

curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Bios/SD'

Match the attribute name with the value in the registry for a description.

To change an attribute, send a PATCH request to the SD URI and specify the attribute name with the new value. You can change more than one attribute in a single request. For example, the following PATCH request specifies how the system responds when the SEL log is full:

curl -k -u <bmc-user>:<password> --location --request PATCH 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Bios/SD'   -H 'Content-Type: application/json' -H 'If-Match:*' --data-raw '{"Attributes" : {"IPMI002":"IPMI002DoNothing", "IPMI201":"IPMI201Donotloganymore"}}'
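
When several attributes change together, the request body can be assembled from NAME=VALUE pairs. This is a minimal sketch with a hypothetical helper name. It assumes that all values are strings, as in the preceding example, and that names and values contain no characters that need JSON escaping.

```shell
# Build the JSON body for a BIOS attribute PATCH from NAME=VALUE arguments.
# Assumption: string values only, no characters that require JSON escaping.
bios_attributes_payload() {
  sep=''
  printf '{"Attributes":{'
  for pair in "$@"; do
    # Text before the first "=" is the attribute name, the rest is the value.
    printf '%s"%s":"%s"' "$sep" "${pair%%=*}" "${pair#*=}"
    sep=','
  done
  printf '}}\n'
}

bios_attributes_payload IPMI002=IPMI002DoNothing IPMI201=IPMI201Donotloganymore
```

The output can be passed to curl with --data-raw "$(bios_attributes_payload ...)" in place of the literal JSON string.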

Modifying the Boot Order on DGX H100 Using Redfish

To modify the boot order on DGX H100 using Redfish APIs, follow the steps described in this procedure.

  1. Read the current boot order.

    From any system in the same network as the BMC, run the following curl command to get the current boot order:

    $ curl -k -u <BMC username>:<BMC password> https://<BMC_IP_address>/redfish/v1/Systems/DGX/SD -H "content-type:application/json" -X GET -s | jq .Boot.BootOrder
    
    [
      "Boot0000",
      "Boot000F",
      "Boot0004",
      "Boot0005",
      "Boot0006",
      "Boot0007",
      "Boot0008",
      "Boot0009",
      "Boot000A",
      "Boot0010"
    ]
    
  2. Identify the available boot devices.

    To show more information about the boot devices in step 1, such as Boot0000, Boot000F, and Boot0004, run the following command:

    $ curl -k -u <BMC username>:<BMC password> https://<BMC_IP_address>/redfish/v1/Systems/DGX/BootOptions/00{0,1}{0,4,5,6,7,8,9,A,F} -H "content-type:application/json" -X GET -s  | jq |grep -e "UefiDevicePath\|Name"
    
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "DGX OS",
    "Name": "Boot0000",
    "UefiDevicePath": "HD(1,GPT,159C2E52-2329-40AC-9103-6C28DC1528B8,0x800,0x100000)/\\EFI\\UBUNTU\\SHIMX64.EFI"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Intel(R) Ethernet Controller X550",
    "Name": "Boot0004",
    "UefiDevicePath": "PciRoot(0x0)/Pci(0x10,0x0)/Pci(0x0,0x0)/MAC(5CFF35FBDA09,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B1:6C",
    "Name": "Boot0005",
    "UefiDevicePath": "PciRoot(0x20)/Pci(0x1,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/MAC(B83FD2E7B16C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B1:6D",
    "Name": "Boot0006",
    "UefiDevicePath": "PciRoot(0x20)/Pci(0x1,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x1)/MAC(B83FD2E7B16D,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B0:9C",
    "Name": "Boot0007",
    "UefiDevicePath": "PciRoot(0x120)/Pci(0x1,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/MAC(B83FD2E7B09C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B0:9D",
    "Name": "Boot0008",
    "UefiDevicePath": "PciRoot(0x120)/Pci(0x1,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x1)/MAC(B83FD2E7B09D,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Intel(R) Ethernet Network Adapter E810-C-Q2",
    "Name": "Boot0009",
    "UefiDevicePath": "PciRoot(0x160)/Pci(0x5,0x0)/Pci(0x0,0x0)/MAC(6CFE543D8F48,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Intel(R) Ethernet Network Adapter E810-C-Q2",
    "Name": "Boot000A",
    "UefiDevicePath": "PciRoot(0x160)/Pci(0x5,0x0)/Pci(0x0,0x1)/MAC(6CFE543D8F49,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "ubuntu",
    "Name": "Boot000F",
    "UefiDevicePath": "HD(1,GPT,1E0EFF2A-2BF3-4DC6-8757-4075B1E5343D,0x800,0x100000)/\\EFI\\UBUNTU\\SHIMX64.EFI"
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 American Megatrends Inc.",
    "Name": "Boot0010",
    "UefiDevicePath": "PciRoot(0x0)/Pci(0x14,0x0)/USB(0xA,0x0)/USB(0x2,0x1)/MAC(4E2A712C2451,0x0)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"
    

    Where

    • The DisplayName string is the name of the drive or network adapter.

    • The Name string is the boot device name.

    • The MAC(<address>,0x1) value for the UefiDevicePath string is the corresponding MAC address.

    • The @odata.etag string is the etag number.

    Identify the following information from the JSON output for the next step:

    • The name of the device to be the boot device.

    • The etag number to compose the header.

  3. Update the boot order.

    The following command uses the PATCH method to modify the BootOrder settings, specifying the etag number and boot device names from step 2. The command generates a new order list for BootOrder, which affects the next boot of the system.

    $ curl -k -u <BMC username>:<BMC password> https://<BMC_IP_address>/redfish/v1/Systems/DGX/SD -H "content-type:application/json" -H 'if-None-Match: "@odata.etag": "1697483651"' --data '{"Boot":{"BootOrder": ["Boot0004", "Boot0000", "Boot0005", "Boot0006", "Boot0007", "Boot0008", "Boot0009", "Boot000A", "Boot000F", "Boot0010"]}}' -X PATCH
    
  4. Confirm the boot order.

    Repeat the command in step 1 to ensure the BootOrder settings are as expected. Note that the Boot0004 boot device is now at the top and the system will boot from the on-board RJ-45 network interface.

    $ curl -k -u <BMC username>:<BMC password> https://<BMC_IP_address>/redfish/v1/Systems/DGX/SD -H "content-type:application/json" -X GET -s | jq .Boot.BootOrder
    
    [
      "Boot0004",
      "Boot0000",
      "Boot0005",
      "Boot0006",
      "Boot0007",
      "Boot0008",
      "Boot0009",
      "Boot000A",
      "Boot000F",
      "Boot0010"
    ]
    

    Upon reboot, the system should attempt to boot from the network using the correct network interface:

    _images/dgx-h100-boot-order.png

This boot order change will remain until the next boot order update, which can be done by resetting the SBIOS or running this procedure again.
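
The PATCH body in step 3 can be generated from the current boot order instead of being typed by hand. The following sketch moves one entry to the front and keeps the remaining entries in their current relative order, which is a simpler reordering than the one in the procedure (the procedure also moves Boot000F). The function name is hypothetical.

```shell
# Build the BootOrder PATCH body that moves one entry to the front.
# Usage: boot_order_payload <entry-to-boot-first> <current-entry>...
boot_order_payload() {
  first=$1
  shift
  printf '{"Boot":{"BootOrder":["%s"' "$first"
  for entry in "$@"; do
    # Skip the promoted entry at its old position; keep all others in order.
    [ "$entry" = "$first" ] || printf ',"%s"' "$entry"
  done
  printf ']}}\n'
}

boot_order_payload Boot0004 Boot0000 Boot000F Boot0004 Boot0005
```

Pass the current order from step 1 as the trailing arguments, then send the output as the --data value of the PATCH request.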

Telemetry

  • GPU tray sensors

    curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/TelemetryService/MetricReportDefinitions/HGX_PlatformEnvironmentMetrics_0'
    
  • DGX platform sensors

    curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Chassis/DGX/Sensors'
    

    The endpoint returns 75 members at a time. To page through the results, use the URI in the Members@odata.nextLink field. For example, /redfish/v1/Chassis/DGX/Sensors?$skip=75.

Chassis

  • Chassis Restart (IPMI chassis power cycle)

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset'  --header 'Content-Type: application/json'  --data '{"ResetType":  "ForceRestart"}'
    
  • Chassis Start (IPMI chassis power on)

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset'  --header 'Content-Type: application/json'  --data '{"ResetType":  "On"}'
    
  • Chassis Off (IPMI chassis power off)

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset'  --header 'Content-Type: application/json'  --data '{"ResetType":  "ForceOff"}'
    
  • Chassis Off Gracefully (IPMI chassis soft)

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset'  --header 'Content-Type: application/json'  --data '{"ResetType":  "GracefulShutdown"}'
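
The four requests above differ only in the ResetType value. As a sketch, the mapping from the IPMI-style action to the Redfish value can be kept in one place. The action keywords (cycle, on, off, soft) and the function name are assumptions chosen to mirror the IPMI descriptions in the list above.

```shell
# Print the ComputerSystem.Reset request body for an IPMI-style action.
# Usage: reset_payload cycle|on|off|soft
reset_payload() {
  case "$1" in
    cycle) reset_type=ForceRestart ;;
    on)    reset_type=On ;;
    off)   reset_type=ForceOff ;;
    soft)  reset_type=GracefulShutdown ;;
    *) echo "usage: reset_payload cycle|on|off|soft" >&2; return 1 ;;
  esac
  printf '{"ResetType":"%s"}\n' "$reset_type"
}

reset_payload soft
```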
    

SEL Logs

To view all the SEL entries using Redfish:

curl -k -u <bmc-user>:<password> --location --request GET 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/LogServices/SEL/Entries'

The endpoint returns 75 members at a time. To page through the results, use the URI in the Members@odata.nextLink field. For example, /redfish/v1/Managers/BMC/LogServices/SEL/Entries?$skip=75.
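
Paging can be automated by following Members@odata.nextLink until the field is no longer present. The following is a minimal sketch, not a definitive implementation: the fetch command is passed in as an argument so that the loop can be tried without a BMC, and the sed expression is a simplification that assumes the link value contains no escaped quotes. The same loop applies to the Sensors endpoint. All function names are hypothetical.

```shell
# Follow Members@odata.nextLink until no further page is reported.
# Usage: fetch_all_pages <fetch-command> <first-uri>
# <fetch-command> is any command or function that prints the JSON body
# for a URI path. Each page is printed as it is retrieved.
fetch_all_pages() {
  fetch=$1
  uri=$2
  while [ -n "$uri" ]; do
    body=$("$fetch" "$uri") || return 1
    printf '%s\n' "$body"
    # Extract the next page URI; empty when the field is absent.
    uri=$(printf '%s\n' "$body" |
      sed -n 's/.*"Members@odata\.nextLink"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
      head -n 1)
  done
}

# Example fetch command for a real BMC (placeholders as in this document):
# bmc_get() { curl -k -s -u '<bmc-user>:<password>' "https://<bmc-ip-address>$1"; }
# fetch_all_pages bmc_get /redfish/v1/Managers/BMC/LogServices/SEL/Entries
```

If jq is installed, the expression jq -r '.["Members@odata.nextLink"] // empty' is a more robust way to read the link than sed.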

Virtual Image

  1. Make sure Virtual Media is enabled:

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/Actions/Oem/AMIVirtualMedia.EnableRMedia' --data-raw '{"RMediaState": "Enable"}'
    
  2. Mount the media:

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/Self/VirtualMedia/CD_1/Actions/VirtualMedia.InsertMedia' --data-raw '{"Image" : "//<serverip>/home/nvidia/images/ubuntu-20.04.2-live-server-amd64.iso","TransferProtocolType" : "NFS"}'
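
The InsertMedia body combines the NFS server address and the absolute path of the image in the form //<serverip>/<path>. The following sketch builds the body from those two values. The helper name is hypothetical, and 192.0.2.10 is a documentation-only address that you replace with your NFS server.

```shell
# Build the InsertMedia request body for an ISO image exported over NFS.
# Usage: insert_media_payload <nfs-server-ip> <absolute-path-to-iso>
insert_media_payload() {
  printf '{"Image":"//%s%s","TransferProtocolType":"NFS"}\n' "$1" "$2"
}

insert_media_payload 192.0.2.10 /home/nvidia/images/ubuntu-20.04.2-live-server-amd64.iso
```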
    

Collect BMC Debug Data

  1. Create a request for BMC to start collecting debug data:

    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/LogServices/DiagnosticLog/Actions/LogService.CollectDiagnosticData' -H 'Content-Type: application/json' --data-raw '{"DiagnosticDataType" : "OEM"}'
    

    Example Output

    {
      "@odata.context": "/redfish/v1/$metadata#Task.Task",
      "@odata.id": "/redfish/v1/TaskService/Tasks/1",
      "@odata.type": "#Task.v1_4_2.Task",
      "Description": "Task for Manager CollectDiagnosticData",
      "Id": "1",
      "Name": "Manager CollectDiagnosticData",
      "TaskState": "New"
    }
    
  2. Monitor the returned task until it completes. Change the task number as appropriate:

    curl -k -u <bmc-user>:<password> --request GET 'https://<bmc-ip-address>/redfish/v1/TaskService/Tasks/1'
    
  3. After the TaskState field reports Completed, download the attachments:

    curl -k -u <bmc-user>:<password> --request GET 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/LogServices/DiagnosticLog/Entries/All/Attachment' --output debugBMC.tgz
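
Step 2 can be scripted as a polling loop. The following sketch reads TaskState from the task resource and stops when the task reaches a terminal state. The state names come from the Redfish Task schema. The function names, the injected fetch command, and the 10-second default interval are assumptions made for this example, and the sed expression is a simplification of real JSON parsing.

```shell
# Extract the TaskState value from a Task resource read on standard input.
task_state() {
  sed -n 's/.*"TaskState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll a task URI until it reaches a terminal state.
# Usage: wait_for_task <fetch-command> <task-uri> [interval-seconds]
# Returns 0 on Completed and 1 on Exception, Killed, or Cancelled.
# Keeps polling for any other state, so interrupt it if the BMC stops responding.
wait_for_task() {
  while :; do
    state=$("$1" "$2" | task_state)
    case "$state" in
      Completed) return 0 ;;
      Exception|Killed|Cancelled) echo "task ended: $state" >&2; return 1 ;;
    esac
    sleep "${3:-10}"
  done
}

# Example fetch command for a real BMC (placeholders as in this document):
# bmc_get() { curl -k -s -u '<bmc-user>:<password>' "https://<bmc-ip-address>$1"; }
# wait_for_task bmc_get /redfish/v1/TaskService/Tasks/1 && echo "ready to download"
```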
    

Clear BIOS and Reset to Factory Defaults

To clear the BIOS and reset the system to factory defaults:

echo "{\"Targets\":[\"/redfish/v1/UpdateService/FirmwareInventory/HostBIOS_0\"]}" > parameters.json
curl -k -u <bmc-user>:<password> -H 'Expect:' --location --request POST https://<bmc-ip-address>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.ClearNVRAM -F 'UpdateParameters=@parameters.json;type=application/json'