BlueField-3 Administrator Quick Start Guide

NVIDIA BlueField-3 Networking Platform User Guide

This page is tailored for system administrators wishing to install BlueField and perform sample administrative actions on it. For a quick start guide aimed at software developers wishing to develop applications on the BlueField card using the DOCA framework, please refer to the NVIDIA DOCA Developer Quick Start Guide.

Note

Not sure which guide to follow? For more details on the different BlueField user types, please refer to the NVIDIA BlueField and DOCA User Types document.

  • DPU BMC 1GbE interface connected to the management network via ToR

  • Remote Management Controller (RMC) connected to DPU BMC 1GbE via ToR

    Note

    RMC is the platform for data center infrastructure managers to manage DPUs.

  • DHCP server existing in the management network

  • An NVQual certified server

[Figure: network elements]

The following illustrates the sequence of events and actions from first time power-up of the NVIDIA® BlueField® DPU in the data center environment through provisioning and maintenance.

Note

The numbers indicated in the sequence diagram map to the steps below the diagram.

[Figure: DPU provisioning flow sequence diagram]

At the end of this procedure, the DPU should be configured with an IP address and all required settings, have up-to-date software component versions, and be ready to use.

The DPU SoC boots to DPU UEFI BIOS and DHCP DISCOVER is sent

  1. DPU SoC runs UEFI/PXE which sends a DHCP DISCOVER over the 1GbE OOB interface, including vendor class ("NVIDIA/BF/PXE") for DPU SoC (to allow customer's server to differentiate between DPU SoC and DPU BMC), and MAC for identification and discovery (see Appendix A for more information).

  2. A customer's DHCP server inspects the MAC address and the vendor class, allocates an IP address, and continues the standard DHCP flow.

  3. DHCP server updates RMC of the new DPU discovered with detailed information (e.g., MAC, IP address, vendor class).

DPU BMC issues DHCP DISCOVER over the 1GbE OOB interface, including vendor class ("NVIDIA/BF/BMC") for DPU-BMC, and MAC for identification and discovery. Example of a DPU BMC DHCP DISCOVER packet structure (note the "NVIDIA/BF/BMC" value of the Vendor-Class option):

root@dpu-bmc:~#
18:18:10.563269 IP (tos 0xc0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 320)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from b8:3f:d2:ca:4b:26 (oui Unknown), length 292, xid 0xfc2acdec, secs 1, Flags [none] (0x0000)
      Client-Ethernet-Address b8:3f:d2:ab:cd:ef (oui Unknown)
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message (53), length 1: Discover
        Client-ID (61), length 7: ether b8:3f:d2:ab:cd:ef
        Parameter-Request (55), length 9:
          Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6), Hostname (12)
          Domain-Name (15), Static-Route (33), NTP (42), Unknown (120)
          Classless-Static-Route (121)
        MSZ (57), length 2: 576
        Hostname (12), length 7: "dpu-bmc"
        Vendor-Class (60), length 13: "NVIDIA/BF/BMC"
        END (255), length 0
18:18:10.565261 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto UDP (17), length 353)
    (example) dhcp01.XX.YY > ldev-platform-13-043-bmc.bootpc: [no cksum] BOOTP/DHCP, Reply, length 325, hops 1, xid 0xfc2acdec, secs 1, Flags [none] (0x0000)
      (example) Your-IP ldev-platform-13-043-bmc.XX.YY
      (example) Server-IP l-pxe02.XX.YY
      Gateway-IP 10.237.0.255
      Client-Ethernet-Address b8:3f:d2:ab:cd:ef (oui Unknown)
      file "pxelinux.0"
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message (53), length 1: Offer
        Server-ID (54), length 4: (example) dhcp01.XX.YY
        Lease-Time (51), length 4: 43200
        Subnet-Mask (1), length 4: 255.255.0.0
        Default-Gateway (3), length 4 (example) GW.XX.YY
        Hostname (12), length 24: "ldev-platform-13-043-bmc"
        Domain-Name (15), length 13: "<local domain name>"
        NTP (42), length 4: (example) NTP.XX.YY
        END (255), length 0
18:18:10.565261 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto UDP (17), length 353)
    dhcp01.XX.YY > ldev-platform-13-043-bmc.<local domain name>: [no cksum] BOOTP/DH

  1. DHCP server inspects the MAC address and the vendor class, allocates IP and continues the standard DHCP flow.

  2. DHCP server updates RMC of the new DPU BMC discovered with detailed information: MAC, IP address, vendor classes, etc.
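The vendor class is what lets a DHCP server tell the DPU SoC and the DPU BMC apart. The following is a minimal Python sketch of that decision, assuming the server exposes the DHCP option 60 value as a string. The function names, the role labels, and the record layout passed to the RMC are hypothetical and are not part of any NVIDIA API.

```python
# Vendor class strings quoted in this guide, mapped to hypothetical role labels.
VENDOR_CLASS_ROLES = {
    "NVIDIA/BF/PXE": "dpu-soc",
    "NVIDIA/BF/BMC": "dpu-bmc",
}


def classify_dhcp_client(vendor_class):
    """Return 'dpu-soc', 'dpu-bmc', or 'other' for a DHCP option 60 value."""
    return VENDOR_CLASS_ROLES.get(vendor_class.strip(), "other")


def discovery_record(mac, ip, vendor_class):
    """Bundle the details reported to the RMC (assumed record layout)."""
    return {
        "mac": mac.lower(),
        "ip": ip,
        "vendor_class": vendor_class,
        "role": classify_dhcp_client(vendor_class),
    }
```

A DHCP hook could call discovery_record() for each new lease and forward the result to the RMC by whatever mechanism the site already uses.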

To communicate with the DPU BMC, first change the default password (0penBmc) by sending the following Redfish request to the DPU BMC:

curl -k -u root:0penBmc -H "Content-Type: application/json" -X PATCH https://<DPU-BMC-IP>/redfish/v1/AccountService/Accounts/root -d '{"Password" : "<user-password>"}'

Where <DPU-BMC-IP> is the IP address for the DPU BMC (e.g., 10.10.1.2), and <user-password> is the chosen password to log into the DPU BMC with root privileges.

The new password must comply with the following policy parameters:

  • Minimum length: 13

  • Maximum length: 20

  • Minimum number of upper-case characters: 1

  • Minimum number of lower-case characters: 1

  • Minimum number of digits: 1

  • Minimum number of special characters: 1

    Note

    List of special characters:

    • $ (dollar sign)

    • % (percent sign)

    • ^ (caret/circumflex)

    • & (ampersand)

    • * (asterisk)

    • - (minus)

    • + (plus)

    • = (equal)

    • | (pipe)

    • ~ (tilde)

    • _ (underscore)

    • , (comma)

    • . (period/full stop)

    • ; (semicolon)

    • : (colon)

    • " (quotation mark)

    • ' (apostrophe)

    • / (forward slash)

    • \ (backslash)

  • Maximum number of consecutive character pairs: 4

    Note

    Two characters are consecutive if |hex(char_1)-hex(char_2)|=1.

    Examples of passwords with 5 consecutive character pairs (invalid): DcBa123456AbCd!; ab1XbcYcdZdeGef!; Testing_123abcgh!.

The following is a valid example password:

  • HelloNvidia3D!
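A candidate password can be checked against this policy before it is sent to the BMC, which avoids failed attempts. The following is a minimal Python sketch under two assumptions: any ASCII punctuation character counts as special (the guide's own valid example uses "!", which is not in the list above), and "consecutive" is evaluated on adjacent characters by code point. The function names are hypothetical, and the BMC remains the authority on what it accepts.

```python
import string

MIN_LEN, MAX_LEN = 13, 20
MAX_CONSECUTIVE_PAIRS = 4


def count_consecutive_pairs(password):
    """Count adjacent character pairs whose code points differ by exactly 1."""
    return sum(1 for a, b in zip(password, password[1:])
               if abs(ord(a) - ord(b)) == 1)


def policy_violations(password):
    """Return a list of human-readable policy violations (empty if none)."""
    problems = []
    if not MIN_LEN <= len(password) <= MAX_LEN:
        problems.append("length must be between 13 and 20 characters")
    if not any(c.isupper() for c in password):
        problems.append("needs at least one upper-case character")
    if not any(c.islower() for c in password):
        problems.append("needs at least one lower-case character")
    if not any(c.isdigit() for c in password):
        problems.append("needs at least one digit")
    # Assumption: any ASCII punctuation is treated as a special character.
    if not any(c in string.punctuation for c in password):
        problems.append("needs at least one special character")
    if count_consecutive_pairs(password) > MAX_CONSECUTIVE_PAIRS:
        problems.append("more than 4 consecutive character pairs")
    return problems
```

With this sketch, policy_violations("HelloNvidia3D!") returns an empty list, while the default password 0penBmc is rejected for its length and missing special character.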

Warning

The root account locks after four consecutive failed attempts and automatically unlocks after 10 minutes.

For example:

[redfish_scripts] $ curl -k -u root:0penBmc -H "Content-Type: application/json" -X PATCH https://<DPU-BMC-IP>/redfish/v1/AccountService/Accounts/root -d '{"Password" : "HelloNvidia3D!"}'

Response:
{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageArgs": [],
      "MessageId": "Base.1.15.0.Success",
      "MessageSeverity": "OK",
      "Resolution": "None"
    }
  ]
}

Upgrade DPU BMC firmware via the Redfish "update service schema" through the 1GbE OOB.

  • If you have a BlueField-2 DPU, follow the instructions in Appendix A

  • If you have a BlueField-3 DPU, follow the steps in the following subsections

Note

Make sure to download the latest DPU BMC image available from the BlueField Runtime and Driver Downloader.

Update BMC Firmware

  1. Run the following Redfish command over the 1GbE out-of-band interface on the DPU BMC to trigger a secure DPU BMC firmware update:

    curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<DPU-BMC-IP>/redfish/v1/UpdateService/update

    Where:

    • <password> – DPU BMC password

    • <package_path> – BMC firmware update package path pointing to BMC *.fwpkg binary (e.g., bf3-bmc-23.09-6_opn.fwpkg)

    • <DPU-BMC-IP> – BMC IP address

      After pushing the image to the DPU BMC, a new task is created. Example:

      { "@odata.id": "/redfish/v1/TaskService/Tasks/0", "@odata.type": "#Task.v1_4_3.Task", "Id": "0", "TaskState": "Running" }

      Note

      BMC firmware update takes ~12 minutes.

  2. To track the progress of the update, use the task Id received in the response above (i.e., 0) in your query and monitor the value of the task’s PercentComplete field:

    curl -k -u root:'<password>' -X GET https://<DPU-BMC-IP>/redfish/v1/TaskService/Tasks/<task_id> | jq -r ' .PercentComplete'

    Where:

    • <password> – DPU BMC password

    • <DPU-BMC-IP> – BMC IP address

    • <task_id> – task ID of the update process as received in the response under the Id value

      Example output:

        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  2123  100  2123    0     0  38600      0 --:--:-- --:--:-- --:--:-- 37910
      20

      In this example, PercentComplete is at 20 percent.

  3. Proceed to the next step when the process reaches 100%.
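Rerunning the query by hand can be replaced with a small polling loop. The following is a minimal Python sketch, assuming the task resource carries the PercentComplete and TaskState fields used in this guide. The helper names redfish_get and wait_for_task, the polling interval, and the timeout are hypothetical choices. Certificate verification is disabled to mirror curl -k.

```python
import base64
import json
import ssl
import time
import urllib.request


def redfish_get(bmc_ip, path, user, password, timeout=10.0):
    """GET a Redfish resource from the BMC and return the decoded JSON body."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # equivalent of curl -k
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        f"https://{bmc_ip}{path}",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(req, context=ctx, timeout=timeout) as resp:
        return json.load(resp)


def wait_for_task(fetch_task, interval_s=10.0, timeout_s=1800.0, sleep=time.sleep):
    """Call fetch_task() until PercentComplete reaches 100; return the last task."""
    waited = 0.0
    while True:
        task = fetch_task()
        if task.get("PercentComplete", 0) >= 100:
            return task
        if task.get("TaskState") in ("Exception", "Canceled"):
            raise RuntimeError(f"task failed: {task.get('TaskState')}")
        if waited >= timeout_s:
            raise TimeoutError("task did not reach 100% in time")
        sleep(interval_s)
        waited += interval_s
```

A caller would pass something like lambda: redfish_get("<DPU-BMC-IP>", "/redfish/v1/TaskService/Tasks/0", "root", "<password>") as fetch_task. Keeping the fetch function injectable lets the loop be exercised without a BMC.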

Update eROT Firmware

  1. Trigger a secure firmware update:

    curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<DPU-BMC-IP>/redfish/v1/UpdateService/update

    Where:

    • <password> – DPU BMC password

    • <package_path> – eROT firmware update package path pointing to eROT *.fwpkg binary (e.g., cec1736-ecfw-00.02.0127.0000-n02-rel-prod.fwpkg)

    • <DPU-BMC-IP> – BMC IP address

      After initiating the eROT secure update, a new task is created. Example:

      { "@odata.id": "/redfish/v1/TaskService/Tasks/0", "@odata.type": "#Task.v1_4_3.Task", "Id": "0", "TaskState": "Running" }

      Note

      eROT firmware update takes ~20 seconds.

  2. To track the progress of the update, use the task Id received in the response above (i.e., 0) in your query and monitor the value of the task’s PercentComplete field:

    curl -k -u root:'<password>' -X GET https://<DPU-BMC-IP>/redfish/v1/TaskService/Tasks/<task_id> | jq -r ' .PercentComplete'

    Warning

    Run this command several times until PercentComplete shows 100 before proceeding to other operations.

    Where:

    • <password> – DPU BMC password

    • <DPU-BMC-IP> – BMC IP address

    • <task_id> – task ID of the update process as received in the response under the Id value

Warning

A power cycle of the DPU is required for the BMC and CEC firmware to take effect and to enable the new Redfish APIs required by the following steps. Because the BlueField-3 DPU is installed in the host's PCIe slot, power cycling the DPU requires power cycling the entire server in which it is installed.

Possible Error Codes During BMC/eROT Upgrade

Fault

Diagnosis and Possible Solution

Connection to BMC breaks during firmware package transfer

  • Redfish task URI is not returned by the Redfish server

  • The Redfish server (if operational) is in idle state

  • After a reboot of BMC, or restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Connection to BMC breaks during firmware update

  • Redfish task URI previously returned by the Redfish server is no longer accessible

  • The Redfish server (if operational) is in one of the following states:

    • In idle state, if the firmware update has completed

    • In update state, if the firmware update is still ongoing

  • After a BMC reboot, or the restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Two firmware update requests are initiated

The Redfish server blocks the second firmware update request and returns the following:

  • HTTP code 400 "Bad Request"

  • Redfish message based on standard registry entry UpdateInProgress

Check the status of the ongoing firmware update by looking at the TaskCollection resource.

Redfish task hangs

  • Redfish task URI previously returned by the Redfish server is no longer accessible

  • PLDM-based firmware update progresses

  • After a reboot of BMC, or restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

BMC-EROT communication failure during image transfer

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.0.TransferFailed indicating the components that failed during image transfer

The Redfish client may retry the firmware update.

Firmware update fails

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry describing the error

The Redfish client may retry the firmware update.

ERoT failure (not responding)

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Canceled

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry describing the error

  • The Redfish client reports the error

The Redfish client may retry the firmware update.

Firmware image validation failure

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.0.VerificationFailed to indicate the component for which verification failed

  • The Redfish client reports the error

The Redfish client may retry the firmware update.

Power loss before activation command is sent

  • The Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Firmware activation failure

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.ActivateFailed

The Redfish client may retry the firmware update.

Push to BMC firmware package greater than 200 MB

  • No Redfish task is created

  • Messages array in the task includes an entry based on the standard registry Base.1.8.1.ResourceExhaustion and a request to retry the operation is given
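The faults above that surface through a Redfish task can be reduced to a simple decision on the task's TaskState. The following is a minimal Python sketch of that decision, assuming the task has already been fetched and decoded into a dictionary. The function names and the returned action labels are hypothetical, and any state not mentioned in this guide is flagged for manual inspection rather than guessed at.

```python
def next_action(task):
    """Map a Redfish task dictionary to a suggested client action."""
    state = task.get("TaskState")
    if state == "Completed":
        return "done"
    if state == "Running":
        return "wait"
    if state in ("Exception", "Canceled"):
        # Per the fault descriptions above, the client may retry the update.
        return "retry"
    return "inspect"


def failure_message_ids(task):
    """Collect the MessageId of every entry in the task's Messages array."""
    return [m.get("MessageId", "") for m in task.get("Messages", [])]
```

Logging failure_message_ids(task) alongside the chosen action keeps the registry entry (for example, a transfer or verification failure) visible when a retry is triggered.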

Upgrade the DPU firmware components (i.e., ATF, UEFI, NIC-firmware) and the BSP using the BFB image.

Note

Make sure to download the latest DOCA image (BFB file) available from the BlueField Runtime and Driver Downloader.

The BFB installation procedure consists of the following main stages:

  1. Enabling RShim on the BMC. See section "Enable RShim on DPU BMC" for instructions.

  2. Initiating the BFB update procedure by transferring the BFB image using one of the following options:

    • Direct SCP

      1. Running an SCP command.

    • Redfish interface

      1. Confirming the identity of the host and BMC—required only during first-time setup or after BMC factory reset.

      2. Sending a Simple-Update request.

Transferring BFB Image

Since the BFB is too large to store on the BMC flash or tmpfs, the image must be written to the RShim device. This can be done by either running SCP directly or using the Redfish interface.

Redfish Interface

The following is a simple sequence diagram illustrating the flow of the BFB installation process.

[Figure: sequence diagram of BFB image transfer over Redfish]

The following are detailed instructions outlining each step in the diagram:

  1. Confirm the identity of the remote server (i.e., host holding the BFB image) and BMC.

    Note

    Required only during first-time setup or after BMC factory reset.

    1. Run the following on the remote server:

      ssh-keyscan -t <key_type> <remote_server_ip>

      Where:

      • key_type – the type of key associated with the server storing the BFB file (e.g., ed25519)

      • remote_server_ip – the IP address of the server hosting the BFB file

    2. Retrieve the public key of the host holding the BFB image from the response and provide the remote server's credentials to the DPU using the following command:

      curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>", "RemoteServerKeyString":"<remote_server_public_key>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.PublicKeyExchange

      Where:

      • remote_server_ip – the IP address of the server hosting the BFB file

      • remote_server_public_key – remote server's public key from the ssh-keyscan response, which contains both the type and the public key with a space between the two fields (i.e., "<type> <public_key>").

      • bmc_ip – BMC IP address

    3. Extract the BMC public key information (i.e., "<type> <bmc_public_key> <username>@<hostname>") from the PublicKeyExchange response and append it to the authorized_keys file on the host holding the BFB image. This enables passwordless key-based authentication for users.

      { "@Message.ExtendedInfo": [ { "@odata.type": "#Message.v1_1_1.Message", "Message": "Please add the following public key info to ~/.ssh/authorized_keys on the remote server", "MessageArgs": [ "<type> <bmc_public_key> root@dpu-bmc" ] }, { "@odata.type": "#Message.v1_1_1.Message", "Message": "The request completed successfully.", "MessageArgs": [], "MessageId": "Base.1.15.0.Success", "MessageSeverity": "OK", "Resolution": "None" } ] }

    4. If the remote server public key must be revoked, use the following command before repeating the previous step:

      curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.RevokeAllRemoteServerPublicKeys

      Where:

      • remote_server_ip – remote server's IP address

      • bmc_ip – BMC IP address

  2. Start BFB image transfer using the following command on the remote server:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"TransferProtocol":"SCP", "ImageURI":"<image_uri>","Targets":["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"<username>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate

    Note

    After the BMC boots, it may take a few seconds (6-8 in NVIDIA® BlueField®-2, and 2 in BlueField-3) until the DPU BSP (DPU_OS) is up.

    Warning

    This command uses SCP for the image transfer, initiates a soft reset on the BlueField and then pushes the boot stream. For Ubuntu BFBs, the eMMC is flashed automatically once the bootstream is pushed. On success, a "running" message is received with the current task ID.

    Where:

    • image_uri – the image URI format should be <remote_server_ip>/<path_to_bfb>

    • username – username on the remote server

    • bmc_ip – BMC IP address

      Examples:

      • If RShim is disabled:

        {
          "error": {
            "@Message.ExtendedInfo": [
              {
                "@odata.type": "#Message.v1_1_1.Message",
                "Message": "The requested resource of type Target named '/dev/rshim0/boot' was not found.",
                "MessageArgs": [ "Target", "/dev/rshim0/boot" ],
                "MessageId": "Base.1.15.0.ResourceNotFound",
                "MessageSeverity": "Critical",
                "Resolution": "Provide a valid resource identifier and resubmit the request."
              }
            ],
            "code": "Base.1.15.0.ResourceNotFound",
            "message": "The requested resource of type Target named '/dev/rshim0/boot' was not found."
          }
        }

      • If a username or any other required field is missing:

        { "Username@Message.ExtendedInfo": [ { "@odata.type": "#Message.v1_1_1.Message", "Message": "The create operation failed because the required property Username was missing from the request.", "MessageArgs": [ "Username" ], "MessageId": "Base.1.15.0.CreateFailedMissingReqProperties", "MessageSeverity": "Critical", "Resolution": "Correct the body to include the required property with a valid value and resubmit the request if the operation failed." } ] }

      • If the request is valid and a task is created:

        { "@odata.id": "/redfish/v1/TaskService/Tasks/<task_id>", "@odata.type": "#Task.v1_4_3.Task", "Id": "<task_id>", "TaskState": "Running", "TaskStatus": "OK" }

  3. Wait 2 seconds and run the following on the host to track image transfer progress:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<task_id>

    Warning

    The transfer takes ~8 minutes for BlueField-3, and ~40 minutes for BlueField-2. During the transfer, the PercentComplete value remains at 0. If no errors occur, the TaskState is set to Running, and a keep-alive message is generated every 5 minutes with the content "Transfer is still in progress (X minutes elapsed). Please wait". Once the transfer is completed, the PercentComplete is set to 100, and the TaskState is updated to Completed.

    Upon failure, a message is generated with the relevant resolution.

    Where:

    • bmc_ip – BMC IP address

    • task_id – task ID

      Troubleshooting:

      • If host identity is not confirmed or the provided host key is wrong:

        { "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.", "MessageArgs": [ "<file_name>", "/dev/rshim0/boot" ], "MessageId": "Update.1.0.TransferFailed", "Resolution": " Unknown Host: Please provide server's public key using PublicKeyExchange ", "Severity": "Critical" } … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical"

        Note

        In this case, revoke the remote server key (step 1.4), and repeat steps 1.1 to 1.3.

      • If the BMC identity is not confirmed:

        { "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.", "MessageArgs": [ "<file_name>", "/dev/rshim0/boot" ], "MessageId": "Update.1.0.TransferFailed", "Resolution": "Unauthorized Client: Please use the PublicKeyExchange action to receive the system's public key and add it as an authorized key on the remote server", "Severity": "Critical" } … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical"

        Note

        In this case, verify that the BMC key has been added correctly to the authorized_keys file on the remote server.

      • If SCP fails:

        { "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.", "MessageArgs": [ "<file_name>", "/dev/rshim0/boot" ], "MessageId": "Update.1.0.TransferFailed", "Resolution": "Failed to launch SCP", "Severity": "Critical" } … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical"

      • The keep-alive message:

        { "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "'<file_name>' is being transferred to '/dev/rshim0/boot'.", "MessageArgs": [ "<file_name>", "/dev/rshim0/boot" ], "MessageId": "Update.1.0.TransferringToComponent", "Resolution": "Transfer is still in progress (5 minutes elapsed): Please wait", "Severity": "OK" } … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Running", "TaskStatus": "OK"

      • Upon completion of transfer of the BFB image to the DPU, the following is received:

        { "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "Device 'DPU' successfully updated with image '<file_name>'.", "MessageArgs": [ "DPU", "<file_name>" ], "MessageId": "Update.1.0.UpdateSuccessful", "Resolution": "None", "Severity": "OK" }, … "PercentComplete": 100, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Completed", "TaskStatus": "OK"

  4. When the BFB transfer is complete, dump the current RShim miscellaneous messages to check the update status.

    Note

    Refer to section "BMC Dump Operations" under "BMC and BlueField Logs" for information on dumping the rshim.log which contains the current RShim miscellaneous messages.

  5. Verify that the new BFB is running by checking its version:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/DPU_OS
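The request bodies used in steps 1 and 2 are easy to get wrong when typed by hand, so it can help to generate them. The following is a minimal Python sketch that builds the PublicKeyExchange and SimpleUpdate bodies shown above and pulls the BMC public key out of the PublicKeyExchange response. All function names are hypothetical, and the sketch assumes ssh-keyscan prints lines of the form "<host> <type> <public_key>" with comment lines starting with "#".

```python
import json


def key_from_keyscan(keyscan_output):
    """Return '<type> <public_key>' from the first key line of ssh-keyscan output."""
    for line in keyscan_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and banner comments
        _host, key_type, key = line.split()[:3]
        return f"{key_type} {key}"
    raise ValueError("no host key found in ssh-keyscan output")


def public_key_exchange_body(remote_server_ip, keyscan_output):
    """JSON body for the NvidiaUpdateService.PublicKeyExchange action."""
    return json.dumps({
        "RemoteServerIP": remote_server_ip,
        "RemoteServerKeyString": key_from_keyscan(keyscan_output),
    })


def simple_update_body(remote_server_ip, path_to_bfb, username):
    """JSON body for the UpdateService.SimpleUpdate action over SCP."""
    return json.dumps({
        "TransferProtocol": "SCP",
        "ImageURI": f"{remote_server_ip}/{path_to_bfb}",
        "Targets": ["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"],
        "Username": username,
    })


def bmc_authorized_key(exchange_response):
    """Return the '<type> <bmc_public_key> <user>@<host>' string from the response."""
    for msg in exchange_response.get("@Message.ExtendedInfo", []):
        args = msg.get("MessageArgs", [])
        if args:
            return args[0]
    raise ValueError("no public key found in PublicKeyExchange response")
```

The returned strings can be passed to curl with -d, and the value from bmc_authorized_key() is the line to append to authorized_keys on the server that holds the BFB image.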

Direct SCP

scp <path_to_bfb> root@<bmc_ip>:/dev/rshim0/boot

Verify BlueField BSP, BlueField BMC and BlueField NIC firmware versions are up to date according to the NVIDIA BlueField BMC Software User Manual and NVIDIA BlueField DPU BSP Release Notes.

  1. Use the Redfish FirmwareInventory schema over the 1GbE OOB interface to the DPU's BMC:

    [redfish_scripts] $ curl -k -u root:<password> -H "Content-Type: application/octet-stream" -X GET https://<DPU-BMC-IP>/redfish/v1/UpdateService/FirmwareInventory

    {
      "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory",
      "@odata.type": "#SoftwareInventoryCollection.SoftwareInventoryCollection",
      "Members": [
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/9f7ec75a_BMC_Firmware" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/Bluefield_FW_ERoT" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_ATF" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_BOARD" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_BSP" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_NIC" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_NODE" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_OFED" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_OS" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_SYS_IMAGE" },
        { "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_UEFI" }
      ],
      "Members@odata.count": 11,
      "Name": "Software Inventory Collection"
    }

    Response example for DPU_ATF:

    > curl -k -u root:<password> -H "Content-Type: application/octet-stream" -X GET https://<DPU-BMC-IP>/redfish/v1/UpdateService/FirmwareInventory/DPU_ATF

    {
      "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/DPU_ATF",
      "@odata.type": "#SoftwareInventory.v1_4_0.SoftwareInventory",
      "Description": "Host image",
      "Id": "DPU_ATF",
      "Members@odata.count": 1,
      "Name": "Software Inventory",
      "RelatedItem": [
        { "@odata.id": "/redfish/v1/Systems/Bluefield/Bios" }
      ],
      "SoftwareId": "",
      "Status": { "Health": "OK", "HealthRollup": "OK", "State": "OK" },
      "Updateable": true,
      "Version": "v2.2(release):4.0.2-33-gd9f4ad5"
    }

    Note

    This request may also be used to query some of the other previously mentioned components (e.g., 9f7ec75a_BMC_Firmware, Bluefield_FW_ERoT).

  2. If the versions are not as expected, upgrade as needed:

    1. Download the latest DOCA (BFB file) versions from the downloader at the bottom of the DOCA product page.

    2. DOCA (BFB) upgrade options (upgrading UEFI, ATF, Arm OS, NIC firmware components):

      • Recommended—BFB upgrade from remote management controller using Redfish UpdateService schema over 1GbE to DPU BMC:

        export token=`curl -k -H "Content-Type: application/json" -X POST https://<bmc_ip>/login -d '{"username":"root", "password":"<password>"}' | grep token | awk '{print $2;}' | tr -d '"'`

        For more information on deploying BlueField software from the BMC, refer to the "Deploying BlueField Software Using BFB from BMC" page of the NVIDIA BlueField DPU BSP document.
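Checking each component by hand is error-prone, so the inventory responses can be compared against a list of expected versions. The following is a minimal Python sketch, assuming the collection and per-component responses have already been fetched and decoded. The function names are hypothetical, and the expected versions must come from the release notes referenced above since none are assumed here.

```python
def inventory_ids(collection):
    """Return the component IDs listed in a FirmwareInventory collection."""
    return [
        member["@odata.id"].rstrip("/").rsplit("/", 1)[-1]
        for member in collection.get("Members", [])
    ]


def outdated_components(installed, expected):
    """Return the sorted IDs whose installed version differs from the expected one.

    Both arguments map a component ID (e.g., "DPU_ATF") to a version string;
    a component missing from `installed` is reported as outdated.
    """
    return sorted(
        component for component, version in expected.items()
        if installed.get(component) != version
    )
```

The installed mapping can be filled by requesting each ID returned by inventory_ids() and reading its "Version" field, as in the DPU_ATF response above.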

  1. Get the DPU's BMC MAC address using the following Redfish command over the 1GbE OOB port to the DPU BMC:

    curl -k -u root:<password> -H 'Content-Type: application/json' -X GET https://<DPU-BMC-IP>/redfish/v1/Managers/Bluefield_BMC/EthernetInterfaces/eth0

    {
      "@odata.id": "/redfish/v1/Managers/Bluefield_BMC/EthernetInterfaces/eth0",
      "@odata.type": "#EthernetInterface.v1_6_0.EthernetInterface",
      "DHCPv4": { "DHCPEnabled": true, "UseDNSServers": true, "UseDomainName": true, "UseNTPServers": true },
      "DHCPv6": { "OperatingMode": "Stateful", "UseDNSServers": true, "UseDomainName": true, "UseNTPServers": true },
      "Description": "Management Network Interface",
      "FQDN": "dpu-bmc",
      "HostName": "dpu-bmc",
      "IPv4Addresses": [
        { "Address": "10.237.40.179", "AddressOrigin": "DHCP", "Gateway": "0.0.0.0", "SubnetMask": "255.255.0.0" }
      ],
      "IPv4StaticAddresses": [],
      "IPv6AddressPolicyTable": [],
      "IPv6Addresses": [
        { "Address": "fdfd:fdfd:10:237:966d:aeff:fe17:9f5f", "AddressOrigin": "DHCPv6", "AddressState": null, "PrefixLength": 64 },
        { "Address": "fe80::966d:aeff:fe17:9f5f", "AddressOrigin": "LinkLocal", "AddressState": null, "PrefixLength": 64 }
      ],
      "IPv6DefaultGateway": "fe80::445b:ed80:5f97:8900",
      "IPv6StaticAddresses": [],
      "Id": "eth0",
      "InterfaceEnabled": true,
      "LinkStatus": "LinkUp",
      "MACAddress": "94:6d:ae:17:9f:5f",
      "MTUSize": 1500,
      "Name": "Manager Ethernet Interface",
      "NameServers": [ "fdfd:fdfd:7:77:250:56ff:fe8b:e4f9" ],
      "SpeedMbps": 0,
      "StaticNameServers": [],
      "Status": { "Health": "OK", "HealthRollup": "OK", "State": "Enabled" },
      "VLANs": { "@odata.id": "/redfish/v1/Managers/Bluefield_BMC/EthernetInterfaces/eth0/VLANs" }
    }

  2. Get the DPU's high-speed port's MAC addresses using the following Redfish command over the 1GbE OOB port to the DPU BMC:

    curl -k -u root:<password> -H "Content-Type: application/octet-stream" -X GET https://<bmc_ip>/redfish/v1/Chassis/Card1/NetworkAdapters/NvidiaNetworkAdapter/NetworkDeviceFunctions/eth0f0

    {
      "@odata.id": "/redfish/v1/Chassis/Card1/NetworkAdapters/NvidiaNetworkAdapter/NetworkDeviceFunctions/eth0f0",
      "@odata.type": "#NetworkDeviceFunction.v1_9_0.NetworkDeviceFunction",
      "Ethernet": { "MACAddress": "02:b1:b6:12:39:05", "MTUSize": 1500 },
      "Id": "eth0f0",
      "Links": {
        "OffloadSystem": { "@odata.id": "/redfish/v1/Systems/Bluefield" },
        "PhysicalPortAssignment": { "@odata.id": "/redfish/v1/Chassis/Card1/NetworkAdapters/NvidiaNetworkAdapter/Ports/eth0" }
      },
      "Name": "NetworkDeviceFunction",
      "NetDevFuncCapabilities": [ "Ethernet" ],
      "NetDevFuncType": "Ethernet"
    }
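The two responses keep the MAC address in different places: at the top level for the BMC's EthernetInterface, and under "Ethernet" for a NetworkDeviceFunction. The following is a minimal Python sketch that handles both layouts, assuming the response has already been decoded from JSON. The function name extract_mac is hypothetical.

```python
def extract_mac(resource):
    """Return the lower-cased MAC address from either response layout."""
    mac = resource.get("MACAddress")
    if mac is None:
        # NetworkDeviceFunction responses nest the address under "Ethernet".
        mac = resource.get("Ethernet", {}).get("MACAddress")
    if mac is None:
        raise KeyError("no MACAddress found in resource")
    return mac.lower()
```

Lower-casing the result makes the value easier to match against DHCP server records, which may use a different letter case.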

To change from DPU mode to NIC mode (or vice versa):

  1. To enable NIC mode:

    curl -k -u root:<password> -H 'content-type: application/json' -d '{ "Attributes": { "NicMode": "NicMode" } }' -X PATCH https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/Bios/Settings

  2. To disable NIC mode:

    curl -k -u root:<password> -H 'content-type: application/json' -d '{ "Attributes": { "NicMode": "DpuMode" } }' -X PATCH https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/Bios/Settings

  3. To check that the BMC has recorded the change, which is applied on the next UEFI reboot:

    curl -k -u root:<password> -H 'content-type: application/json' -X GET https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/Bios/Settings

    Warning

    Reset the DPU (Arm and NIC) for the mode change to take effect.

  4. To verify that the NIC mode has updated accordingly:

    curl -k -u root:<password> -H 'content-type: application/json' -X GET https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/Bios/
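The mode change is a two-stage operation: the requested value is recorded under Bios/Settings and becomes the active Bios value only after the reset. The following is a minimal Python sketch of the request body and of a check for a change that is still pending. It assumes both GET responses expose the value as Attributes.NicMode; the guide does not show those responses, so that layout and the function names are assumptions.

```python
import json


def nic_mode_patch_body(enable_nic_mode):
    """JSON body for the Bios/Settings PATCH request shown above."""
    mode = "NicMode" if enable_nic_mode else "DpuMode"
    return json.dumps({"Attributes": {"NicMode": mode}})


def mode_change_pending(bios, settings):
    """Return True if Bios/Settings requests a mode that Bios does not yet report.

    Assumption: both decoded responses carry the value at Attributes.NicMode.
    """
    current = bios.get("Attributes", {}).get("NicMode")
    requested = settings.get("Attributes", {}).get("NicMode")
    return requested is not None and requested != current
```

If mode_change_pending() still returns True after the DPU reset, the reset likely did not cover both the Arm cores and the NIC.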

  1. Use Redfish BIOS settings schema over the 1GbE OOB to the DPU BMC:

    curl -k -X PATCH -d '{"Attributes":{"Internal CPU Model": "Restricted"}}' -u root:<password> https://<DPU-BMC-IP>/redfish/v1/Systems/<SystemID>/Bios/Settings | python3 -m json.tool

    The available BlueField host privilege levels are Restricted and Privileged. The default is Privileged, in which the host has access to the DPU.

  2. Change the privilege level to Restricted.

Warning

Changing host privilege level requires DPU reset for the change to take effect.

Note

For more information on BlueField Operational modes, refer to this page.

As part of the default settings of the DPU, UEFI Secure Boot is enabled and requires no special configuration to use it with the bundled Ubuntu OS shipped with the BlueField DPU. Disabling UEFI Secure Boot may be necessary when running an unsigned Arm OS image, such as a customer OS. Using Redfish Secure Boot schema over 1GbE to DPU BMC, run:

curl -k -u root:<password> -H "Content-Type: application/octet-stream" -X GET https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/SecureBoot

{
  "@odata.id": "/redfish/v1/Systems/Bluefield/SecureBoot",
  "@odata.type": "#SecureBoot.v1_1_0.SecureBoot",
  "Description": "The UEFI Secure Boot associated with this system.",
  "Id": "SecureBoot",
  "Name": "UEFI Secure Boot",
  "SecureBootCurrentBoot": "Enabled",
  "SecureBootEnable": true,
  "SecureBootMode": "SetupMode"
}

curl -k -u root:<password> -X PATCH https://<DPU-BMC-IP>/redfish/v1/Systems/Bluefield/SecureBoot -H 'Content-Type: application/json' -d '{"SecureBootCurrentBoot": "Enabled", "SecureBootEnable": true, "SecureBootMode": "SetupMode"}'
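The GET response above can be summarized to show whether a Secure Boot change is still waiting for a reboot. The following is a minimal Python sketch, assuming the usual Redfish reading of the two fields: SecureBootCurrentBoot describes the current boot and SecureBootEnable describes the setting for the next boot. The function name and the keys of the returned dictionary are hypothetical.

```python
def secure_boot_status(resource):
    """Summarize a decoded SecureBoot resource."""
    active_now = resource.get("SecureBootCurrentBoot") == "Enabled"
    enabled_next_boot = bool(resource.get("SecureBootEnable"))
    return {
        "active_now": active_now,
        "enabled_next_boot": enabled_next_boot,
        # A mismatch means the requested setting has not been applied yet.
        "change_pending": active_now != enabled_next_boot,
    }
```

For the example response above, the summary reports Secure Boot as active, enabled for the next boot, and with no change pending.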

For more information on user management, review this page.

© Copyright 2024, NVIDIA. Last updated on Mar 27, 2024.