NVIDIA BlueField BMC Software v24.04

Deploying BlueField Software Using BFB from BMC

To update the software on the BlueField DPU, the DPU must be booted up without mounting the eMMC flash device. This requires an external boot flow where a BFB (which includes ATF, UEFI, Arm OS, NIC firmware, and initramfs) is pushed from an external host via USB or PCIe. On BlueField DPUs with an integrated BMC, the USB interface is internally connected to the BMC and is enabled by default. Therefore, you must verify that the RShim driver is running on the BMC. This provides the ability to push a bootstream over the USB interface to perform an external boot.

Ubuntu users are prompted to change the default password (ubuntu) for the default user (ubuntu) upon first login. Login is not possible until all services are up, even if the login prompt already appears. The services are up once the "DPU is ready" message appears in /dev/rshim0/misc.

Note

Attempting to log in before all services are up prints the following message: Permission denied, please try again.
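
The readiness check can be scripted instead of retrying the login. The following is a minimal sketch, assuming the RShim misc output is readable as a file on the machine that runs the RShim driver. The function names dpu_is_ready and wait_for_dpu, the retry count, and the polling interval are illustrative and not part of the BlueField software.

```shell
# Hypothetical helper: succeeds once the "DPU is ready" message shows up in
# the RShim misc output. The path argument defaults to /dev/rshim0/misc.
dpu_is_ready() {
    misc_file="${1:-/dev/rshim0/misc}"
    grep -q "DPU is ready" "$misc_file" 2>/dev/null
}

# Hypothetical polling loop: check up to $1 times, sleeping $2 seconds
# between checks. Returns 0 as soon as the DPU is ready, 1 otherwise.
wait_for_dpu() {
    tries="${1:-120}"
    interval="${2:-5}"
    misc_file="${3:-/dev/rshim0/misc}"
    while [ "$tries" -gt 0 ]; do
        dpu_is_ready "$misc_file" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$interval"
    done
    return 1
}
```

Calling wait_for_dpu with no arguments checks /dev/rshim0/misc every 5 seconds for up to 120 attempts; both defaults are arbitrary and can be adjusted.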

Alternatively, Ubuntu users can provide a unique password that will be applied at the end of the BFB installation. This password must be defined in a bf.cfg configuration file. To set the password for the ubuntu user:

  1. Create a password hash. Run:

    # openssl passwd -1
    Password:
    Verifying - Password:
    $1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1

  2. Add the password hash in quotes to the bf.cfg file:

    # vim bf.cfg
    ubuntu_PASSWORD='$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1'

    The bf.cfg file is used with the bfb-install script in the steps that follow.
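
The two steps above can also be combined without opening an editor. The following is a minimal sketch; the function name write_bf_cfg is hypothetical, and the helper overwrites the target file, so it only suits a bf.cfg that holds no other settings.

```shell
# Hypothetical helper: write the password hash to a bf.cfg file.
# $1 is the hash produced by "openssl passwd -1", $2 is the output file
# (defaults to bf.cfg in the current directory). The hash is wrapped in
# single quotes, matching the quoting shown in step 2.
# Note: the output file is overwritten, not appended to.
write_bf_cfg() {
    hash="$1"
    cfg_file="${2:-bf.cfg}"
    printf "ubuntu_PASSWORD='%s'\n" "$hash" > "$cfg_file"
}
```

For example, write_bf_cfg '$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1' produces a bf.cfg containing the same line as in step 2. Pass the hash in single quotes so that the shell does not expand the $ characters.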

The BFB installation procedure consists of the following main stages:

  1. Disabling RShim on the server.
  2. Initiating the BFB update procedure by transferring the BFB image using one of the following options:

    • Redfish interface

      1. Confirming the identity of the host and BMC—required only during first-time setup or after BMC factory reset.
      2. Sending a Simple-Update request.
    • Direct SCP

      1. Running an SCP command.
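
Stage 1 is not spelled out on this page. The following is a minimal sketch of it, assuming the server runs the RShim driver as a systemd unit named rshim; the unit name, the function name disable_host_rshim, and the SYSTEMCTL override are assumptions and should be checked against the server's actual setup.

```shell
# Hypothetical helper: stop and disable the RShim service on the server.
# Assumes a systemd unit named "rshim" (an assumption, not stated above).
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) to preview the commands
# without running them.
disable_host_rshim() {
    ${SYSTEMCTL:-systemctl} stop rshim &&
    ${SYSTEMCTL:-systemctl} disable rshim
}
```

Running SYSTEMCTL=echo disable_host_rshim prints the two systemctl argument lists instead of executing them, which is a cheap way to review the commands first.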

Transferring BFB Image

Since the BFB is too large to store on the BMC flash or tmpfs, the image must be written to the RShim device. This can be done by either running SCP directly or using the Redfish interface.

Redfish Interface

The following is a simple sequence diagram illustrating the flow of the BFB installation process.

[Figure: sequence diagram of the BFB image transfer over the Redfish interface]

The following are detailed instructions outlining each step in the diagram:

  1. Confirm the identity of the remote server (i.e., host holding the BFB image) and BMC.

    Info

    Required only during first-time setup or after BMC factory reset.

    1. Run the following on the remote server:

      ssh-keyscan -t <key_type> <remote_server_ip>

      Where:

      • key_type – the type of key associated with the server storing the BFB file (e.g., ed25519)
      • remote_server_ip – the IP address of the server hosting the BFB file
    2. Retrieve the public key of the host holding the BFB image from the response and provide the remote server's credentials to the DPU using the following command:

      curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>", "RemoteServerKeyString":"<remote_server_public_key>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.PublicKeyExchange

      Where:

      • remote_server_ip – the IP address of the server hosting the BFB file
      • remote_server_public_key – remote server's public key from the ssh-keyscan response, which contains both the type and the public key with a space between the two fields (i.e., "<type> <public_key>").
      • bmc_ip – BMC IP address
    3. Extract the BMC public key information (i.e., "<type> <bmc_public_key> <username>@<hostname>") from the PublicKeyExchange response and append it to the authorized_keys file on the host holding the BFB image. This enables passwordless key-based authentication for users.

      {
        "@Message.ExtendedInfo": [
          {
            "@odata.type": "#Message.v1_1_1.Message",
            "Message": "Please add the following public key info to ~/.ssh/authorized_keys on the remote server",
            "MessageArgs": [
              "<type> <bmc_public_key> root@dpu-bmc"
            ]
          },
          {
            "@odata.type": "#Message.v1_1_1.Message",
            "Message": "The request completed successfully.",
            "MessageArgs": [],
            "MessageId": "Base.1.15.0.Success",
            "MessageSeverity": "OK",
            "Resolution": "None"
          }
        ]
      }

    4. If the remote server public key must be revoked, use the following command before repeating the previous step:

      curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.RevokeAllRemoteServerPublicKeys

      Where:

      • remote_server_ip – remote server's IP address
      • bmc_ip – BMC IP address
  2. Start BFB image transfer using the following command on the remote server:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"TransferProtocol":"SCP", "ImageURI":"<image_uri>","Targets":["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"<username>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate

    Info

    After the BlueField Arm boots, it may take a few seconds (6-8 in NVIDIA® BlueField®-2, and 2 in BlueField-3) until the DPU BSP (DPU_OS) is up.

    Note

    This command uses SCP for the image transfer, initiates a soft reset on the BlueField and then pushes the boot stream. For Ubuntu BFBs, the eMMC is flashed automatically once the bootstream is pushed. On success, a "running" message is received with the current task ID.

    Where:

    • image_uri – the image URI format should be <remote_server_ip>/<path_to_bfb>
    • username – username on the remote server
    • bmc_ip – BMC IP address

      Example responses:

      • If RShim is still enabled on the server:

        {
          "error": {
            "@Message.ExtendedInfo": [
              {
                "@odata.type": "#Message.v1_1_1.Message",
                "Message": "The requested resource of type Target named '/dev/rshim0/boot' was not found.",
                "MessageArgs": [
                  "Target",
                  "/dev/rshim0/boot"
                ],
                "MessageId": "Base.1.15.0.ResourceNotFound",
                "MessageSeverity": "Critical",
                "Resolution": "Provide a valid resource identifier and resubmit the request."
              }
            ],
            "code": "Base.1.15.0.ResourceNotFound",
            "message": "The requested resource of type Target named '/dev/rshim0/boot' was not found."
          }
        }

      • If a username or any other required field is missing:

        {
          "Username@Message.ExtendedInfo": [
            {
              "@odata.type": "#Message.v1_1_1.Message",
              "Message": "The create operation failed because the required property Username was missing from the request.",
              "MessageArgs": [
                "Username"
              ],
              "MessageId": "Base.1.15.0.CreateFailedMissingReqProperties",
              "MessageSeverity": "Critical",
              "Resolution": "Correct the body to include the required property with a valid value and resubmit the request if the operation failed."
            }
          ]
        }

      • If the request is valid and a task is created:

        {
          "@odata.id": "/redfish/v1/TaskService/Tasks/<task_id>",
          "@odata.type": "#Task.v1_4_3.Task",
          "Id": "<task_id>",
          "TaskState": "Running",
          "TaskStatus": "OK"
        }

  3. Wait 2 seconds and run the following on the host to track image transfer progress:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<task_id>

    Note

    The transfer takes ~8 minutes for BlueField-3, and ~40 minutes for BlueField-2. During the transfer, the PercentComplete value remains at 0. If no errors occur, the TaskState is set to Running, and a keep-alive message is generated every 5 minutes with the content "Transfer is still in progress (X minutes elapsed). Please wait". Once the transfer is completed, the PercentComplete is set to 100, and the TaskState is updated to Completed.

    Upon failure, a message is generated with the relevant resolution.

    Where:

    • bmc_ip – BMC IP address
    • task_id – task ID

      Troubleshooting:

      • If host identity is not confirmed or the provided host key is wrong:

        {
          "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
          "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
          "MessageArgs": [
            "<file_name>",
            "/dev/rshim0/boot"
          ],
          "MessageId": "Update.1.0.TransferFailed",
          "Resolution": "Unknown Host: Please provide server's public key using PublicKeyExchange",
          "Severity": "Critical"
        }
        …
        "PercentComplete": 0,
        "StartTime": "<start_time>",
        "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
        "TaskState": "Exception",
        "TaskStatus": "Critical"

        Info

        In this case, revoke the remote server key (step 1, substep 4) and repeat substeps 1 to 3 of step 1.

      • If the BMC identity is not confirmed:

        {
          "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
          "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
          "MessageArgs": [
            "<file_name>",
            "/dev/rshim0/boot"
          ],
          "MessageId": "Update.1.0.TransferFailed",
          "Resolution": "Unauthorized Client: Please use the PublicKeyExchange action to receive the system's public key and add it as an authorized key on the remote server",
          "Severity": "Critical"
        }
        …
        "PercentComplete": 0,
        "StartTime": "<start_time>",
        "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
        "TaskState": "Exception",
        "TaskStatus": "Critical"

        Info

        In this case, verify that the BMC key has been added correctly to the authorized_keys file on the remote server.

      • If SCP fails:

        {
          "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
          "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
          "MessageArgs": [
            "<file_name>",
            "/dev/rshim0/boot"
          ],
          "MessageId": "Update.1.0.TransferFailed",
          "Resolution": "Failed to launch SCP",
          "Severity": "Critical"
        }
        …
        "PercentComplete": 0,
        "StartTime": "<start_time>",
        "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
        "TaskState": "Exception",
        "TaskStatus": "Critical"

      • The keep-alive message:

        {
          "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
          "Message": "'<file_name>' is being transferred to '/dev/rshim0/boot'.",
          "MessageArgs": [
            "<file_name>",
            "/dev/rshim0/boot"
          ],
          "MessageId": "Update.1.0.TransferringToComponent",
          "Resolution": "Transfer is still in progress (5 minutes elapsed): Please wait",
          "Severity": "OK"
        }
        …
        "PercentComplete": 0,
        "StartTime": "<start_time>",
        "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
        "TaskState": "Running",
        "TaskStatus": "OK"

      • Upon completion of transfer of the BFB image to the DPU, the following is received:

        {
          "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
          "Message": "Device 'DPU' successfully updated with image '<file_name>'.",
          "MessageArgs": [
            "DPU",
            "<file_name>"
          ],
          "MessageId": "Update.1.0.UpdateSuccessful",
          "Resolution": "None",
          "Severity": "OK"
        },
        …
        "PercentComplete": 100,
        "StartTime": "<start_time>",
        "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
        "TaskState": "Completed",
        "TaskStatus": "OK"

  4. When the BFB transfer is complete, dump the current RShim miscellaneous messages to check the update status.

    Info

    Refer to section "BMC Dump Operations" for information on dumping the rshim.log which contains the current RShim miscellaneous messages.

  5. Verify that the new BFB is running by checking its version:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/DPU_OS
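
When the Redfish procedure is scripted, most of the work is building the request bodies for steps 1 and 2 and reading fields out of the responses in steps 2 and 3. The following is a minimal sketch of such helpers. All function names are hypothetical, the field extraction is a plain-text match rather than a JSON parser, and the mapping of TaskState values covers only the states shown in the examples above.

```shell
# All helper names below are hypothetical; placeholders follow the steps above.

# Reduce ssh-keyscan output ("<host> <type> <key>") read from stdin to the
# "<type> <public_key>" string used in RemoteServerKeyString.
# Comment lines are skipped; only the first key line is printed.
keyscan_to_key_string() {
    awk '!/^#/ && NF >= 3 { print $2 " " $3; exit }'
}

# JSON body for PublicKeyExchange: $1 is <remote_server_ip>, $2 is the key
# string. Assumes the values contain no '"' or '\', which holds for IP
# addresses and base64-encoded keys.
public_key_exchange_body() {
    printf '{"RemoteServerIP":"%s", "RemoteServerKeyString":"%s"}\n' "$1" "$2"
}

# JSON body for SimpleUpdate: $1 is <image_uri>, $2 is <username>.
simple_update_body() {
    printf '{"TransferProtocol":"SCP", "ImageURI":"%s","Targets":["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"%s"}\n' "$1" "$2"
}

# Print the string value of the JSON field named $1 from a response read on
# stdin (first match only). A text match, not a full JSON parser.
json_field() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Map a TaskState to an exit status for a polling loop:
# 0 = Completed, 1 = Exception, 2 = Running, 3 = anything else.
transfer_status() {
    case "$1" in
        Completed) return 0 ;;
        Exception) return 1 ;;
        Running)   return 2 ;;
        *)         return 3 ;;
    esac
}
```

For example, piping the SimpleUpdate response through json_field Id yields the task ID for step 3, and piping the task response through json_field TaskState yields a value that transfer_status turns into a stop or continue decision for a polling loop.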

Direct SCP

scp <path_to_bfb> root@<bmc_ip>:/dev/rshim0/boot
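
A thin wrapper can catch a mistyped BFB path before the copy starts. The following is a minimal sketch; the function name push_bfb and the SCP override variable are hypothetical and only added for illustration.

```shell
# Hypothetical helper: copy a BFB file to the RShim boot device on the BMC.
# $1 is <path_to_bfb>, $2 is <bmc_ip>.
push_bfb() {
    bfb_path="$1"
    bmc_ip="$2"
    # Fail early if the BFB file does not exist locally.
    if [ ! -f "$bfb_path" ]; then
        echo "BFB file not found: $bfb_path" >&2
        return 1
    fi
    # SCP can be overridden (e.g. SCP=echo) to preview the arguments
    # without starting a transfer.
    ${SCP:-scp} "$bfb_path" "root@${bmc_ip}:/dev/rshim0/boot"
}
```

Running SCP=echo push_bfb <path_to_bfb> <bmc_ip> prints the source and destination that would be passed to scp, which helps confirm the target before a long transfer.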

© Copyright 2024, NVIDIA. Last updated on Jul 6, 2024.