Deploying BlueField Software Using BFB from BMC
It is recommended to upgrade your BlueField product to the latest software and firmware versions available to benefit from new features and latest bug fixes.
This section assumes that a BlueField DPU has already been installed in a server according to the instructions detailed in the DPU's hardware user guide.
The following table summarizes the steps required to install the Ubuntu BFB on your DPU:
Step | Procedure
1 | Verify that RShim is already running on BMC
2 | Change the default credentials using bf.cfg file (optional)
3 | Install the Ubuntu BFB image
4 | Verify installation completed successfully
5 | Upgrade the firmware on your DPU
It is important to know your BlueField's device ID to perform some of the software installations or upgrades in this guide.
To determine the device ID of the BlueField Platform on your setup, run:
host# mst start
host# mst status -v
Example output:
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE         MST                            PCI       RDMA     NET             NUMA
BlueField2(rev:1)   /dev/mst/mt41686_pciconf0.1    3b:00.1   mlx5_1   net-ens1f1      0
BlueField2(rev:1)   /dev/mst/mt41686_pciconf0      3b:00.0   mlx5_0   net-ens1f0      0
BlueField3(rev:1)   /dev/mst/mt41692_pciconf0.1    e2:00.1   mlx5_1   net-ens7f1np1   4
BlueField3(rev:1)   /dev/mst/mt41692_pciconf0      e2:00.0   mlx5_0   net-ens7f0np0   4
The device IDs for the BlueField-2 and BlueField-3 networking platforms in this example are /dev/mst/mt41686_pciconf0 and /dev/mst/mt41692_pciconf0 respectively.
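When the device path is needed in a script, it can be pulled out of the `mst status -v` output with standard text tools. The following is a minimal sketch; `list_mst_devices` is a hypothetical helper name and is not part of the MST tools. It assumes the output format shown in the example above.

```shell
# Print the MST device paths found in `mst status -v` output read from stdin.
# Paths ending in ".<digits>" (the second PCI function, e.g. pciconf0.1) are
# dropped so that each BlueField is listed once.
# list_mst_devices is a hypothetical helper, not part of the MST tools.
list_mst_devices() {
    grep -o '/dev/mst/[A-Za-z0-9_.]*' | grep -v '\.[0-9][0-9]*$' | sort -u
}

# Usage on the host (requires the MST tools to be installed):
#   mst start
#   mst status -v | list_mst_devices
```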
Display the current setting. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
DEV_NAME usb-1.0
This output indicates that the RShim service is ready to use. If you do not receive this output:
Restart RShim service:
sudo systemctl restart rshim
Verify the current setting again. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
If DEV_NAME does not appear, then proceed to "RShim driver not loading on DPU with integrated BMC".
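The check-and-restart sequence above could be wrapped in a small helper, sketched below. `rshim_ready` is a hypothetical function name, the default path assumes RShim instance 0, and the 5-second pause in the usage comment is an arbitrary choice rather than a documented value.

```shell
# Return 0 if the RShim misc file contains a DEV_NAME entry, non-zero otherwise.
# $1: path to the misc file (defaults to /dev/rshim0/misc, i.e. instance 0).
# rshim_ready is a hypothetical helper name.
rshim_ready() {
    misc_file="${1:-/dev/rshim0/misc}"
    grep -q 'DEV_NAME' "$misc_file" 2>/dev/null
}

# Usage on the BMC:
#   if ! rshim_ready; then
#       sudo systemctl restart rshim
#       sleep 5   # arbitrary pause before re-checking
#       rshim_ready || echo "DEV_NAME still missing, see the troubleshooting section" >&2
#   fi
```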
For a comprehensive list of the supported parameters to customize bf.cfg during BFB installation, refer to section "bf.cfg Parameters".
Ubuntu users are prompted to change the default password (ubuntu) for the default user (ubuntu) upon first login. Even if the login prompt appears, logging in is not possible until all services are up (that is, until the "DPU is ready" message appears in /dev/rshim0/misc).
Attempting to log in before all services are up prints the following message: Permission denied, please try again.
Alternatively, Ubuntu users can provide a unique password that will be applied at the end of the BFB installation. This password must be defined in a bf.cfg configuration file. To set the password for the ubuntu user:
Create password hash. Run:
# openssl passwd -1
Password:
Verifying - Password:
$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1
Add the password hash in quotes to the bf.cfg file:
# vim bf.cfg
ubuntu_PASSWORD='$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1'
The bf.cfg file is used with the bfb-install script in the steps that follow.
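For unattended setups, the two steps above (hashing the password and writing it to bf.cfg) might be scripted as in the following sketch. `write_bf_cfg` is a hypothetical helper name. Note that it overwrites the target file, so any other bf.cfg parameters would need to be added afterwards.

```shell
# Write a bf.cfg that sets the ubuntu user's password from a precomputed hash.
# $1: password hash (for example, the output of `openssl passwd -1`)
# $2: output file (defaults to ./bf.cfg); the file is overwritten.
# write_bf_cfg is a hypothetical helper name.
write_bf_cfg() {
    hash="$1"
    out="${2:-bf.cfg}"
    # Passing the hash through %s keeps its "$" characters literal. The single
    # quotes are written into the file, matching the bf.cfg example above.
    printf "ubuntu_PASSWORD='%s'\n" "$hash" > "$out"
}

# Usage (openssl prompts for the password interactively):
#   write_bf_cfg "$(openssl passwd -1)"
```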
The following table provides the password policy parameters.
Config File Path | Parameter | Value | Description
/etc/security/pwquality.conf | minlen | 12 | Minimum password length
/etc/pam.d/common-password | remember | 3 | The number of previous passwords which cannot be reused
/etc/security/faillock.conf | silent | Uncommented | Prevents printing informative messages to the user
/etc/security/faillock.conf | deny | 10 | The number of authentication attempts permitted before the user is locked out
/etc/security/faillock.conf | unlock_time | 600 | The duration, in seconds, of the lockout period
Each of these parameters is configurable in its respective config file indicated in the "Config File Path" column.
Please refer to the "Default Passwords and Policies" section for more password policy information.
To update the software on the BlueField DPU, the DPU must be booted up without mounting the eMMC flash device. This requires an external boot flow where a BFB (which includes ATF, UEFI, Arm OS, NIC firmware, and initramfs) is pushed from an external host via USB or PCIe. On BlueField DPUs with an integrated BMC, the USB interface is internally connected to the BMC and is enabled by default. Therefore, you must verify that the RShim driver is running on the BMC. This provides the ability to push a bootstream over the USB interface to perform an external boot.
The BFB installation procedure consists of the following main stages:
- Enabling RShim on the BMC. See section "Enable RShim on DPU BMC" for instructions.
- Initiating the BFB update procedure by transferring the BFB image using one of the following options:
  - Direct SCP:
    - Running an SCP command.
  - Redfish interface:
    - Confirming the identity of the host and BMC (required only during first-time setup or after BMC factory reset).
    - Sending a Simple-Update request.
Transferring BFB Image
Since the BFB is too large to store on the BMC flash or tmpfs, the image must be written to the RShim device. This can be done by either running SCP directly or using the Redfish interface.
Redfish Interface
The following is a simple sequence diagram illustrating the flow of the BFB installation process.
The following are detailed instructions outlining each step in the diagram:
Confirm the identity of the remote server (i.e., host holding the BFB image) and BMC.
Info: Required only during first-time setup or after BMC factory reset.
Run the following on the remote server:
ssh-keyscan -t <key_type> <remote_server_ip>
Where:
- key_type – the type of key associated with the server storing the BFB file (e.g., ed25519)
- remote_server_ip – the IP address of the server hosting the BFB file
Retrieve the public key of the host holding the BFB image from the response and provide the remote server's credentials to the DPU using the following command:
curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>", "RemoteServerKeyString":"<remote_server_public_key>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.PublicKeyExchange
Where:
- remote_server_ip – the IP address of the server hosting the BFB file
- remote_server_public_key – remote server's public key from the ssh-keyscan response, which contains both the type and the public key with a space between the two fields (i.e., "<type> <public_key>").
- bmc_ip – BMC IP address
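The key string expected by PublicKeyExchange is the ssh-keyscan output without its leading host field. The following sketch shows one way to derive it and to build the request body. Both function names are hypothetical, and the body is assembled with printf on the assumption that the IP address and base64 key contain no characters that need JSON escaping.

```shell
# Read ssh-keyscan output on stdin and print "<type> <public_key>" for the
# first key found. ssh-keyscan prints one "<host> <type> <public_key>" line
# per key; lines starting with "#" are comments and are skipped.
# keyscan_to_keystring is a hypothetical helper name.
keyscan_to_keystring() {
    awk '!/^#/ && NF >= 3 { print $2, $3; exit }'
}

# Print the JSON body for the PublicKeyExchange action.
# $1: remote server IP, $2: key string in "<type> <public_key>" form.
# public_key_exchange_body is a hypothetical helper name.
public_key_exchange_body() {
    printf '{"RemoteServerIP":"%s", "RemoteServerKeyString":"%s"}\n' "$1" "$2"
}

# Usage on the remote server:
#   key=$(ssh-keyscan -t ed25519 <remote_server_ip> | keyscan_to_keystring)
#   curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST \
#     -d "$(public_key_exchange_body <remote_server_ip> "$key")" \
#     https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.PublicKeyExchange
```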
Extract the BMC public key information (i.e., "<type> <bmc_public_key> <username>@<hostname>") from the PublicKeyExchange response and append it to the authorized_keys file on the host holding the BFB image. This enables passwordless key-based authentication for users.
{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "Please add the following public key info to ~/.ssh/authorized_keys on the remote server",
      "MessageArgs": [
        "<type> <bmc_public_key> root@dpu-bmc"
      ]
    },
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageArgs": [],
      "MessageId": "Base.1.15.0.Success",
      "MessageSeverity": "OK",
      "Resolution": "None"
    }
  ]
}
If the remote server public key must be revoked, use the following command before repeating the previous step:
curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"RemoteServerIP":"<remote_server_ip>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/Oem/NvidiaUpdateService.RevokeAllRemoteServerPublicKeys
Where:
- remote_server_ip – remote server's IP address
- bmc_ip – BMC IP address
Start BFB image transfer using the following command on the remote server:
curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"TransferProtocol":"SCP", "ImageURI":"<image_uri>","Targets":["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"<username>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate
Info: After the BMC boots, it may take a few seconds (6-8 in NVIDIA® BlueField®-2, and 2 in BlueField-3) until the DPU BSP (DPU_OS) is up.
Note: This command uses SCP for the image transfer, initiates a soft reset on the BlueField and then pushes the boot stream. For Ubuntu BFBs, the eMMC is flashed automatically once the bootstream is pushed. On success, a "running" message is received with the current task ID.
Where:
- image_uri – the image URI format should be <remote_server_ip>/<path_to_bfb>
- username – username on the remote server
- bmc_ip – BMC IP address
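Because the SimpleUpdate body combines several parameters, it can be easier to generate than to edit by hand. The following is a sketch only: `simple_update_body` is a hypothetical helper name, the path is inserted exactly as given (in the `<remote_server_ip>/<path_to_bfb>` form described above), and no JSON escaping is performed.

```shell
# Print the JSON body for the SimpleUpdate request shown above.
# $1: remote server IP, $2: path to the BFB on that server, $3: username on
# the remote server. Values are inserted as-is, without JSON escaping.
# simple_update_body is a hypothetical helper name.
simple_update_body() {
    printf '{"TransferProtocol":"SCP", "ImageURI":"%s/%s","Targets":["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"%s"}\n' \
        "$1" "$2" "$3"
}

# Usage on the remote server:
#   curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST \
#     -d "$(simple_update_body <remote_server_ip> <path_to_bfb> <username>)" \
#     https://<bmc_ip>/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate
```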
Examples:
If RShim is disabled:
{
  "error": {
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_1_1.Message",
        "Message": "The requested resource of type Target named '/dev/rshim0/boot' was not found.",
        "MessageArgs": [
          "Target",
          "/dev/rshim0/boot"
        ],
        "MessageId": "Base.1.15.0.ResourceNotFound",
        "MessageSeverity": "Critical",
        "Resolution": "Provide a valid resource identifier and resubmit the request."
      }
    ],
    "code": "Base.1.15.0.ResourceNotFound",
    "message": "The requested resource of type Target named '/dev/rshim0/boot' was not found."
  }
}
If a username or any other required field is missing:
{
  "Username@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The create operation failed because the required property Username was missing from the request.",
      "MessageArgs": [
        "Username"
      ],
      "MessageId": "Base.1.15.0.CreateFailedMissingReqProperties",
      "MessageSeverity": "Critical",
      "Resolution": "Correct the body to include the required property with a valid value and resubmit the request if the operation failed."
    }
  ]
}
If the request is valid and a task is created:
{
  "@odata.id": "/redfish/v1/TaskService/Tasks/<task_id>",
  "@odata.type": "#Task.v1_4_3.Task",
  "Id": "<task_id>",
  "TaskState": "Running",
  "TaskStatus": "OK"
}
Wait 2 seconds and run the following on the host to track image transfer progress:
curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<task_id>
Note: The transfer takes ~8 minutes for BlueField-3, and ~40 minutes for BlueField-2. During the transfer, the PercentComplete value remains at 0. If no errors occur, the TaskState is set to Running, and a keep-alive message is generated every 5 minutes with the content "Transfer is still in progress (X minutes elapsed). Please wait". Once the transfer is completed, the PercentComplete is set to 100, and the TaskState is updated to Completed.
Upon failure, a message is generated with the relevant resolution.
Where:
- bmc_ip – BMC IP address
- task_id – task ID
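Since the transfer can take many minutes, the task query is usually repeated until the state changes. The following sketch polls the task and stops when TaskState is no longer Running. The function names are hypothetical, the 30-second default interval is an arbitrary choice, and the state is extracted with a plain-text match rather than a JSON parser, assuming the response contains a "TaskState": "<value>" pair as in the examples in this section.

```shell
# Print the TaskState value found in a Redfish task response read from stdin.
# Plain-text match, not a JSON parser. task_state is a hypothetical helper.
task_state() {
    sed -n 's/.*"TaskState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll the task until it leaves the Running state.
# $1: BMC IP, $2: task ID, $3: BMC root password,
# $4: poll interval in seconds (defaults to 30, an arbitrary choice).
# Returns 0 on Completed, 1 on any other non-Running state (for example,
# Exception) or when no state could be read. Hypothetical helper.
wait_for_task() {
    while :; do
        state=$(curl -k -s -u "root:$3" -X GET \
            "https://$1/redfish/v1/TaskService/Tasks/$2" | task_state)
        echo "TaskState: ${state:-unknown}"
        case "$state" in
            Completed) return 0 ;;
            Running)   sleep "${4:-30}" ;;
            *)         return 1 ;;
        esac
    done
}

# Usage:
#   wait_for_task <bmc_ip> <task_id> '<password>'
```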
Troubleshooting:
If host identity is not confirmed or the provided host key is wrong:
{
  "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
  "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
  "MessageArgs": [
    "<file_name>",
    "/dev/rshim0/boot"
  ],
  "MessageId": "Update.1.0.TransferFailed",
  "Resolution": "Unknown Host: Please provide server's public key using PublicKeyExchange",
  "Severity": "Critical"
}
…
"PercentComplete": 0,
"StartTime": "<start_time>",
"TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
"TaskState": "Exception",
"TaskStatus": "Critical"
Info: In this case, revoke the remote server key (step 1.d.), and repeat steps 1.a. to 1.c.
If the BMC identity is not confirmed:
{
  "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
  "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
  "MessageArgs": [
    "<file_name>",
    "/dev/rshim0/boot"
  ],
  "MessageId": "Update.1.0.TransferFailed",
  "Resolution": "Unauthorized Client: Please use the PublicKeyExchange action to receive the system's public key and add it as an authorized key on the remote server",
  "Severity": "Critical"
}
…
"PercentComplete": 0,
"StartTime": "<start_time>",
"TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
"TaskState": "Exception",
"TaskStatus": "Critical"
Info: In this case, verify that the BMC key has been added correctly to the authorized_keys file on the remote server.
If SCP fails:
{
  "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
  "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.",
  "MessageArgs": [
    "<file_name>",
    "/dev/rshim0/boot"
  ],
  "MessageId": "Update.1.0.TransferFailed",
  "Resolution": "Failed to launch SCP",
  "Severity": "Critical"
}
…
"PercentComplete": 0,
"StartTime": "<start_time>",
"TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
"TaskState": "Exception",
"TaskStatus": "Critical"
The keep-alive message:
{
  "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
  "Message": "'<file_name>' is being transferred to '/dev/rshim0/boot'.",
  "MessageArgs": [
    "<file_name>",
    "/dev/rshim0/boot"
  ],
  "MessageId": "Update.1.0.TransferringToComponent",
  "Resolution": "Transfer is still in progress (5 minutes elapsed): Please wait",
  "Severity": "OK"
}
…
"PercentComplete": 0,
"StartTime": "<start_time>",
"TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
"TaskState": "Running",
"TaskStatus": "OK"
Upon completion of transfer of the BFB image to the DPU, the following is received:
{
  "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry",
  "Message": "Device 'DPU' successfully updated with image '<file_name>'.",
  "MessageArgs": [
    "DPU",
    "<file_name>"
  ],
  "MessageId": "Update.1.0.UpdateSuccessful",
  "Resolution": "None",
  "Severity": "OK"
},
…
"PercentComplete": 100,
"StartTime": "<start_time>",
"TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor",
"TaskState": "Completed",
"TaskStatus": "OK"
When the BFB transfer is complete, dump the current RShim miscellaneous messages to check the update status.
Info: Refer to section "BMC Dump Operations" under "BMC and BlueField Logs" for information on dumping the rshim.log which contains the current RShim miscellaneous messages.
Verify that the new BFB is running by checking its version:
curl -k -u root:'<password>' -H "Content-Type: application/json" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/DPU_OS
Direct SCP
scp <path_to_bfb> root@<bmc_ip>:/dev/rshim0/boot
If bf.cfg is required as part of the boot process, run:
cat <path_to_bfb> bf.cfg > new.bfb
scp <path to new.bfb> root@<bmc_ip>:/dev/rshim0/boot
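The concatenation step can be guarded so that a missing input file does not silently produce a truncated image. The following is a minimal sketch; `make_bfb_with_cfg` is a hypothetical helper name.

```shell
# Concatenate a BFB image and a bf.cfg file into a new image, as in the
# `cat` command above, after checking that both inputs are readable.
# $1: BFB path, $2: bf.cfg path, $3: output path
# make_bfb_with_cfg is a hypothetical helper name.
make_bfb_with_cfg() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "missing or unreadable input file" >&2
        return 1
    fi
    cat "$1" "$2" > "$3"
}

# Usage:
#   make_bfb_with_cfg <path_to_bfb> bf.cfg new.bfb \
#     && scp new.bfb root@<bmc_ip>:/dev/rshim0/boot
```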
For a comprehensive list of the supported parameters to customize bf.cfg during BFB installation, refer to section "bf.cfg Parameters".
After installation of the Ubuntu OS is complete, the following note appears in /dev/rshim0/misc on first boot:
...
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready
"DPU is ready" indicates that all the relevant services are up and users can log in to the system.
After the installation of the Ubuntu 20.04 BFB, the configuration detailed in the following sections is generated.
Make sure all the services (including cloud-init) have started on BlueField, and perform a graceful shutdown before power cycling the host server.
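One way to confirm that services are up before logging in or power cycling is to wait for the "DPU is ready" message in the RShim misc file. The following sketch does this with a bounded retry loop. The function names, the default of 60 attempts, and the 10-second interval are assumptions, and the sketch presumes the misc file shows the INFO[MISC] lines as in the example above.

```shell
# Return 0 if the misc file contains the "DPU is ready" message.
# $1: misc file path (defaults to /dev/rshim0/misc). Hypothetical helper.
dpu_is_ready() {
    grep -q 'DPU is ready' "${1:-/dev/rshim0/misc}" 2>/dev/null
}

# Wait until the message appears or the attempts run out.
# $1: misc file path, $2: number of attempts (default 60),
# $3: seconds between attempts (default 10). Both defaults are arbitrary.
# Returns 0 when the message is seen, 1 on timeout. Hypothetical helper.
wait_for_dpu_ready() {
    misc_file="${1:-/dev/rshim0/misc}"
    tries="${2:-60}"
    interval="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        if dpu_is_ready "$misc_file"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Usage on the BMC:
#   wait_for_dpu_ready && echo "BlueField is ready for login"
```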
The BlueField OS image version is stored in /etc/mlnx-release on the BlueField:
# cat /etc/mlnx-release
bf-bundle-2.7.0-<version>_ubuntu-22.04_prod
To upgrade firmware:
Access the BlueField using one of the available interfaces (RShim console, BMC console, SSH via oob_net0 or tmfifo_net0 interfaces).
Upgrade the firmware on the DPU. Run:
sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl --force-fw-update
Example output:
Device #1:
----------
  Device Type:      BlueField-2
  [...]
  Versions:         Current        Available
     FW             <Old_FW>       <New_FW>
Note: Important! To apply NVConfig changes, stop here and follow the steps in section "Updating NVConfig Params". In this case, step 3 below is redundant.
Perform a BlueField system reboot for the upgrade to take effect.
Optional. To reset the BlueField NIC firmware configuration (also known as NVConfig parameters) to its factory default values, run the following from the BlueField Arm OS or from the host OS:
# sudo mlxconfig -d /dev/mst/<MST device> -y reset
Reset configuration for device /dev/mst/<MST device>? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Note: For now, ignore the tool's instruction to reboot.
Note: To learn what MST device the BlueField DPU has on your setup, run:
mst start
mst status
Example output taken on a multiple DPU host:
// The MST device corresponds with PCI Bus address.
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
MST devices:
------------
/dev/mst/mt41692_pciconf0   - PCI configuration cycles access.
                              domain:bus:dev.fn=0000:03:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                              Chip revision is: 01
/dev/mst/mt41692_pciconf1   - PCI configuration cycles access.
                              domain:bus:dev.fn=0000:83:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                              Chip revision is: 01
/dev/mst/mt41686_pciconf0   - PCI configuration cycles access.
                              domain:bus:dev.fn=0000:a3:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                              Chip revision is: 01
The MST device IDs for the BlueField-2 and BlueField-3 DPUs in this example are /dev/mst/mt41686_pciconf0 and /dev/mst/mt41692_pciconf0 respectively.
(Optional) Enable NVMe emulation. Run:
sudo mlxconfig -d <MST device> -y s NVME_EMULATION_ENABLE=1
Skip this step if your BlueField DPU is Ethernet only. Please refer to section "Supported Platforms and Interoperability" under the Release Notes to learn your DPU type.
If you have a VPI DPU, the default link type of the ports will be configured to IB. If you want to change the link type to Ethernet, please run the following configuration:
sudo mlxconfig -d <MST device> -y s LINK_TYPE_P1=2 LINK_TYPE_P2=2
- Perform a BlueField system-level reset for the new settings to take effect.