DPU Management (Day 2)

After the DPU has been provisioned and deployed for the first time, several activities are required periodically throughout its lifecycle.

Note

Before proceeding, review section "Management Methods" to learn about the recommended management methods for specific tasks on the BlueField DPU.

The dpu-upgrade procedure upgrades DOCA components using standard Linux tools (e.g., apt update and yum update). It uses native package manager repositories to upgrade DPUs without a full installation, and has the following properties:

  • Only updates components that include modifications

    • Configurable – user can select specific components (e.g., UEFI-ATF, NIC-FW)

  • Includes upgrade of:

    • DOCA drivers and libraries

    • DOCA reference applications

    • BSP (UEFI/ATF) upgrade while maintaining the configuration

    • NIC firmware upgrade while maintaining the configuration

  • Does not:

    • Impact user binaries

    • Upgrade non-Ubuntu OS kernels

    • Upgrade DPU BMC firmware

  • After completion of DPU upgrade:

    • If NIC firmware was not updated, perform DPU Arm reset (software reset / reboot DPU)

    • If NIC firmware was updated, perform firmware reset (mlxfwreset) or perform a graceful shutdown and power cycle
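The post-upgrade rule above can be sketched as a small shell helper. This is only an illustration: post_upgrade_action is a hypothetical name, not a DOCA tool, and it prints the recommended action rather than performing it.

```shell
#!/usr/bin/env bash
# Sketch: map "was NIC firmware updated?" to the follow-up action listed above.
# post_upgrade_action is a hypothetical helper name (an assumption).
post_upgrade_action() {
    case "$1" in
        yes) echo "firmware reset (mlxfwreset) or graceful shutdown and power cycle" ;;
        no)  echo "DPU Arm reset (software reset / reboot DPU)" ;;
        *)   echo "usage: post_upgrade_action yes|no" >&2; return 2 ;;
    esac
}
```

For example, post_upgrade_action no prints the Arm reset recommendation.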

The steps below are grouped by OS. Each step lists the action, followed by its instructions.

Ubuntu/Debian

Remove mlxbf-bootimages package

<dpu> $ apt remove --purge mlxbf-bootimages* -y

Install gnupg2, which is needed to add the GPG key

<dpu> $ apt update
<dpu> $ apt install gnupg2

Export the desired distribution

Export DOCA_REPO with the relevant URL. The following is an example for Ubuntu 22.04:

<dpu> $ export DOCA_REPO="https://linux.mellanox.com/public/repo/doca/2.5.0/ubuntu22.04/dpu-arm64"
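The repository URL can also be assembled from its parts. The sketch below assumes that other DOCA versions and distributions follow the same path layout as the example above, which should be confirmed against the repository before use. doca_repo_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: build DOCA_REPO from a version and a distribution string.
# Assumption: the path layout matches the 2.5.0/ubuntu22.04 example above.
doca_repo_url() {
    [ "$#" -eq 2 ] || { echo "usage: doca_repo_url <version> <distro>" >&2; return 2; }
    local version="$1" distro="$2"
    echo "https://linux.mellanox.com/public/repo/doca/${version}/${distro}/dpu-arm64"
}

# Reproduces the Ubuntu 22.04 example from this section.
export DOCA_REPO="$(doca_repo_url 2.5.0 ubuntu22.04)"
```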

Add GPG key to APT trusted keyring

<dpu> $ curl $DOCA_REPO/GPG-KEY-Mellanox.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub

Add DOCA online repository

<dpu> $ echo "deb [signed-by=/etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub] $DOCA_REPO ./" > /etc/apt/sources.list.d/doca.list

Update index

<dpu> $ apt update

Upgrade UEFI/ATF firmware

Run:

<dpu> $ apt install mlxbf-bootimages-signed

Then initiate the upgrade for UEFI/ATF firmware:

<dpu> $ bfrec

Upgrade BlueField DPU NIC firmware

Run:

<dpu> $ apt install mlnx-fw-updater-signed

To prevent automatic upgrade, run:

<dpu> $ export RUN_FW_UPDATER=no

Upgrade system

<dpu> $ apt upgrade

Apply the new changes to the NIC firmware and UEFI/ATF

<dpu> $ mlxfwreset -d /dev/mst/mt*_pciconf0 -y -l 3 --sync 1 r

CentOS/RHEL/Anolis/Rocky

Remove mlxbf-bootimages package

<dpu> $ yum -y remove mlxbf-bootimages*
<dpu> $ yum makecache

Export the desired distribution

Export DOCA_REPO with the relevant URL. The following is an example for Rocky Linux 8.6:

<dpu> $ export DOCA_REPO="https://linux.mellanox.com/public/repo/doca/2.5.0/rhel8.6/dpu-arm64/"

Add DOCA online repository

<dpu> $ echo "[doca]
name=DOCA Online Repo
baseurl=$DOCA_REPO
enabled=1
gpgcheck=0" > /etc/yum.repos.d/doca.repo

This creates the file /etc/yum.repos.d/doca.repo.
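A yum repository file needs one key per line, so it can help to generate it with a here-document and inspect the result before placing it under /etc/yum.repos.d. The following is a minimal sketch; write_doca_repo is a hypothetical helper name, and the destination path is passed in so the output can first be written to a scratch location.

```shell
#!/usr/bin/env bash
# Sketch: write the DOCA repo definition, one key per line.
# write_doca_repo is a hypothetical helper (an assumption, not a DOCA tool).
# Usage: write_doca_repo <baseurl> <destination-file>
write_doca_repo() {
    local baseurl="$1" dest="$2"
    cat > "$dest" <<EOF
[doca]
name=DOCA Online Repo
baseurl=${baseurl}
enabled=1
gpgcheck=0
EOF
}
```

On the DPU, this would be invoked as write_doca_repo "$DOCA_REPO" /etc/yum.repos.d/doca.repo.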

Update index

<dpu> $ yum makecache

Upgrade UEFI/ATF firmware

Run:

<dpu> $ yum install mlxbf-bootimages-signed.aarch64

Then initiate the upgrade for UEFI/ATF firmware:

<dpu> $ bfrec

Upgrade BlueField DPU NIC firmware

Run:

<dpu> $ yum install mlnx-fw-updater-signed.aarch64

To prevent automatic upgrade, run:

<dpu> $ export RUN_FW_UPDATER=no

Prevent kernel upgrades

<dpu> $ yum versionlock kernel*

Upgrade system

<dpu> $ yum upgrade --nobest

Apply the new changes to the NIC firmware and UEFI/ATF

<dpu> $ mlxfwreset -d /dev/mst/mt*_pciconf0 -y -l 3 --sync 1 r

Users may want to reset the DPU to factory defaults. Doing so requires resetting the DPU BMC, the DPU UEFI, the NIC, and the Arm to their defaults. Follow the steps in the subsections below.

Step 1 – Reset DPU BMC to Factory Default

  1. Run the following command:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST https://<DPU-BMC-IP>/redfish/v1/Managers/Bluefield_BMC/Actions/Manager.ResetToDefaults -d '{"ResetToDefaultsType": "ResetAll"}'
    {
      "@Message.ExtendedInfo": [
        {
          "@odata.type": "#Message.v1_1_1.Message",
          "Message": "The request completed successfully.",
          "MessageArgs": [],
          "MessageId": "Base.1.15.0.Success",
          "MessageSeverity": "OK",
          "Resolution": "None"
        }
      ]
    }

  2. Reboot the BMC for the factory reset to take effect:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"ResetType": "GracefulRestart"}' https://<DPU-BMC-IP>/redfish/v1/Managers/Bluefield_BMC/Actions/Manager.Reset
    {
      "@Message.ExtendedInfo": [
        {
          "@odata.type": "#Message.v1_1_1.Message",
          "Message": "The request completed successfully.",
          "MessageArgs": [],
          "MessageId": "Base.1.13.0.Success",
          "MessageSeverity": "OK",
          "Resolution": "None"
        }
      ]
    }
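The two Redfish calls above can be wrapped in a small script so that the BMC address and password are supplied once. This is a sketch under assumptions: redfish_url, bmc_post, and bmc_factory_reset are hypothetical helper names, the endpoints and request bodies are copied from the commands above, and the restart is only attempted if the reset request succeeds at the curl level.

```shell
#!/usr/bin/env bash
# Sketch: wrap the two BMC factory-reset requests shown above.
# All function names here are hypothetical helpers, not part of any NVIDIA tool.

# Build a Redfish URL from the BMC address and a path below /redfish/v1/.
redfish_url() {
    local bmc_ip="$1" path="$2"
    echo "https://${bmc_ip}/redfish/v1/${path#/}"
}

# bmc_post <bmc-ip> <password> <path> <json-body>
bmc_post() {
    curl -k -u "root:$2" -H "Content-Type: application/json" \
         -X POST -d "$4" "$(redfish_url "$1" "$3")"
}

# bmc_factory_reset <bmc-ip> <password>
# Step 1 resets the BMC to defaults; step 2 reboots it so the reset takes effect.
bmc_factory_reset() {
    bmc_post "$1" "$2" "Managers/Bluefield_BMC/Actions/Manager.ResetToDefaults" \
             '{"ResetToDefaultsType": "ResetAll"}' &&
    bmc_post "$1" "$2" "Managers/Bluefield_BMC/Actions/Manager.Reset" \
             '{"ResetType": "GracefulRestart"}'
}
```

Defining the functions sends nothing; a request is only issued when bmc_factory_reset is called with a BMC address and password.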

Step 2 – Sanitize DPU eMMC and SSD Storage

During the BFB installation process, DPU storage can be securely sanitized using either the shred utility or the mmc and nvme utilities, invoked from the bf.cfg configuration file as illustrated in the following subsections.

Warning

By default, only the installation target storage is formatted using the Linux mkfs utility.

Using shred Utility

# cat bf.cfg
SANITIZE_DONE=${SANITIZE_DONE:-0}
export SANITIZE_DONE
if [ $SANITIZE_DONE -eq 0 ]; then
    sleep 3m
    /sbin/modprobe nvme

    if [ -e /dev/mmcblk0 ]; then
        echo Sanitizing /dev/mmcblk0 | tee /dev/kmsg
        echo Sanitizing /dev/mmcblk0 > /tmp/sanitize.emmc.log
        mmc sanitize /dev/mmcblk0 >> /tmp/sanitize.emmc.log 2>&1
    fi

    if [ -e /dev/nvme0n1 ]; then
        echo Sanitizing /dev/nvme0n1 | tee /dev/kmsg
        echo Sanitizing /dev/nvme0n1 > /tmp/sanitize.ssd.log
        nvme sanitize /dev/nvme0n1 -a 2 >> /tmp/sanitize.ssd.log 2>&1
        nvme sanitize-log /dev/nvme0n1 >> /tmp/sanitize.ssd.log 2>&1
    fi

    SANITIZE_DONE=1
    echo ===================== sanitize.log ===================== | tee /dev/kmsg
    cat /tmp/sanitize.*.log | tee /dev/kmsg
    sync
fi

bfb_modify_os()
{
    echo ===================== bfb_modify_os ===================== | tee /dev/kmsg
    if ( /bin/ls -1 /tmp/sanitize.*.log > /dev/null 2>&1 ); then
        cat /tmp/sanitize.*.log > /mnt/root/sanitize.log
    fi
}


Using mmc and nvme Utilities

# cat bf.cfg
SANITIZE_DONE=${SANITIZE_DONE:-0}
export SANITIZE_DONE
if [ $SANITIZE_DONE -eq 0 ]; then
    sleep 3m
    /sbin/modprobe nvme

    if [ -e /dev/mmcblk0 ]; then
        echo Sanitizing /dev/mmcblk0 | tee /dev/kmsg
        echo Sanitizing /dev/mmcblk0 > /tmp/sanitize.emmc.log
        mmc sanitize /dev/mmcblk0 >> /tmp/sanitize.emmc.log 2>&1
    fi

    if [ -e /dev/nvme0n1 ]; then
        echo Sanitizing /dev/nvme0n1 | tee /dev/kmsg
        echo Sanitizing /dev/nvme0n1 > /tmp/sanitize.ssd.log
        nvme sanitize /dev/nvme0n1 -a 2 >> /tmp/sanitize.ssd.log 2>&1
        nvme sanitize-log /dev/nvme0n1 >> /tmp/sanitize.ssd.log 2>&1
    fi

    SANITIZE_DONE=1
    echo ===================== sanitize.log ===================== | tee /dev/kmsg
    cat /tmp/sanitize.*.log | tee /dev/kmsg
    sync
fi

bfb_modify_os()
{
    echo ===================== bfb_modify_os ===================== | tee /dev/kmsg
    if ( /bin/ls -1 /tmp/sanitize.*.log > /dev/null 2>&1 ); then
        cat /tmp/sanitize.*.log > /mnt/root/sanitize.log
    fi
}
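The SANITIZE_DONE guard in the bf.cfg script relies on the shell's default-value expansion: the variable starts at 0 unless it is already set, and is switched to 1 after the first pass. Presumably this keeps the sanitize block from running again if the file is evaluated more than once, although that is an assumption about the installer. The sketch below isolates the pattern, with a placeholder echo instead of the mmc and nvme commands; run_once is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch of the run-once guard used in bf.cfg; no storage is touched here.
SANITIZE_DONE=${SANITIZE_DONE:-0}   # default to 0 when unset or empty
export SANITIZE_DONE                # exported so child processes see the flag

# run_once is a hypothetical wrapper used only for this illustration.
run_once() {
    if [ "$SANITIZE_DONE" -eq 0 ]; then
        echo "sanitizing"           # placeholder for the mmc/nvme commands
        SANITIZE_DONE=1             # later calls take the else branch
    else
        echo "skipped"
    fi
}
```

Calling run_once twice in the same shell executes the guarded branch only the first time.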

Step 3 – Reset UEFI to Factory Default

Use the Redfish BIOS Settings PATCH command:

curl -k -u root:'<password>' -X PATCH -d '{"Attributes":{"ResetEfiVars": true}}' https://<DPU-BMC-IP>/redfish/v1/Systems/<SystemID>/Bios/Settings | python3 -m json.tool


© Copyright 2023, NVIDIA. Last updated on Jan 10, 2024.