Upgrading BlueField Software Components Using PLDM
The PLDM firmware update protocol provides a standardized, out-of-band (OOB) method for upgrading firmware components on NVIDIA® BlueField® devices. A platform update agent (typically the server's BMC) transfers the firmware image to the target device and activates it.
Each PLDM image is specific to a given BlueField-3 SKU. The BlueField-3 PLDM firmware image includes:
- NIC firmware
- ATF/UEFI
- BMC firmware
- CEC firmware
The PLDM image does not include the Arm OS or DOCA software.
- The currently installed software must be BSP 4.11.0 / DOCA 3.0.0 or later.
- UEFI expansion ROM must be enabled (default is enabled).
  - On Arm hosts:
    EXP_ROM_UEFI_ARM_ENABLE=1
  - On x86 hosts:
    EXP_ROM_UEFI_x86_ENABLE=1
- Auto-shutdown for the embedded CPU must be enabled (one-time, non-volatile configuration).
mlxconfig -d /dev/mst/<device> set INT_CPU_AUTO_SHUTDOWN=1
Note: This must be configured in advance, as it requires a reset to take effect.
- When operating in DPU mode, DPU-BMC credentials are required to update the BMC and CEC firmware.
  - Credentials must be specified in /etc/bf-upgrade.conf on the Arm OS.
  - The file follows the same format as bf.cfg.
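The mlxconfig-related prerequisites above can be checked before starting a transfer. The following is a minimal sketch, not a vendor-provided tool: the function names `param_enabled` and `check_prereqs` are hypothetical, and it assumes that `mlxconfig ... query` prints each parameter as a name followed by a value ending in `(1)` when enabled (for example `True(1)`). Adjust the pattern if your MFT version formats the output differently.

```shell
#!/bin/sh
# Sketch: verify PLDM-update prerequisites from `mlxconfig query` output.
# ASSUMPTION: each parameter appears as "<NAME> <Value>(<n>)" on its own line,
# and an enabled boolean parameter ends in "(1)".

# param_enabled NAME
# Reads query output on stdin; succeeds if NAME is listed with a value ending in "(1)".
param_enabled() {
    awk -v name="$1" '$1 == name && $2 ~ /\(1\)$/ { found = 1 } END { exit !found }'
}

# check_prereqs HOST_ARCH
# HOST_ARCH is "arm" or "x86" (hypothetical argument naming the host type).
# Reads query output on stdin, prints one line per parameter, and
# returns non-zero if any required parameter is not enabled.
check_prereqs() {
    case "${1:-}" in
        arm) rom_param=EXP_ROM_UEFI_ARM_ENABLE ;;
        x86) rom_param=EXP_ROM_UEFI_x86_ENABLE ;;
        *)   echo "usage: check_prereqs arm|x86" >&2; return 2 ;;
    esac
    query_output=$(cat)
    rc=0
    for p in "$rom_param" INT_CPU_AUTO_SHUTDOWN; do
        if printf '%s\n' "$query_output" | param_enabled "$p"; then
            echo "ok: $p"
        else
            echo "not enabled: $p"
            rc=1
        fi
    done
    return "$rc"
}
```

A possible invocation on an x86 host is `mlxconfig -d /dev/mst/<device> query | check_prereqs x86`. The check only reads configuration; enabling a missing parameter still requires the `mlxconfig ... set` command and a reset.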
After the platform BMC transfers the PLDM firmware image and issues the ActivateFirmware command, the update must be applied using one of the methods below.
Triggering the Update
The update trigger depends on the operation mode of the BlueField device:
- NIC mode: The update is handled entirely by the platform update agent.
- DPU mode: Linux executes on the embedded Arm cores. PLDM firmware updates are handled by the /etc/acpi/actions/bf-upgrade script, which is triggered via ACPI events.
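Because DPU mode depends on the ACPI handler script being present on the Arm OS, it can be worth confirming that before starting an update. This is a small sketch under that assumption; `handler_ready` is a hypothetical helper name, and the only path it knows about is the one named above.

```shell
#!/bin/sh
# Sketch: confirm the ACPI-triggered upgrade handler exists and is executable.
# handler_ready [PATH]
# PATH defaults to the handler script named in this section.
handler_ready() {
    handler=${1:-/etc/acpi/actions/bf-upgrade}
    if [ -x "$handler" ]; then
        echo "ready: $handler"
    else
        echo "missing or not executable: $handler" >&2
        return 1
    fi
}
```

Run `handler_ready` on the Arm OS with no arguments to check the default location. A non-zero exit status suggests that the handler is not installed as expected.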
Activating the New Firmware
Once the image is transferred, choose one of the following methods to apply the update:
Option A: Cold Boot (AC/DC Power Cycle)
On the next power cycle, the firmware update is automatically applied during boot.
Ensure the Arm cores are gracefully shut down before powering off the server.
Option B: Standard Warm Reboot
A standard warm reboot will not trigger the update unless the Arm OS is shut down first. To perform a standard warm reboot update:
- Gracefully shut down the Arm OS manually.
- Initiate a server warm reboot.
- The system will reset and update the BlueField DPU NIC and Arm complex.
Without the reset trigger or manual shutdown, warm reboot events are ignored by the BlueField device.
Option C: Coordinated Reset (Server and DPU)
With coordinated reset enabled, the next server warm reboot will automatically reset and update the BlueField NIC and Arm complex, reducing downtime.
Without the reset trigger (or a manual graceful shutdown of the Arm OS), the BlueField device will ignore the server's warm reboot event and the firmware update will not be applied.
After the PLDM update is complete and a pending firmware image exists, run the following command on the host to trigger a coordinated reset:
mlxreg -d /dev/mst/<device> -y --set "reset_trigger=c" --reg_name=MFRL
On the next warm reboot, BlueField will:
- Shut down the Arm cores.
- Reset the NIC, Arm complex, and BMC.
- Boot from the new firmware image.
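The coordinated-reset command for Option C can be wrapped with a dry-run guard so that it is reviewed before it touches the device. This is a sketch only: `arm_coordinated_reset` and the `DRY_RUN` variable are hypothetical names, and the register name, field, and value are copied from the command in this section. The sketch closes the quote after the field assignment so that `--reg_name` is passed as a separate option, which is an assumption about how `mlxreg` expects its arguments.

```shell
#!/bin/sh
# Sketch: arm the coordinated reset, printing the command by default.
# arm_coordinated_reset DEVICE
# DEVICE is the MST device path, e.g. /dev/mst/<device>.
# DRY_RUN (hypothetical): "1" (default) prints the command; "0" executes it.
arm_coordinated_reset() {
    dev=${1:-}
    if [ -z "$dev" ]; then
        echo "usage: arm_coordinated_reset /dev/mst/<device>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Show what would be run without touching the device.
        printf 'mlxreg -d %s -y --set "reset_trigger=c" --reg_name=MFRL\n' "$dev"
    else
        mlxreg -d "$dev" -y --set "reset_trigger=c" --reg_name=MFRL
    fi
}
```

Call the function once to inspect the printed command, then repeat with `DRY_RUN=0` on the host to set the trigger before the next warm reboot.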