NVIDIA BlueField BSP v4.13.0

Deploying BlueField Software from Host

Info

It is recommended to upgrade your BlueField product to the latest software and firmware versions available to benefit from new features and latest bug fixes.

Note

This procedure assumes that an NVIDIA® BlueField® networking platform (DPU or SuperNIC) has already been installed in a server according to the instructions detailed in the BlueField device's hardware user guide.

Stage: Preparation [1]

Procedure: Preparation – install RShim

Flow:

  1. Install DOCA on the host or install RShim on the host.

  2. Verify that RShim is running on the host.

Stage: Update options [2]

Procedure: Offline update – install BFB to BlueField [3]

Flow:

  3. Install the BFB image (full or firmware BFB).

  4. Verify that installation completed successfully.

Procedure: Deferred update – no-service-interruption update flow [4]

Flow:

  3. Install the per-SKU BF-FW-Bundle in a deferred flow.

  4. Verify that installation completed successfully.

  5. Perform a deferred update of the DOCA components.

  6. Apply the new version.

Table notes:

  [1] Preparation steps 1 and 2 are common to both offline and deferred update flows.

  [2] The offline and deferred update flows are mutually exclusive methods. Perform the flow that matches your use case.

  [3] Recommended for Day 1 operations.

  [4] Recommended for Day 2 operations.

To update the BlueField, the host server must have the RShim service running. You can install the RShim service using one of the following two methods.

Option 1: Full DOCA SDK Installation

This method is for users who want the complete DOCA software development kit. The full SDK installation includes the doca-runtime package (which provides RShim) along with all other DOCA libraries and tools.

Refer to DOCA-Host Installation and Upgrade in DOCA documentation for instructions.

Option 2: Minimal RShim Installation (via doca-runtime)

This method is for users who only need the RShim service (e.g., for firmware updates) and do not need the full DOCA SDK. The RShim service is included in the doca-runtime package.

Verify RShim Device

Before installing the RShim driver, verify that the RShim devices, which will be probed by the driver, are listed under lsusb or lspci.

lspci | grep -i nox

Output example:

27:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller
27:00.1 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller
27:00.2 Non-Volatile memory controller: Mellanox Technologies NVMe SNAP Controller
27:00.3 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface // This is the RShim PF
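To pick out the RShim function from a longer listing, the description string can be matched directly. The following is a minimal sketch, assuming the RShim PF is described as "SoC Management Interface" as in the sample output above; the function name find_rshim_pf is hypothetical and not part of any NVIDIA tool.

```shell
# Print the PCIe address (first field) of every lspci line whose description
# contains "SoC Management Interface".
# Assumption: this string identifies the RShim PF, as in the sample output.
find_rshim_pf() {
    grep -i 'SoC Management Interface' | awk '{print $1}'
}

# Usage on a host:
#   lspci | find_rshim_pf
```

If nothing is printed, the RShim PF is not visible over PCIe and the lsusb listing should be checked instead.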

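RShim is provided as part of the doca-runtime package, which is installed from the DOCA host repo file (doca-host-repo-ubuntu<version>_amd64.deb on DEB-based systems or doca-host-repo-rhel<version>.x86_64.rpm on RPM-based systems).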

Install doca-runtime

Follow the steps for your OS to install the doca-runtime package.

OS

Procedure

Ubuntu/Debian

  1. Download the DOCA host repo from the NVIDIA DOCA Downloads page.

  2. Unpack the deb repo. Run:

    host# sudo dpkg -i doca-host-repo-ubuntu<version>_amd64.deb

  3. Perform apt update. Run:

    host# sudo apt-get update

  4. Run apt install for DOCA runtime package.

    host# sudo apt install doca-runtime

CentOS/RHEL 7.x

  1. Download the DOCA host repo from the NVIDIA DOCA Downloads page.

  2. Unpack the RPM repo. Run:

    host# sudo rpm -Uvh doca-host-repo-rhel<version>.x86_64.rpm

  3. Enable new yum repos. Run:

    host# sudo yum makecache

  4. Run yum install to install DOCA runtime package.

    host# sudo yum install doca-runtime

CentOS/RHEL 8.x or Rocky 8.6

  1. Download the DOCA host repo from the NVIDIA DOCA Downloads page.

  2. Unpack the RPM repo. Run:

    host# sudo rpm -Uvh doca-host-repo-rhel<version>.x86_64.rpm

  3. Enable new dnf repos. Run:

    host# sudo dnf makecache

  4. Run dnf install to install DOCA runtime.

    host# sudo dnf install doca-runtime

Ensure RShim Running on Host

After installing either the full DOCA SDK (Option 1) or the minimal doca-runtime package (Option 2), you must verify that the RShim service is active.

  1. Verify RShim status. Run:

    sudo systemctl status rshim

    Expected output:

    active (running)
    ...
    Probing pcie-0000:<BlueField's PCIe Bus address on host>
    create rshim pcie-0000:<BlueField's PCIe Bus address on host>
    rshim<N> attached

    • <N> denotes RShim device number (0, 1, 2, etc).

    • If the text "another backend already attached" is displayed, you will not be able to use RShim on the host.

    • If the command displays inactive or another error, restart the RShim service. Run:

      sudo systemctl restart rshim

    • Verify RShim status again. Run:

      sudo systemctl status rshim

  2. Confirm that the RShim device is attached. Run:

    # cat /dev/rshim<N>/misc | grep DEV_NAME
    DEV_NAME pcie-0000:04:00.2

    This output indicates that the RShim service is ready to use.
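On a host with several BlueField devices, the same check can be repeated for each RShim device. The following is a minimal sketch, assuming the misc file contains a line of the form "DEV_NAME <value>" as shown above; the function names are hypothetical, and reading the misc files typically requires root.

```shell
# Print the DEV_NAME value from an RShim misc file.
# $1: path to the misc file (for example /dev/rshim0/misc).
rshim_dev_name() {
    awk '$1 == "DEV_NAME" { print $2 }' "$1"
}

# Print "<misc path> <DEV_NAME>" for every RShim device present on the host.
list_rshim_devices() {
    for misc in /dev/rshim*/misc; do
        [ -e "$misc" ] || continue   # no RShim devices: the glob did not match
        printf '%s %s\n' "$misc" "$(rshim_dev_name "$misc")"
    done
}
```

An empty DEV_NAME column for a device suggests that RShim has not attached to it.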

The offline update flow updates the BlueField from the host OS using the BFB bundle image. It interrupts the services currently running on the BlueField.

BFB Image Types

The BFB image is available in two formats:

  • BF-Bundle – includes BlueField firmware and BlueField Arm OS as well as DOCA

  • BF-FW-Bundle – includes BlueField firmware only

Select the image that matches your use case.

Info

Both image file names end with .bfb.


Downloading the BFB Image

To download the BFB image (BF-Bundle or BF-FW-Bundle), go to the NVIDIA DOCA Downloads page.

BFB Installation

This section describes how to use the bfb-install utility to install a BFB image.

Prerequisites and Key Considerations

Before starting the installation, be aware of the following:

  • Secure boot: All new BlueField-2 and all BlueField-3 devices are secure boot enabled. All software images (ATF/UEFI, Linux Kernel, etc.) must be signed to boot. All formally published software images are signed.

  • NIC Mode host reset: After a successful BFB installation in NIC mode, you must perform a power cycle on the host to apply the changes.

Installation Command

The BFB image is installed using the bfb-install utility, which is included in the RShim package.

# bfb-install -h
syntax: bfb-install --bfb|-b <BFBFILE> [--config|-c <bf.cfg>] \
  --rshim|-r <rshimN> [--help|-h]

This utility pushes the BFB image and any optional bf.cfg file to the BlueField, then checks and prints the installation progress.

Monitoring the Installation

  • Live progress: To see a live progress bar during the image transfer, install the pv Linux tool.

  • RShim log: By default, bfb-install clears the RShim log at /dev/rshim<N>/misc and saves a copy to /tmp/bfb-install-rshim[N].log. To prevent the log from being cleared, use the --keep-log argument.

Critical Warning: Do Not Restart During Installation

Warning

BFB image installation must complete before restarting the system or the BlueField! Restarting or interrupting the process may result in anomalous behavior (e.g., the BlueField may not be accessible via SSH). If this happens, re-initiate the update process with bfb-install to recover the BlueField.


Optional Configurations (bf.cfg)

You can customize the installation by providing a bf.cfg file.

  • Updating BMC firmware: To update the BMC firmware using the BFB, you must provide the current BMC credentials and set BMC_REBOOT=yes and CEC_REBOOT=yes in the bf.cfg. This forces an automatic reset of the DPU-BMC after the installation is complete.

  • Skipping NIC firmware update: The BFB installation updates the NIC firmware by default, which triggers a reset. If this reset flow is not supported on your setup, you can skip the NIC update by providing WITH_NIC_FW_UPDATE=no in the bf.cfg file. If you do update the NIC and bfb-install alerts you that the reset failed, you must perform a BlueField System-level Reset.

Refer to "Customizing BlueField Software Deployment" for more information.
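The options above can be collected into a bf.cfg file before running bfb-install. The following is a minimal sketch that writes only the parameters named in this section; the function name write_bf_cfg is hypothetical, and the BMC credential parameters are deliberately left out because their names are not given here (see "Customizing BlueField Software Deployment").

```shell
# Write a bf.cfg that requests BMC/CEC reboot and optionally skips the
# NIC firmware update.
# $1: output path
# $2: "no" to skip the NIC firmware update; anything else keeps the default
# Note: the BMC credentials required for the BMC update must be added
# separately; their parameter names are not covered by this sketch.
write_bf_cfg() {
    {
        echo 'BMC_REBOOT=yes'
        echo 'CEC_REBOOT=yes'
        if [ "${2:-yes}" = "no" ]; then
            echo 'WITH_NIC_FW_UPDATE=no'
        fi
    } > "$1"
}

# Usage:
#   write_bf_cfg bf.cfg no
#   bfb-install --bfb <BFBFILE> --config bf.cfg --rshim rshim0
```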

Example Installation Output

The following is an example output of a BFB installation, assuming pv is installed:

# bfb-install --bfb <BlueField-BSP>.bfb --config bf.cfg --rshim rshim0
Pushing bfb + cfg
1.46GiB 0:01:11 [20.9MiB/s] [ <=> ]
Collecting BlueField booting status. Press Ctrl+C to stop…
INFO[PSC]: PSC BL1 START
INFO[BL2]: start
INFO[BL2]: boot mode (rshim)
INFO[BL2]: VDDQ: 1120 mV
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle Production
INFO[BL31]: MB8: VDD adjustment complete
INFO[BL31]: VDD: 743 mV
INFO[BL31]: power capping disabled
INFO[BL31]: runtime
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: UPVS valid
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 1
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: UEFI Secure Boot (disabled)
INFO[UEFI]: exit Boot Service
INFO[MISC]: : Found bf.cfg
INFO[MISC]: : Ubuntu installation started
INFO[MISC]: bfb_pre_install
INFO[MISC]: Installing OS image
INFO[MISC]: : Changing the default password for user ubuntu
INFO[MISC]: : Running bfb_modify_os from bf.cfg
INFO[MISC]: : Ubuntu installation finished

Verify BFB Install Completed Successfully

In DPU mode, after installation of the Ubuntu OS is complete, the following note appears in /dev/rshim0/misc on first boot:

...
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready

DPU is ready indicates that all the relevant services are up, and users can log into the system.
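A script that needs to wait for this message can poll the misc file. The following is a minimal sketch, assuming the log lines are visible when the file is read and that a fixed timeout is acceptable; the function name wait_for_dpu_ready and the default of 600 seconds are choices made for this example only.

```shell
# Poll an RShim misc log until "DPU is ready" appears or the timeout expires.
# $1: path to the misc file (for example /dev/rshim0/misc)
# $2: timeout in seconds (default 600, an arbitrary value for this sketch)
# Returns 0 when the message is found, 1 on timeout.
wait_for_dpu_ready() {
    misc=$1
    remaining=${2:-600}
    while [ "$remaining" -gt 0 ]; do
        if grep -q 'DPU is ready' "$misc" 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Usage:
#   wait_for_dpu_ready /dev/rshim0/misc 900 && echo "BlueField is up"
```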

After the installation of the Ubuntu 22.04 BFB, the configuration detailed in the following sections is generated.

Note

Make sure that all services (including cloud-init) have started on the BlueField, and perform a graceful shutdown before power cycling the host server.

The BlueField OS image version is stored in /etc/mlnx-release on the BlueField:

# cat /etc/mlnx-release
bf-bundle-2.9.0-<version>_ubuntu-22.04_prod

Check the NIC firmware version from the host and make sure the new version is applied:

# flint -d /dev/mst/mt41692_pciconf0 q
Image type:            FS4
FW Version:            32.43.0366
FW Version(Running):   32.43.0318
FW Release Date:       12.10.2024
Product Version:       32.43.0318
Rom Info:              type=UEFI Virtio net version=21.4.13 cpu=AMD64,AARCH64
                       type=UEFI Virtio blk version=22.4.14 cpu=AMD64,AARCH64
                       type=UEFI version=14.36.12 cpu=AMD64,AARCH64
                       type=PXE version=3.7.500 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             c470bd0300cbe708        38
Base MAC:              c470bdcbe708            38
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_0000000001
Security Attributes:   secure-fw

If the installed NIC firmware version (FW Version) differs from the running firmware version (FW Version(Running)), as is the case in this example, a BlueField system-level reset is required.
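The comparison can be automated by reading the two version fields from the flint query output. The following is a minimal sketch, assuming the field labels appear exactly as in the example above and that a missing "FW Version(Running)" line means no reset is pending; the function name fw_reset_needed is hypothetical.

```shell
# Read `flint ... q` output on stdin.
# Returns 0 if a reset is needed (running version differs from installed),
# 1 otherwise.
fw_reset_needed() {
    awk '
        $1 == "FW" && $2 == "Version:"          { pending = $3 }
        $1 == "FW" && $2 == "Version(Running):" { running = $3 }
        END {
            if (running != "" && running != pending) exit 0
            exit 1
        }'
}

# Usage:
#   flint -d /dev/mst/mt41692_pciconf0 q | fw_reset_needed && echo "reset required"
```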

Info

To verify the version of the installed BMC components, refer to the BMC documentation.

In NIC mode, verify the NIC firmware and BMC components versions using Redfish.

Apply New BFB Image

To apply firmware from the BFB image, BlueField must be restarted.

Option 1: Server Power Cycle

This is the most straightforward method. It ensures that all components in the BFB (NIC, Arm OS, ATF, UEFI, BMC, and CEC) are applied.

  1. (DPU Mode only) Perform a graceful shutdown of the BlueField Arm OS.

  2. Power cycle the server to complete the restart.

Option 2: Server Reboot and Automated BMC Restart

This method uses a one-time configuration to have the BMC and CEC firmware applied automatically during the server reboot cycle.

  1. Configure your bf.cfg file with BMC credentials and set the BMC_REBOOT and CEC_REBOOT flags.

    Note

    For DPU Mode, you must perform a graceful shutdown of the BlueField Arm OS before rebooting. Failure to do so will cause the Arm side to skip the restart, and only the NIC firmware will be applied.

  2. (DPU Mode Only) Perform a graceful shutdown of the BlueField Arm OS.

  3. (DPU Mode Only) Wait for the shutdown to complete.

  4. Reboot the server.

    Info

    During this reboot, the system uses the bf.cfg settings to automatically restart the BMC. All firmware components will be applied.

Option 3: Server Reboot and Manual BMC Restart

This method applies the main firmware (NIC, Arm OS, etc.) during the server reboot, but requires a second, manual step to apply the BMC/CEC firmware.

Note

For DPU Mode, you must perform a graceful shutdown of the BlueField Arm OS before rebooting. Failure to do so causes the Arm side to skip the restart, so only the NIC firmware is applied.

  1. (DPU Mode only) Perform a graceful shutdown of the BlueField Arm OS.

  2. (DPU Mode only) Wait for the shutdown to complete.

  3. Reboot the server.

    Info

    At this point, only the NIC, ATF, UEFI, and BlueField Arm OS firmware are applied.

  4. Once the server is back up, log into the BlueField BMC (e.g., via Redfish) and issue a restart.

    Info

    The BMC and CEC firmware are now applied.

NVIDIA BlueField-3 supports a Deferred Update Flow, which enables administrators to update firmware and DOCA components without immediate service interruption. This capability allows a DPU or SuperNIC to continue servicing workloads while a new firmware bundle and user-space/kernel DOCA components are staged in the background.

The new versions become active only after a reset is applied, minimizing downtime in production environments.

Prerequisites

  1. Download the appropriate SKU-specific fw-bundle-*.bfb for your DPU from the DOCA Downloads page.

  2. The currently installed firmware must be BSP 4.13.0/DOCA 3.2.0 or later.

  3. When operating in DPU mode, credentials for DPU-BMC must be specified in /etc/bf-upgrade.conf on the Arm OS following the same format as bf.cfg. For more details, refer to "Customizing BlueField Software Deployment".

  4. The server must boot in UEFI mode. The deferred update relies on support from the server UEFI. Ensure that Device Option ROM is enabled for the DPU PCIe device in the server UEFI setup.

  5. Make sure the following DPU UEFI ROM enablers are set to True (Enabled state):

    • On the host OS (assuming MFT is installed on host) for servers of ARM architecture:

      mlxconfig -d /dev/mst/<device> -y set EXP_ROM_UEFI_ARM_ENABLE=1

    • On the host OS (assuming MFT is installed on host), for servers of x86_64/Amd64 architecture:

      mlxconfig -d /dev/mst/<device> -y set EXP_ROM_UEFI_x86_ENABLE=1

    • On the BlueField Arm OS:

      mlxconfig -d /dev/mst/<device> -y set EXP_ROM_UEFI_ARM_ENABLE=1

      Info

      mlxconfig is provided by MFT installation. MFT is part of DOCA installation.

  6. (Optional) To enable a coordinated BlueField reboot with the host reboot on servers with UEFI boot mode, perform the following configuration from the BlueField Arm OS:

    mlxconfig -d /dev/mst/<device> set INT_CPU_AUTO_SHUTDOWN=1

    Note

    This must be configured in advance as it requires a BlueField System-level Reset to take effect.

  7. Make sure that RShim is running on the host.
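For prerequisite 5, the host-side parameter depends on the server architecture. The following is a minimal sketch that maps the output of uname -m to the parameter names listed above and prints the resulting command rather than running it; the function names are hypothetical, and <device> remains a placeholder to be filled in.

```shell
# Map a host architecture (as printed by `uname -m`) to the UEFI option ROM
# parameter named in the prerequisites.
uefi_rom_param() {
    case "$1" in
        aarch64|arm64) echo EXP_ROM_UEFI_ARM_ENABLE ;;
        x86_64|amd64)  echo EXP_ROM_UEFI_x86_ENABLE ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the mlxconfig command for the current host.
print_rom_enable_cmd() {
    param=$(uefi_rom_param "$(uname -m)") || return 1
    echo "mlxconfig -d /dev/mst/<device> -y set ${param}=1"
}
```

Printing the command first gives the administrator a chance to review it and substitute the real MST device before applying the change.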

Step 1: Deferred Firmware Update

Firmware updates are delivered using the bfb-install tool. Each BlueField SKU requires its own per-OPN firmware bundle BFB image, available on the NVIDIA DOCA Downloads page.

The BlueField-3 firmware image includes:

  • NIC firmware

  • ATF/UEFI

  • BMC firmware

  • CEC firmware

The installation uses the bfb-install utility:

# bfb-install --help
Usage: ./bfb-install [options]
Options:
  ...
  -r, --rshim <device>           Rshim device, format [<ip>:<port>:]rshim<N>.
  -d, --deferred                 Deferred activation (local rshim only: formerly runtime)

To install the bfb image in a deferred flow, run on the host side:

# bfb-install --deferred -r rshim0 -b <sku-fw-bundle*>.bfb

Example:

bfb-install --deferred -r rshim0 -b bf-fwbundle-3.2.0-94-900-9D3B6-00CV-AA0_25.10_prod.bfb

Note

In DPU mode, updating the BMC and CEC images requires providing the DPU BMC credentials in /etc/bf-upgrade.conf on the Arm OS, in the same format as bf.cfg.

Verify successful image update. The following is an output example for NIC Mode. DPU Mode has similar output for the update part.

Checking if local host has root access...
Convert bf-fwbundle-3.2.0_25.10-prod.bfb to flat format for deferred upgrade
INFO: Extracting BFB
INFO: Extracting BFB's initramfs
1969252 blocks
INFO: Extracting initramfs for repackaging
1969252 blocks
Found PSID MT_0000000884 OPN 900-9D3B6-00CV-A_Ax FW version 32.47.0402 in ./opt/mellanox/mlnx-fw-updater/firmware/mlxfwmanager_sriov_dis_aarch64_41692
Extracting NIC Firmware Binary for PSID MT_0000000884 OPN 900-9D3B6-00CV-A_Ax...
INFO: Rebuilding BFB in flat format
BFB: /tmp/tmp.RdMnWrJ8OM/bfb/bf-fwbundle-3.2.0/MT_0000000884/flat.bfb
Checking if rshim driver is running locally...
Pushing bfb + cfg
143MiB 0:00:16 [8.74MiB/s] [ <=> ]
Collecting BlueField booting status. Press Ctrl+C to stop…
INFO[PSC]: PSC BL1 START
INFO[BL2]: start
INFO[BL2]: boot mode (emmc)
INFO[BL2]: VDD_CPU: 851 mV
INFO[BL2]: VDDQ: 1120 mV
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle GA Secured
INFO[BL31]: runtime
INFO[BL31]: MB ping success
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: UPVS valid
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: UEFI Secure Boot (enabled)
INFO[UEFI]: Redfish enabled
INFO[UEFI]: DPU-BMC RF credentials found
INFO[UEFI]: exit Boot Service
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready
INFO[MISC]: Extracting BFB /tmp/bfb.MLm5D5/upgrade.bfb
INFO[MISC]: Found bf.cfg
INFO[MISC]: Detected Flat fwbundle BFB
INFO[MISC]: Staging BMC firmware
INFO[MISC]: Updating CEC firmware
INFO[MISC]: Updating DPU Golden Image
INFO[MISC]: Updating NIC firmware Golden Image
INFO[MISC]: NIC firmware update done: 32.47.0402. NIC Firmware reset or Host power cycle is required to activate the new NIC Firmware.
INFO[MISC]: Runtime upgrade finished

Note

More details of the update progress can be seen from the Arm console.

Note

Do not reset the device after firmware update. Continue to Step 2.


Step 2: Deferred Update of DOCA Components

Note

This step is only relevant when BlueField is operating in DPU mode.

For DEB-based Systems

  1. Export the desired repository:

    export DOCA_REPO="<URL>"

    Info
    • GA: https://linux.mellanox.com/public/repo/doca/latest/ubuntu22.04/dpu-arm64

    • Latest 2.9 LTS: https://linux.mellanox.com/public/repo/doca/latest-2.9-LTS/ubuntu22.04/dpu-arm64

    • Latest 2.5 LTS: https://linux.mellanox.com/public/repo/doca/latest-2.5-LTS/ubuntu22.04/dpu-arm64

  2. Add GPG key:

    curl $DOCA_REPO/GPG-KEY-Mellanox.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub

  3. Add DOCA repo:

    echo "deb [signed-by=/etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub] $DOCA_REPO ./" > /etc/apt/sources.list.d/doca.list

  4. Update the APT repository indexes:

    apt update

  5. Upgrade system packages:

    apt upgrade

For RPM-based Systems

  1. Export the desired repository:

    export DOCA_REPO="<URL>"

    Info
    • GA: https://linux.mellanox.com/public/repo/doca/latest/openeuler22.03sp3/dpu-arm64/

    • Latest 2.9 LTS: https://linux.mellanox.com/public/repo/doca/latest-2.9-LTS/openeuler22.03sp3/dpu-arm64/

    • Latest 2.5 LTS: https://linux.mellanox.com/public/repo/doca/latest-2.5-LTS/anolis8.6/dpu-arm64/

  2. Create repo file /etc/yum.repos.d/doca.repo:

    echo "[doca]
    name=DOCA Online Repo
    baseurl=$DOCA_REPO
    enabled=1
    gpgcheck=0
    priority=10
    cost=10" > /etc/yum.repos.d/doca.repo

  3. Update package index:

    yum makecache

  4. Upgrade system:

    1. Unlock kernel:

      dnf install -y 'dnf-command(versionlock)'
      dnf versionlock delete kernel*

    2. Upgrade all packages:

      dnf upgrade --nobest

    3. Pin kernel:

      dnf versionlock kernel*

    4. Update GRUB:

      sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT=0/' /etc/default/grub
      grub2-mkconfig -o /boot/efi/EFI/<OS_NAME>/grub.cfg

Step 3: Apply New Version

The following are different options for applying the new version:

Reset type: Cold Boot (AC/DC Power Cycle)

Mode of operation: DPU Mode, NIC Mode

Applying reset steps:

  1. (DPU Mode only) Gracefully shut down the BlueField Arm OS.

  2. Perform a full server power cycle.

Notes:

  • The firmware update is applied automatically during power-up.

  • DPU Mode only: The BlueField Arm OS must be manually shut down before the reboot; otherwise, the update will not apply.

Reset type: Standard Warm Reboot

Mode of operation: DPU Mode, NIC Mode

Applying reset steps:

  1. (DPU Mode only) Gracefully shut down the BlueField Arm OS.

  2. Perform a server warm reboot.

Notes:

  • Updates firmware and software after reboot.

  • DPU Mode only: The BlueField Arm OS must be manually shut down before the reboot; otherwise, the update will not apply.

Reset type: Coordinated Reset (Server + DPU)

Mode of operation: DPU Mode

Applying reset steps:

  1. When the administrator has completed all update flows, the DPU must be armed for the coordinated reset. Run the following command from the BlueField Arm OS. This sets a firmware trigger (MFRL[reset_trigger]=0x48) that instructs the DPU and its DPU-BMC to automatically reset in sync with the next host server reboot. This coordinated reset is required to apply the new firmware and software versions.

    mlxreg -d /dev/mst/<device> -y --set "reset_trigger=0x48" --reg_name="MFRL"

  2. Perform a server warm reboot.

Notes:

  • Relevant to Deferred Update Flow only.

  • The next warm reboot will:

    • Gracefully shut down BlueField Arm cores

    • Reset the NIC, Arm Complex, and BMC

    • Boot from the new firmware image


The following steps can be taken after a new version is applied.

Updating NVConfig Params from Host

  1. (Optional) To reset the BlueField NIC firmware configuration (also known as NVConfig parameters) to factory default values, run the following from the BlueField Arm OS or from the host OS:

    # sudo mlxconfig -d /dev/mst/<MST device> -y reset

    Reset configuration for device /dev/mst/<MST device>? (y/n) [n] : y
    Applying... Done!
    -I- Please reboot machine to load new configurations.

    Note

    For now, ignore the tool's instruction to reboot.

    Note

    To learn what MST device the BlueField has on your setup, run:

    mst start
    mst status

    Example output taken on a multiple BlueField host:

    // The MST device corresponds with PCI Bus address.

    MST modules:
    ------------
        MST PCI module is not loaded
        MST PCI configuration module loaded

    MST devices:
    ------------
    /dev/mst/mt41692_pciconf0  - PCI configuration cycles access.
                                 domain:bus:dev.fn=0000:03:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                 Chip revision is: 01
    /dev/mst/mt41692_pciconf1  - PCI configuration cycles access.
                                 domain:bus:dev.fn=0000:83:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                 Chip revision is: 01
    /dev/mst/mt41686_pciconf0  - PCI configuration cycles access.
                                 domain:bus:dev.fn=0000:a3:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                 Chip revision is: 01

    The MST device IDs for the BlueField-2 and BlueField-3 devices in this example are /dev/mst/mt41686_pciconf0 and /dev/mst/mt41692_pciconf0 respectively.

  2. (Optional) Enable NVMe emulation. Run:

    sudo mlxconfig -d <MST device> -y s NVME_EMULATION_ENABLE=1

  3. Skip this step if your BlueField is Ethernet only. Please refer to section "Supported Platforms and Interoperability" under the Release Notes to learn your BlueField type.

    If you have an InfiniBand-and-Ethernet-capable BlueField, the default link type of the ports will be configured to IB. If you want to change the link type to Ethernet, please run the following configuration:

    sudo mlxconfig -d <MST device> -y s LINK_TYPE_P1=2 LINK_TYPE_P2=2

  4. Perform a BlueField system-level reset for the new settings to take effect.

Note

After modifying files on the BlueField, run the command sync to flush file system buffers to eMMC/SSD flash memory to avoid data loss during reboot or power cycle.


Default Network Interface Configuration

Network interfaces are configured using the netplan utility:

# cat /etc/netplan/50-cloud-init.yaml
# This file is generated from information provided by the datasource. Changes
# to it will not persist across an instance reboot. To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        tmfifo_net0:
            addresses:
            - 192.168.100.2/30
            dhcp4: false
            nameservers:
                addresses:
                - 192.168.100.1
            routes:
            -   metric: 1025
                to: 0.0.0.0/0
                via: 192.168.100.1
        oob_net0:
            dhcp4: true
    renderer: NetworkManager
    version: 2

# cat /etc/netplan/60-mlnx.yaml
network:
    ethernets:
        enp3s0f0s0:
            dhcp4: 'true'
        enp3s0f1s0:
            dhcp4: 'true'
    renderer: networkd
    version: 2

BlueField devices also have a link-local IPv6 (LLv6) address derived from the MAC address via the standard stack mechanism. For a default MAC, 00:1A:CA:FF:FF:01, the LLv6 address would be fe80::21a:caff:feff:ff01.
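The derivation can be reproduced by hand or in a script. The following is a minimal sketch, assuming the mechanism is the usual modified EUI-64 scheme (flip bit 0x02 of the first octet and insert ff:fe between the third and fourth octets), which reproduces the example address; the function name mac_to_llv6 is hypothetical.

```shell
# Derive an IPv6 link-local address from a MAC address.
# Assumption: modified EUI-64 derivation, which matches the example above.
mac_to_llv6() {
    # Lowercase the MAC and split it into six octets on ':'.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    old_ifs=$IFS
    IFS=:
    set -- $mac
    IFS=$old_ifs
    # Flip the universal/local bit of the first octet.
    first=$(printf '%02x' $(( 0x$1 ^ 0x02 )))
    # printf %x drops leading zeros in each 16-bit group.
    printf 'fe80::%x:%x:%x:%x\n' "0x${first}$2" "0x${3}ff" "0xfe$4" "0x$5$6"
}

mac_to_llv6 00:1A:CA:FF:FF:01   # prints fe80::21a:caff:feff:ff01
```

The sketch does not compress runs of zero groups, so for some MAC addresses the result is a valid but not shortest-form address.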

For multi-device support, the LLv6 address works with SSH for any number of BlueField devices in the same host by including the interface name in the SSH command:

host]# systemctl restart rshim
// wait 10 seconds
host]# ssh -6 ubuntu@fe80::21a:caff:feff:ff01%tmfifo_net<n>

Note

If tmfifo_net<n> on the host does not have an LLv6 address, restart the RShim driver:

systemctl restart rshim


Default Ports and OVS Configuration

The /sbin/mlnx_bf_configure script runs automatically when the ib_umad kernel module is loaded (see /etc/modprobe.d/mlnx-bf.conf) and performs the following configurations:

  1. Ports are configured with switchdev mode and software steering.

  2. RDMA device isolation in network namespace is enabled.

  3. Two scalable function (SF) interfaces are created (one per port) if BlueField is configured with Embedded CPU mode (default):

    # mlnx-sf -a show

    SF Index: pci/0000:03:00.0/229408
      Parent PCI dev: 0000:03:00.0
      Representor netdev: en3f0pf0sf0
      Function HWADDR: 02:a9:49:7e:34:29
      Function trust: off
      Function roce: true
      Function eswitch: NA
      Auxiliary device: mlx5_core.sf.2
        netdev: enp3s0f0s0
        RDMA dev: mlx5_2

    SF Index: pci/0000:03:00.1/294944
      Parent PCI dev: 0000:03:00.1
      Representor netdev: en3f1pf1sf0
      Function HWADDR: 02:53:8f:2c:8a:76
      Function trust: off
      Function roce: true
      Function eswitch: NA
      Auxiliary device: mlx5_core.sf.3
        netdev: enp3s0f1s0
        RDMA dev: mlx5_3

    The parameters for these SFs are defined in configuration file /etc/mellanox/mlnx-sf.conf.

    /sbin/mlnx-sf --action create --device 0000:03:00.0 --sfnum 0 --hwaddr 02:61:f6:21:32:8c
    /sbin/mlnx-sf --action create --device 0000:03:00.1 --sfnum 0 --hwaddr 02:30:13:6a:2d:2c

    Note

    To avoid repeating a MAC address in your network, the SF MAC address is set randomly upon BFB installation. You may choose to configure a different MAC address that better suits your network needs.

  4. Two OVS bridges are created:

    # ovs-vsctl show
    f08652a8-92bf-4000-ba0b-7996c772aff6
        Bridge ovsbr2
            Port ovsbr2
                Interface ovsbr2
                    type: internal
            Port p1
                Interface p1
            Port en3f1pf1sf0
                Interface en3f1pf1sf0
            Port pf1hpf
                Interface pf1hpf
        Bridge ovsbr1
            Port p0
                Interface p0
            Port pf0hpf
                Interface pf0hpf
            Port ovsbr1
                Interface ovsbr1
                    type: internal
            Port en3f0pf0sf0
                Interface en3f0pf0sf0
        ovs_version: "2.14.1"

    The parameters for these bridges are defined in configuration file /etc/mellanox/mlnx-ovs.conf:

    CREATE_OVS_BRIDGES="yes"
    OVS_BRIDGE1="ovsbr1"
    OVS_BRIDGE1_PORTS="p0 pf0hpf en3f0pf0sf0"
    OVS_BRIDGE2="ovsbr2"
    OVS_BRIDGE2_PORTS="p1 pf1hpf en3f1pf1sf0"
    OVS_HW_OFFLOAD="yes"
    OVS_START_TIMEOUT=30

    Note

    If failures occur in /sbin/mlnx_bf_configure or configuration changes happen (e.g., switching to separated host mode), OVS bridges are not created even if CREATE_OVS_BRIDGES="yes".

  5. OVS HW offload is configured.

DHCP Client Configuration

/etc/dhcp/dhclient.conf:
send vendor-class-identifier "NVIDIA/BF/DP";
interface "oob_net0" {
  send vendor-class-identifier "NVIDIA/BF/OOB";
}


Ubuntu Boot Time Optimizations

Several optimizations have been applied to the Ubuntu OS image to significantly reduce boot time.

This section details the configuration changes and their potential effects.

Network Service Timeout Reduction

Network services are often a major contributor to slow boot times.

To minimize delays, the timeout for network readiness checks was reduced to 5 seconds.

Configuration files:

# cat /etc/systemd/system/systemd-networkd-wait-online.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/nm-online -s -q --timeout=5

# cat /etc/systemd/system/NetworkManager-wait-online.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online --timeout=5

# cat /etc/systemd/system/networking.service.d/override.conf
[Service]
TimeoutStartSec=5
ExecStop=
ExecStop=/sbin/ifdown -a --read-environment --exclude=lo --force --ignore-errors

Note

These reduced timeouts may affect DHCP-based configurations. If a network interface fails to obtain an IP address, increase the timeout values in the first two configuration files.


Grub Configuration

The GRUB bootloader timeout has been shortened to 2 seconds to accelerate boot.

Configuration in /etc/default/grub:

GRUB_TIMEOUT=2
GRUB_TIMEOUT_STYLE=countdown

This displays a 2-second countdown before booting Ubuntu.

Note

With such a short timeout, the standard keys (Shift or Esc) cannot be used to enter the GRUB menu. Use the F4 key instead to access the menu.


System Services

  • Docker service – The docker.service is disabled by default in the Ubuntu OS image because it significantly increases boot time.

  • Fast reboot with kexec – The kexec utility enables faster system reboots by bypassing the hardware initialization phase. The image includes the helper script /usr/sbin/kexec_reboot for convenient execution. Example:

    # kexec_reboot

Ubuntu Dual Boot Support

BlueField may be installed with support for dual boot. That is, two identical images of the BlueField OS may be installed using BFB.

The following is a proposed partitioning layout for a 119.24GB SSD:

Device          Start      End        Sectors    Size   Type
/dev/nvme0n1p1  2048       104447     102400     50M    EFI System
/dev/nvme0n1p2  104448     114550086  114445639  54.6G  Linux filesystem
/dev/nvme0n1p3  114550087  114652486  102400     50M    EFI System
/dev/nvme0n1p4  114652487  229098125  114445639  54.6G  Linux filesystem
/dev/nvme0n1p5  229098126  250069645  20971520   10G    Linux filesystem

Where:

  • /dev/nvme0n1p1 – boot EFI partition for the first OS image

  • /dev/nvme0n1p2 – root FS partition for the first OS image

  • /dev/nvme0n1p3 – boot EFI partition for the second OS image

  • /dev/nvme0n1p4 – root FS partition for the second OS image

  • /dev/nvme0n1p5 – common partition for both OS images

Similarly, the following is a proposed partitioning layout for a 64GB eMMC:

Device          Start      End        Sectors   Size   Type
/dev/mmcblk0p1  2048       104447     102400    50M    EFI System
/dev/mmcblk0p2  104448     50660334   50555887  24.1G  Linux filesystem
/dev/mmcblk0p3  50660335   50762734   102400    50M    EFI System
/dev/mmcblk0p4  50762735   101318621  50555887  24.1G  Linux filesystem
/dev/mmcblk0p5  101318622  122290141  20971520  10G    Linux filesystem

Where:

  • /dev/mmcblk0p1 – boot EFI partition for the first OS image

  • /dev/mmcblk0p2 – root FS partition for the first OS image

  • /dev/mmcblk0p3 – boot EFI partition for the second OS image

  • /dev/mmcblk0p4 – root FS partition for the second OS image

  • /dev/mmcblk0p5 – common partition for both OS images

    Note

    The common partition can be used to store BFB files that will be used for OS image update on the non-active OS partition.

Installing Ubuntu OS Image Using Dual Boot

Note

For the software upgrade procedure, refer to section "Upgrading Ubuntu OS Image Using Dual Boot".

Add the following parameter to the bf.cfg configuration file (see section "bf.cfg Parameters" for more information):

DUAL_BOOT=yes

If the eMMC size is ≤16GB, dual boot support is disabled by default, but it can be forced by setting the following parameter in bf.cfg:

FORCE_DUAL_BOOT=yes

To modify the default size of the /common partition, add the following parameter:

COMMON_SIZE_SECTORS=<number-of-sectors>

The number of sectors is the size in bytes divided by the block size (512). For example, for 10GB, set COMMON_SIZE_SECTORS=$((10*2**30/512)), which evaluates to 20971520.

After the /common partition is sized, the remaining space is divided equally between the two OS images.
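The sizing arithmetic can be checked against the SSD layout shown earlier. The sketch below is an illustration under stated assumptions, not the installer's actual code: gib_to_sectors and root_sectors are hypothetical helpers, and the first and last usable sectors (2048 and 250069645) and the 102400-sector EFI partitions are taken from that example table.

```shell
#!/bin/sh
# Sketch: sector arithmetic for a dual-boot layout (512-byte sectors).
# Both helpers are hypothetical; they only mirror the documented rule.

# Convert a size in GiB to 512-byte sectors.
gib_to_sectors() {
    echo $(( $1 * 1024 * 1024 * 1024 / 512 ))
}

# Root FS size per image: usable span minus the two EFI partitions and
# the common partition, divided equally between the two OS images.
# Arguments: first_sector last_sector efi_sectors common_sectors
root_sectors() {
    echo $(( ($2 - $1 + 1 - 2 * $3 - $4) / 2 ))
}

common=$(gib_to_sectors 10)
echo "COMMON_SIZE_SECTORS=$common"
echo "root FS sectors per image: $(root_sectors 2048 250069645 102400 "$common")"
```

With these inputs the helpers give 20971520 sectors for the common partition and 114445639 sectors for each root FS partition, which matches the SSD table.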

Install the BFB using this bf.cfg:

# bfb-install --bfb <BFB> --config bf.cfg --rshim rshim0

This installs the Ubuntu OS image twice on the BlueField.

Note

For a comprehensive list of the supported parameters to customize bf.cfg during BFB installation, refer to section "bf.cfg Parameters".

Upgrading Ubuntu OS Image Using Dual Boot

  1. Download the new BFB to the /common partition on the BlueField, then use the bfb_tool.py script to install it on the inactive BlueField partition:

    /opt/mellanox/mlnx_snap/exec_files/bfb_tool.py --op fw_activate_bfb --bfb <BFB>

  2. Reset BlueField to load the new OS image:

    /sbin/shutdown -r 0

    BlueField should now boot into the new OS image.

Use the efibootmgr utility to manage the boot order if necessary.

  • Change the boot order with:

    # efibootmgr -o <comma-separated-boot-entries>

    Note

    Modifying the boot order with efibootmgr -o does not remove unused boot options. For example, changing the boot order from 0001,0002,0003 to just 0001 does not remove 0002 and 0003. They must be removed explicitly using efibootmgr -B.

  • Remove stale boot entries with:

    # efibootmgr -b <E> -B

    Where <E> is the last character of the boot entry (i.e., Boot000<E>). You can find that by running:

    # efibootmgr
    BootCurrent: 0040
    Timeout: 3 seconds
    BootOrder: 0040,0000,0001,0002,0003
    Boot0000* NET-NIC_P0-IPV4
    Boot0001* NET-NIC_P0-IPV6
    Boot0002* NET-NIC_P1-IPV4
    Boot0003* NET-NIC_P1-IPV6
    Boot0040* focal0 ....2
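Because reordering leaves unused entries in place, it can help to list the entries that exist but are absent from BootOrder before removing any of them. The following is a minimal sketch: unlisted_boot_entries is a hypothetical helper that parses efibootmgr output in the format shown above, and the sample input is made up for illustration.

```shell
#!/bin/sh
# Sketch: print boot entry numbers that are defined but not in BootOrder.
# unlisted_boot_entries is a hypothetical helper; it reads efibootmgr
# output on stdin and does not modify any boot variable.
unlisted_boot_entries() {
    awk '
        /^BootOrder:/ {
            # Remember every entry number named in the boot order.
            n = split($2, order, ",")
            for (i = 1; i <= n; i++) listed[order[i]] = 1
            next
        }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            # Collect the 4-digit number of each defined entry.
            ids[++count] = substr($1, 5, 4)
        }
        END {
            for (i = 1; i <= count; i++)
                if (!(ids[i] in listed)) print ids[i]
        }
    '
}

# Illustrative input: only 0040 remains in the boot order.
unlisted_boot_entries <<'EOF'
BootCurrent: 0040
BootOrder: 0040
Boot0002* NET-NIC_P1-IPV4
Boot0003* NET-NIC_P1-IPV6
Boot0040* focal0
EOF
```

On a live system the input would come from efibootmgr piped into unlisted_boot_entries. Each reported entry could then be removed with efibootmgr -b <number> -B, after confirming that it is no longer needed.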

© Copyright 2025, NVIDIA. Last updated on Nov 20, 2025