NVIDIA BlueField Platform Software Troubleshooting Guide

SoC Platform

This page provides information to troubleshoot problems related to installing and operating a BlueField networking platform (DPU/SuperNIC).

Command

Description

bfcfg

Create /etc/bf.cfg and run bfcfg to apply a configuration change from the BlueField Arm Linux, or run bfcfg -d to dump the current configuration.

bfver

Obtain the software versions by executing the bfver command from the BlueField Arm console.

bfver can also be executed from the host server against a standalone BFB file.

bfsbdump

Gather the target device information. Execute bfsbdump from the BlueField Arm console.

bfsbverify

Inspect the content of the boot image, identify the Root-of-Trust Public Key (ROTPK) and verify Chain-of-Trust (CoT) certificates.

Execute bfsbverify either from the host server or from the BlueField Arm console.

bfrshlog

Display the RShim log from the BlueField Arm Linux. If a string is provided, it is added to the RShim log.

mlxbf-bootctl

Gather the target device information. Execute mlxbf-bootctl from the BlueField Arm console.

fw version

Gather the firmware version by executing flint -d <dev> q full from either the Arm or the host.

mlx-mkbfb

Display the content of a BFB file. It also helps check the integrity of a BFB file.
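
Several of the tools above are run without arguments to gather device information. As a minimal sketch, the helper below runs each named tool if it is installed and saves its output to a file. The function name collect_bf_info and the output directory are hypothetical choices for illustration and are not part of the BlueField software.

```shell
# Run each named diagnostic tool (if installed) and save its output.
# Usage: collect_bf_info <output-dir> <tool> [<tool> ...]
collect_bf_info() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            # Capture stdout and stderr; a failing tool does not abort the run.
            "$tool" >"$outdir/$tool.txt" 2>&1 ||
                echo "exit status: $?" >>"$outdir/$tool.txt"
        else
            echo "$tool: not found on this system" >"$outdir/$tool.txt"
        fi
    done
}

# Example, from the BlueField Arm console (the directory is an arbitrary choice):
# collect_bf_info /tmp/bf-info bfver bfsbdump mlxbf-bootctl
```

Tools that are missing are recorded rather than treated as fatal, so the same call can be used on systems where only some of the utilities are installed.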

Unable to Load BL2, BL2R, or PSC Image

The following errors appear in the console if the images are corrupted or not signed properly:

Device

Error

BlueField

ERROR: Failed to load BL2 firmware

BlueField-2

ERROR: Failed to load BL2R firmware

BlueField-3

Failed to load PSC-BL1 or PSC VERIFY_BCT timeout
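
If the console output has been captured to a file, the messages in the table can be searched for in one pass. The sketch below assumes the BlueField-3 entry describes two alternative messages ("Failed to load PSC-BL1" and "PSC VERIFY_BCT timeout"). The function name scan_boot_errors and the log file name are hypothetical.

```shell
# Print any of the early-boot failure messages listed above that appear
# in a saved console log. Returns 0 if at least one line matched, 1 otherwise.
# Usage: scan_boot_errors <console-log-file>
scan_boot_errors() {
    grep -E 'Failed to load (BL2|BL2R) firmware|Failed to load PSC-BL1|PSC VERIFY_BCT timeout' "$1"
}

# Example (console.log is a placeholder for your own capture):
# scan_boot_errors console.log || echo "no known boot-image errors found"
```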


.bfb File Cannot Recognize the BlueField Board Type

The BlueField reverts to low core operation and prints the following messages:

***System type can't be determined***
***Booting as a minimal system**


Ubuntu Kernel Debug

To install kernel debug symbols, run the following:

# sudo add-apt-repository ppa:canonical-kernel-bluefield/release
# echo "deb https://ppa.launchpadcontent.net/canonical-kernel-bluefield/release/ubuntu/ jammy main/debug" | sudo tee -a /etc/apt/sources.list.d/canonical-kernel-bluefield-ubuntu-release-jammy.list
# sudo apt update
# sudo apt install linux-image-$(uname -r)-dbgsym

For example:

# sudo apt-cache policy linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
# sudo apt install linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
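
The generic command uses the package name linux-image-$(uname -r)-dbgsym, while the example uses the linux-image-unsigned- prefix. As a small sketch, the helper below prints both candidate names for a given kernel release so that each can be checked with apt-cache policy before installing. The function name dbgsym_candidates is hypothetical.

```shell
# Print candidate debug-symbol package names for a kernel release.
# Defaults to the running kernel when no release is given.
# Usage: dbgsym_candidates [kernel-release]
dbgsym_candidates() {
    release=${1:-$(uname -r)}
    printf 'linux-image-%s-dbgsym\n' "$release"
    printf 'linux-image-unsigned-%s-dbgsym\n' "$release"
}

# Example: show what apt knows about each candidate for the running kernel.
# for pkg in $(dbgsym_candidates); do apt-cache policy "$pkg"; done
```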


Host is Not Available and BlueField is Stuck in Boot

In situations where there is no out-of-band (OOB) access to the BlueField, or the BlueField is stuck in boot, a reset can be sent from the BMC to the BlueField.

BlueField Target is Stuck in UEFI Menu

Upgrade to the latest stable boot partition images. See "How to upgrade the boot partition (ATF & UEFI) without re-installation".

Unable to Burn FW from Host Server

Please verify that you are not running in isolated mode. Run:

$ sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q
Current device configurations:
------------------------------
level : PRIVILEGED
...

By default, BlueField operates in privileged mode. Please refer to "NVIDIA BlueField Modes of Operation" for more information.
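
For scripted checks, the level field can be pulled out of the query output shown above. This is a minimal sketch that assumes the output keeps the "level : VALUE" layout. The function name privhost_level is hypothetical.

```shell
# Extract the value of the "level" field from `mlxprivhost ... q` output
# read on stdin, with surrounding whitespace removed.
privhost_level() {
    awk -F: '/^[ \t]*level[ \t]*:/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)
        print $2
        exit
    }'
}

# Example (device path taken from the query above):
# sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q | privhost_level
```

A result other than PRIVILEGED would be the cue to review the mode of operation before retrying the firmware burn.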

Server Unable to Find the BlueField

  • Ensure that the BlueField is placed correctly

  • Make sure the BlueField slot and the BlueField are compatible

  • Install the BlueField in a different PCIe slot

  • Use the drivers that came with the BlueField or download the latest

  • Make sure your motherboard has the latest BIOS

  • Perform a graceful shutdown then power cycle the server
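
One way to confirm whether the server enumerates the device at all is to look for PCI vendor ID 15b3 (Mellanox Technologies) in the lspci listing. The sketch below filters `lspci -nn` output read on stdin. The function name filter_vendor_15b3 is hypothetical.

```shell
# Keep only lspci lines whose [vendor:device] ID pair has vendor ID 15b3.
# Expects `lspci -nn` output on stdin; returns non-zero if nothing matched.
filter_vendor_15b3() {
    grep -i '\[15b3:[0-9a-f]\{4\}\]'
}

# Example:
# lspci -nn | filter_vendor_15b3 || echo "no 15b3 devices enumerated"
```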

BlueField No Longer Works

  • Reseat the BlueField in its slot or a different slot, if necessary

  • Try using another cable

  • Reinstall the drivers, as the network driver files may be damaged or deleted

  • Perform a graceful shutdown then power cycle the server

BlueField Stopped Working After Installing Another BFB

  • Try removing and reinstalling all BlueField devices

  • Check that cables are connected properly

  • Make sure your motherboard has the latest BIOS

Link Indicator Light is Off

  • Try another port on the switch

  • Make sure the cable is securely attached

  • Check that you are using the proper cables and that they do not exceed the recommended lengths

  • Verify that your switch and BlueField port are compatible

Link Light is On but No Communication is Established

  • Check that the latest driver is loaded

  • Check that both the BlueField and its link are set to the same speed and duplex settings
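
On Linux, the speed and duplex settings of a port can be read with ethtool. As a rough sketch, the helper below reduces `ethtool <interface>` output read on stdin to its Speed and Duplex fields so the two ends of the link can be compared. The function name link_settings is hypothetical and the interface name is a placeholder.

```shell
# Print the Speed and Duplex fields from `ethtool <interface>` output on stdin,
# one per line, in NAME=VALUE form.
link_settings() {
    awk -F': *' '$1 ~ /^[ \t]*(Speed|Duplex)$/ {
        sub(/^[ \t]+/, "", $1)
        print $1 "=" $2
    }'
}

# Example (replace <interface> with the BlueField network interface name):
# ethtool <interface> | link_settings
```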

DPU is Stuck at Runtime

If the BlueField DPU appears unresponsive or stuck during runtime, follow these steps to collect debug information and trigger diagnostics.

  1. Check logs for runtime exceptions.

    • Review the RShim log on the host:


      sudo journalctl -u rshim

    • Inspect the UART console log for crash traces, kernel panics, or exceptions emitted by the DPU.

  2. Use SysRq to trigger diagnostic actions. If the DPU appears frozen, you can use the SysRq interface to trigger diagnostic outputs or a controlled crash for debugging.

    • Enable SysRq functionality.

      • Ensure the host enables SysRq support:


        echo 1 > /proc/sys/kernel/sysrq

      • (Optional) Enable console output:


        echo 7 > /proc/sys/kernel/printk

        Info

        You may also use a specific bitmask value in /proc/sys/kernel/sysrq to enable selective SysRq actions.

    • You can send SysRq commands to the DPU via the RShim console device.

    • Example for triggering a system crash (or crashdump, if configured):


      printf "\x0fc" > /dev/rshim0/console

      Replace c with other SysRq characters as needed (e.g., t for thread dump, m for memory info). The format is always:


      printf "\x0f<char>" > /dev/rshimX/console

  3. Dump DPU dmesg output via RShim (for BlueField-3 in DPU mode; kernel < 6.0). If the DPU is running Linux and supports rshim dump mode, you can extract the dmesg buffer using:


    rshim -c -i 0 --bfdump

    This retrieves a debug log from the DPU through rshim0.

  4. RShim command-line options. Use rshim -c --help for full command-line functionality:

    $ rshim -c --help
    Usage: rshim [options]

    OPTIONS:
      -c, --cmdmode                       Run in command line mode
      -g, --get-debug                     Get debug code
      -m, --bfdump                        Dump BlueField dmesg log
      -r, --reg <addr.[32|64] [value]>    Read/write register
      -s, --set-debug <0 | 1>             Enable or disable debug
      -i, --index <N>                     Use /dev/rshim<N>/
      -h, --help                          Show help information
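
The SysRq step above can be wrapped in a small helper so that the control byte and the command character are always sent together. This is a minimal sketch: the function name send_sysrq is hypothetical, and the default console path assumes the first RShim device (/dev/rshim0).

```shell
# Send one SysRq command character to a BlueField through its RShim console.
# Usage: send_sysrq <char> [console-device]
send_sysrq() {
    key=$1
    console=${2:-/dev/rshim0/console}
    case $key in
        ?) ;;   # exactly one character: accepted
        *) echo "usage: send_sysrq <char> [console-device]" >&2
           return 2 ;;
    esac
    # \017 is the octal form of the 0x0f byte used above. Octal escapes work
    # in every POSIX printf, whereas \x escapes are not supported by all shells.
    printf '\017%s' "$key" >"$console"
}

# Example: request a thread dump from the DPU behind /dev/rshim0.
# send_sysrq t
```

Writing to the console device typically requires root privileges, so the helper is usually run from a root shell.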

© Copyright 2025, NVIDIA. Last updated on Jul 17, 2025.