NVIDIA BlueField Platform Software Troubleshooting Guide

SoC Platform

This page provides information to troubleshoot problems related to installing and operating a BlueField networking platform (DPU/SuperNIC).

Command

Description

bfcfg

Create /etc/bf.cfg and run bfcfg to apply a configuration change from BlueField Arm Linux, or run bfcfg -d to dump the current configuration.

bfver

Obtain the software versions by executing the bfver command from the BlueField Arm console.

bfver can also be executed from the host server against a standalone BFB file.

bfsbdump

Gather the target device information. Execute bfsbdump from the BlueField Arm console.

bfsbverify

Inspect the content of the boot image, identify the Root-of-Trust Public Key (ROTPK) and verify Chain-of-Trust (CoT) certificates.

Execute bfsbverify either from the host server or from the BlueField Arm console.

bfrshlog

Display the RShim log from BlueField Arm Linux. If a string is provided, it is added to the RShim log.

mlxbf-bootctl

Gather the target device information. Execute mlxbf-bootctl from the BlueField Arm console.

fw version

Gather the firmware version by executing flint -d <dev> q full from either the Arm or the host.

mlx-mkbfb

Displays the content of a BFB file. It also helps check the integrity of a BFB file.
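When opening a support case, it can help to capture the output of several of these tools in one pass. The following is a minimal sketch, not an official utility: the helper names run_if_present and collect_bf_info are made up for illustration, and it assumes the tools are run without extra arguments from the BlueField Arm console, as described in the table above.

```shell
#!/bin/sh
# Run a diagnostic command if it is installed; otherwise note that it was skipped.
# (run_if_present is a hypothetical helper name, not part of the BlueField software.)
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" 2>&1
    else
        echo "== $1: not found, skipped"
    fi
}

# Collect output from the tools listed in the table above.
# Intended to be run on the BlueField Arm console.
collect_bf_info() {
    run_if_present bfver
    run_if_present bfcfg -d
    run_if_present bfsbdump
    run_if_present mlxbf-bootctl
    run_if_present bfrshlog
}
```

For example, collect_bf_info > bf-info.txt writes everything to a single file; tools that are not installed are listed as skipped rather than aborting the run.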

Unable to Load BL2, BL2R, or PSC Image

The following errors appear in console if images are corrupted or not signed properly:

Device

Error

BlueField

ERROR: Failed to load BL2 firmware

BlueField-2

ERROR: Failed to load BL2R firmware

BlueField-3

Failed to load PSC-BL1 or PSC VERIFY_BCT timeout
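To check a saved console log for these messages, a small filter along the following lines may be useful. This is only a sketch: classify_boot_error is a hypothetical name, and it matches nothing beyond the error strings listed in the table above.

```shell
#!/bin/sh
# Report which image-load error from the table above appears in a console log.
# Reads the log file given as an argument, or stdin if no argument is given.
# (classify_boot_error is a hypothetical helper name.)
classify_boot_error() {
    awk '
        /Failed to load BL2R firmware/ {
            print "BlueField-2: BL2R image failed to load"; found = 1; next
        }
        /Failed to load BL2 firmware/ {
            print "BlueField: BL2 image failed to load"; found = 1; next
        }
        /Failed to load PSC-BL1/ || /PSC VERIFY_BCT timeout/ {
            print "BlueField-3: PSC image failed to load or verify"; found = 1; next
        }
        END { if (!found) print "no known image-load error found" }
    ' "$@"
}
```

For example, classify_boot_error console.log prints one line per matching message, where console.log stands for whatever file the console output was captured to.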


.bfb File Cannot Recognize the BlueField Board Type

The BlueField reverts to low core operation and prints the following messages:


***System type can't be determined***
***Booting as a minimal system**


Ubuntu Kernel Debug

To install kernel debug symbols, run the following:


# sudo add-apt-repository ppa:canonical-kernel-bluefield/release
# echo "deb https://ppa.launchpadcontent.net/canonical-kernel-bluefield/release/ubuntu/ jammy main/debug" | sudo tee -a /etc/apt/sources.list.d/canonical-kernel-bluefield-ubuntu-release-jammy.list
# sudo apt update
# sudo apt install linux-image-$(uname -r)-dbgsym

E.g.:

# sudo apt-cache policy linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
# sudo apt install linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
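The generic command and the example above use slightly different package names (linux-image-... versus linux-image-unsigned-...). As a hedged sketch, the helper below prints both candidate names for a kernel release and picks the first one that apt knows about. The function names are hypothetical, and it assumes the debug repository has already been added and apt update has been run.

```shell
#!/bin/sh
# Print candidate debug-symbol package names for a kernel release
# (defaults to the running kernel). Hypothetical helper name.
dbgsym_candidates() {
    release="${1:-$(uname -r)}"
    echo "linux-image-${release}-dbgsym"
    echo "linux-image-unsigned-${release}-dbgsym"
}

# Print the first candidate that apt-cache can resolve; return 1 if none can.
find_dbgsym_package() {
    for pkg in $(dbgsym_candidates "${1:-}"); do
        if apt-cache show "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
            return 0
        fi
    done
    return 1
}
```

For example, sudo apt install "$(find_dbgsym_package)" installs whichever variant is available for the running kernel.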


Host is Not Available and BlueField is Stuck in Boot

In situations where there is no out-of-band (OOB) access to the BlueField, or the BlueField is stuck in boot, a reset can be sent from the BMC to the BlueField.

BlueField Target is Stuck in UEFI Menu

Upgrade to the latest stable boot partition images. See "How to upgrade the boot partition (ATF & UEFI) without re-installation".

Unable to Burn FW from Host Server

Verify that you are not running in isolated mode. Run:


$ sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q
Current device configurations:
------------------------------
level : PRIVILEGED
...

By default, BlueField operates in privileged mode. Refer to "NVIDIA BlueField Modes of Operation" for more information.
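For scripted checks, the level field can be pulled out of the query output. The sketch below assumes the output has the "level : VALUE" layout shown in the sample above; privhost_level is a hypothetical helper name, and the device path is the one from the example and will differ between systems.

```shell
#!/bin/sh
# Extract the value of the "level" field from `mlxprivhost -d <dev> q` output on stdin.
# (privhost_level is a hypothetical helper name.)
privhost_level() {
    awk -F: '
        $1 ~ /^[ \t]*level[ \t]*$/ {
            gsub(/[ \t\r]/, "", $2)   # strip padding around the value
            print $2
            exit
        }
    '
}
```

For example, sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q | privhost_level prints PRIVILEGED when the host is in the default privileged mode.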

Server Unable to Find the BlueField

  • Ensure that the BlueField is placed correctly

  • Make sure the BlueField slot and the BlueField are compatible

  • Install the BlueField in a different PCIe slot

  • Use the drivers that came with the BlueField or download the latest

  • Make sure your motherboard has the latest BIOS

  • Perform a graceful shutdown then power cycle the server
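After working through the checks above, one way to confirm whether the server enumerates the device is to look for the Mellanox/NVIDIA networking PCI vendor ID (15b3) in the lspci output. This is a minimal sketch; count_mellanox_functions is a hypothetical helper name, and it assumes lspci (pciutils) is installed on the host.

```shell
#!/bin/sh
# Count PCI functions carrying vendor ID 15b3 in `lspci -nn` output read from stdin.
# (count_mellanox_functions is a hypothetical helper name.)
count_mellanox_functions() {
    # grep -c exits non-zero when the count is 0; keep the function's status clean.
    grep -c '\[15b3:' || true
}
```

For example, lspci -nn | count_mellanox_functions prints 0 if the host does not see any such device on the PCIe bus.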

BlueField No Longer Works

  • Reseat the BlueField in its slot or a different slot, if necessary

  • Try using another cable

  • Reinstall the drivers, as the network driver files may be damaged or deleted

  • Perform a graceful shutdown then power cycle the server

BlueField Stopped Working After Installing Another BFB

  • Try removing and reinstalling all BlueField devices

  • Check that cables are connected properly

  • Make sure your motherboard has the latest BIOS

Link Indicator Light is Off

  • Try another port on the switch

  • Make sure the cable is securely attached

  • Check that you are using the proper cables and that they do not exceed the recommended lengths

  • Verify that your switch and BlueField port are compatible

Link Light is On but No Communication is Established

  • Check that the latest driver is loaded

  • Check that both the BlueField and its link are set to the same speed and duplex settings
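On Linux, the speed and duplex settings mentioned above can be read with ethtool. The following sketch parses the Speed and Duplex lines from ethtool output; link_speed_duplex is a hypothetical helper name, and the interface name in the usage example is a placeholder that will differ per system.

```shell
#!/bin/sh
# Print "<speed> <duplex>" parsed from `ethtool <interface>` output read from stdin.
# Fields that are missing from the output are reported as "unknown".
# (link_speed_duplex is a hypothetical helper name.)
link_speed_duplex() {
    awk -F': *' '
        $1 ~ /^[ \t]*Speed$/  { speed = $2 }
        $1 ~ /^[ \t]*Duplex$/ { duplex = $2 }
        END {
            if (speed == "")  speed = "unknown"
            if (duplex == "") duplex = "unknown"
            print speed, duplex
        }
    '
}
```

For example, ethtool eth0 | link_speed_duplex prints the current settings for the interface; compare the result against the configuration of the switch port at the other end of the link.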

© Copyright 2024, NVIDIA. Last updated on Nov 12, 2024.