SoC Platform
This page provides information to troubleshoot problems related to installing and operating a BlueField networking platform (DPU/SuperNIC).
Command |
Description |
|
Create |
|
Obtain the software versions by executing the
|
|
Gather the target device information. Execute
|
|
Inspect the content of the boot image, identify the Root-of-Trust Public Key (ROTPK) and verify Chain-of-Trust (CoT) certificates.
Execute |
|
Display the RShim log from the BlueField Arm's Linux. If a string is provided, it is added to the RShim log. |
|
Gather the target device information. Execute
|
|
Gather the firmware version by executing |
|
Displays the content of the BFB and helps check the integrity of a BFB file. |
Unable to Load BL2, BL2R, or PSC Image
The following errors appear in the console if the images are corrupted or not signed properly:
Device |
Error |
BlueField |
|
BlueField-2 |
|
BlueField-3 |
|
.bfb File Cannot Recognize the BlueField Board Type
The BlueField reverts to low core operation and prints the following messages:
***System type can't be determined***
***Booting as a minimal system***
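When a console capture is available, the two messages above can be searched for directly. The following is a minimal sketch, not a vendor tool: the function name `minimal_boot_detected` and the file name `console.log` are hypothetical, and it only assumes the console output has been saved as plain text.

```shell
# Hypothetical helper: succeed if console output on stdin contains either of
# the messages printed when the board type cannot be determined.
minimal_boot_detected() {
    grep -q -e "System type can't be determined" \
            -e "Booting as a minimal system"
}

# Example (console.log is a placeholder for a saved console capture):
#   minimal_boot_detected < console.log && echo "board type was not recognized"
```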
Ubuntu Kernel Debug
To install kernel debug symbols, run the following:
# sudo add-apt-repository ppa:canonical-kernel-bluefield/release
# echo "deb https://ppa.launchpadcontent.net/canonical-kernel-bluefield/release/ubuntu/ jammy main/debug" | sudo tee -a /etc/apt/sources.list.d/canonical-kernel-bluefield-ubuntu-release-jammy.list
# sudo apt update
# sudo apt install linux-image-$(uname -r)-dbgsym
For example:
# sudo apt-cache policy linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
# sudo apt install linux-image-unsigned-5.15.0-1043-bluefield-dbgsym
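The commands above use two package naming patterns (`linux-image-<release>-dbgsym` and `linux-image-unsigned-<release>-dbgsym`). As a small sketch, the helper below prints both candidate names for a kernel release so that each can be checked with `apt-cache policy` before installing. The function name `dbgsym_candidates` is hypothetical, and which of the two names is actually published for a given kernel is not something this sketch determines.

```shell
# Hypothetical helper: print candidate debug-symbol package names for a
# kernel release (defaults to the running kernel reported by uname -r).
# Both patterns are copied from the commands in this section.
dbgsym_candidates() {
    printf 'linux-image-%s-dbgsym\n' "${1:-$(uname -r)}"
    printf 'linux-image-unsigned-%s-dbgsym\n' "${1:-$(uname -r)}"
}

# Example (run on the BlueField after adding the debug repository):
#   for pkg in $(dbgsym_candidates); do apt-cache policy "$pkg"; done
```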
Host is Not Available and BlueField is Stuck in Boot
If there is no out-of-band (OOB) access to the BlueField, or the BlueField is stuck in boot, a reset can be sent from the BMC to the BlueField.
BlueField Target is Stuck in UEFI Menu
Upgrade to the latest stable boot partition images. See "How to upgrade the boot partition (ATF & UEFI) without re-installation".
Unable to Burn FW from Host Server
Verify that you are not running in isolated mode. Run:
$ sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q
Current device configurations:
------------------------------
level : PRIVILEGED
...
By default, BlueField operates in privileged mode. Refer to "NVIDIA BlueField Modes of Operation" for more information.
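For scripted checks, the `level` field can be pulled out of the query output shown above. This is a sketch that assumes the output keeps the `level : VALUE` layout from the example; the function name `privhost_level` is hypothetical.

```shell
# Hypothetical helper: read `mlxprivhost ... q` output on stdin and print the
# value of the "level" field (PRIVILEGED in the example above).
privhost_level() {
    awk -F':' '/^[ \t]*level[ \t]*:/ {
        gsub(/[ \t\r]/, "", $2)   # strip padding around the value
        print $2
        exit
    }'
}

# Example (device path copied from the command above):
#   level="$(sudo mlxprivhost -d /dev/mst/mt41686_pciconf0 q | privhost_level)"
#   [ "$level" = "PRIVILEGED" ] || echo "host is not privileged: $level"
```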
Server Unable to Find the BlueField
Ensure that the BlueField is placed correctly
Make sure the BlueField slot and the BlueField are compatible
Install the BlueField in a different PCIe slot
Use the drivers that came with the BlueField or download the latest
Make sure your motherboard has the latest BIOS
Perform a graceful shutdown then power cycle the server
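Before reseating hardware, it can help to confirm whether the host enumerates the device on the PCIe bus at all. The sketch below filters `lspci` numeric output for the Mellanox/NVIDIA networking PCI vendor ID `15b3`; the function name `mellanox_devices` is hypothetical, and an empty result only shows that no device with that vendor ID was enumerated, not why.

```shell
# Hypothetical helper: keep only lines of `lspci -n` or `lspci -nn` output
# (read on stdin) that mention PCI vendor ID 15b3.
# Returns non-zero when no such device is listed.
mellanox_devices() {
    grep -iE '(^|[^0-9a-f])15b3:[0-9a-f]{4}'
}

# Example (run on the host server):
#   lspci -nn | mellanox_devices || echo "no 15b3 device enumerated"
# lspci can also filter by vendor itself:
#   lspci -d 15b3:
```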
BlueField No Longer Works
Reseat the BlueField in its slot or a different slot, if necessary
Try using another cable
Reinstall the drivers, as the network driver files may be damaged or deleted
Perform a graceful shutdown then power cycle the server
BlueField Stopped Working After Installing Another BFB
Try removing and reinstalling all BlueField devices
Check that cables are connected properly
Make sure your motherboard has the latest BIOS
Link Indicator Light is Off
Try another port on the switch
Make sure the cable is securely attached
Check that you are using the proper cables and that they do not exceed the recommended lengths
Verify that your switch and BlueField port are compatible
Link Light is On but No Communication is Established
Check that the latest driver is loaded
Check that both the BlueField and its link are set to the same speed and duplex settings
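The speed, duplex, and link-state checks above can be read from `ethtool` on a Linux host. The following sketch summarizes the `Speed`, `Duplex`, and `Link detected` fields of `ethtool <interface>` output; the function name `link_summary` and the variable `IFACE` are hypothetical, and the interface name has to be supplied for the system at hand.

```shell
# Hypothetical helper: read `ethtool <interface>` output on stdin and print
# the speed, duplex and link state on one line for easy comparison with the
# settings on the link partner.
link_summary() {
    awk -F': *' '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}

# Example (IFACE is a placeholder for the BlueField port's interface name):
#   ethtool "$IFACE" | link_summary
```

Running the same summary on both ends of the link makes a speed or duplex mismatch easy to spot.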