NVIDIA BlueField Platform Software Troubleshooting Guide

PCIe

This page offers troubleshooting information for PCIe.

Missing PCIe Device

The discovery and operation of PCIe devices involve multiple stages, and errors at any stage can render a device inaccessible to the operating system (OS). PCIe devices are organized in a tree-like hierarchy, with each node connected via PCIe links. For a device to be accessible, all links between the root port and the endpoint device must be successfully trained and active. While link training is managed by hardware, failure at this stage results in the unavailability of all downstream devices. Tools such as lspci can be used to verify the status of downstream links for root ports and switches.

In the following example, the LnkSta line shows the link is up: TrErr- signifies there were no training errors, and DLActive+ signifies the data link layer is active. Note that Speed 8GT/s (downgraded) means the link trained at a lower speed than the 32GT/s the port supports; the link is still functional, but with reduced bandwidth.


# lspci -vv -s 5:0.0
05:00.0 PCI bridge: Mellanox Technologies MT43244 Family [BlueField-3 SoC PCIe Bridge] (rev 01) (prog-if 00 [Normal decode])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin ? routed to IRQ 57
        IOMMU group: 5
        Bus: primary=05, secondary=06, subordinate=06, sec-latency=0
        I/O behind bridge: 00000000-00000fff [size=4K]
        Memory behind bridge: 00200000-003fffff [size=2M]
        Prefetchable memory behind bridge: 0000800005000000-00008000051fffff [size=2M]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [60] Express (v2) Downstream Port (Slot+), MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #1, Speed 32GT/s, Width x2, ASPM not supported
                        ClockPM- Surprise+ LLActRep+ BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (downgraded), Width x2 (ok)
                        TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-


Enumeration

The next stage is PCIe enumeration, a process by which software discovers all devices present in the PCIe fabric. This is accomplished by reading the first register of every possible device address to determine which devices respond. The first register contains the vendor ID and device ID, which uniquely identify the device. PCIe enumeration occurs twice during boot: once by UEFI and then again by Linux.

Devices detected during Linux PCIe enumeration are listed by lspci. If a device appears here, it indicates that the device is present in the system and has responded correctly to a configuration read. However, this does not guarantee the functionality of the device or its associated driver.
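To see what enumeration actually reads, the same vendor/device ID register can be queried by hand with setpci. A minimal sketch, assuming the NVMe endpoint at 06:00.0 shown later on this page:

# setpci -s 06:00.0 0x00.l
00011e0f

The low 16 bits of the first configuration dword are the vendor ID (here 0x1E0F, KIOXIA) and the high 16 bits are the device ID (0x0001). A configuration read to an address where no device responds returns all 1s (ffffffff).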

Resource Allocation

After enumeration, the OS performs PCIe resource allocation. If resource allocation fails, some devices may become unavailable to the OS. There are three types of PCIe resources:

  • I/O space

  • Bus numbers

  • Memory space

The BlueField platform does not support PCIe I/O space, leaving bus numbers and memory space as the primary considerations. The system supports 255 buses, making exhaustion of this resource unlikely. However, insufficient PCIe memory space can lead to errors, which are often logged in dmesg.


[    0.781698] pci 0000:21:00.0: BAR 6: no space for [mem size 0x00100000 pref]
[    0.781700] pci 0000:21:00.0: BAR 6: failed to assign [mem size 0x00100000 pref]
[    0.781703] pci 0000:21:00.1: BAR 6: no space for [mem size 0x00100000 pref]
[    0.781705] pci 0000:21:00.1: BAR 6: failed to assign [mem size 0x00100000 pref]

There are two types of PCIe memory space:

  • 32-bit memory space – BlueField-3 supports 2 GB, ranging from 0x7FFF_0000_0000 to 0x7FFF_7FFF_FFFF.

  • 64-bit memory space – BlueField-3 supports 128 TB, ranging from 0x8000_0000_0000 to 0xFFFF_FFFF_FFFF.

Despite the large capacity of 64-bit memory space, exhaustion is still possible. This can occur due to devices that support a limited number of address bits or alignment requirements, which may leave large chunks of address space unusable. If memory space allocation fails, reviewing /proc/iomem can be helpful. This file lists all available memory ranges and their allocation status for each device.
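For example, on a BlueField-3 the 32-bit window and the BARs carved out of it appear along these lines (abridged and illustrative; actual ranges depend on the installed devices):

# cat /proc/iomem
...
7fff00000000-7fff7fffffff : PCI Bus 0000:00
  7fff00200000-7fff00203fff : 0000:06:00.0
    7fff00200000-7fff00203fff : nvme
...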

Depending on its configuration, Linux may either retain the resource allocation performed by UEFI or reallocate resources itself. This behavior can be controlled using the kernel command-line options pci=realloc=on or pci=realloc=off.
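A minimal way to set this persistently, assuming a RHEL-family host where the grubby tool is available (on other distributions, edit the GRUB configuration directly):

# grubby --update-kernel=ALL --args="pci=realloc=on"
# reboot

After boot, cat /proc/cmdline confirms that the option took effect.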

Device Drivers

If enumeration and resource allocation succeed but the device's services are still not available, the issue is likely with the driver. In lspci -v output, the Kernel driver in use line shows the driver currently bound to the device, while Kernel modules lists the modules capable of driving it; the presence of Kernel driver in use confirms that a driver attached successfully. In the following example, it is the NVMe driver:


# lspci -v -s 6:0.0
06:00.0 Non-Volatile memory controller: KIOXIA Corporation NVMe SSD Controller BG4 (DRAM-less) (prog-if 02 [NVM Express])
        Subsystem: KIOXIA Corporation NVMe SSD Controller BG4 (DRAM-less)
        Physical Slot: 0
        Flags: bus master, fast devsel, latency 0, IRQ 61, IOMMU group 6
        Memory at 7fff00200000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Express Endpoint, MSI 00
        Capabilities: [80] Power Management version 3
        Capabilities: [90] MSI: Enable- Count=1/32 Maskable+ 64bit+
        Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Virtual Channel
        Capabilities: [260] Latency Tolerance Reporting
        Capabilities: [300] Secondary PCI Express
        Capabilities: [400] L1 PM Substates
        Kernel driver in use: nvme
        Kernel modules: nvme

If that line is missing, then the driver is either missing or the attachment failed. In either case, searching for the name of the driver in the dmesg output should provide more information.
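For example, had the NVMe device above lacked a Kernel driver in use line, the following search (using the driver name nvme from the example) would surface any probe or initialization errors:

# dmesg | grep -i nvme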

UEFI Enumeration

If debugging from Linux is difficult or not available, the UEFI Internal Shell can be used to see the results of PCIe enumeration as done by UEFI. To enter the shell, press Esc on the console when UEFI starts to boot. From the menu, select Boot Manager and then scroll down to EFI Internal Shell. The relevant commands are pci, devices, and drivers. The help command will provide usage information for each command.

Example pci command output:


Shell> pci
   Seg  Bus  Dev  Func
   ---  ---  ---  ----
    00   00   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device A2DA Prog Interface 0
    00   01   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device 197B Prog Interface 0
    00   02   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device 197B Prog Interface 0
    00   02   03    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device 197B Prog Interface 0
    00   03   00    00 ==> Network Controller - Ethernet controller
             Vendor 15B3 Device A2DC Prog Interface 0
    00   03   00    01 ==> Network Controller - Ethernet controller
             Vendor 15B3 Device A2DC Prog Interface 0
    00   04   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device 197B Prog Interface 0
    00   05   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 15B3 Device 197B Prog Interface 0
    00   06   00    00 ==> Mass Storage Controller - Non-volatile memory subsystem
             Vendor 1E0F Device 0001 Prog Interface 2

Missing PCIe Devices

If running lspci on the BlueField produces no output and all PCIe devices are missing, this indicates that the device is in Livefish mode. In this case, the NIC firmware must be reinstalled.

Insufficient Power on the PCIe Slot

If you see the error Insufficient power on the PCIe slot in dmesg, consult the "Specifications" page of your BlueField device's hardware user guide to ensure that it is receiving the appropriate power supply.

To check the power capacity of your host's PCIe slots, execute the command lspci -vvv | grep PowerLimit. For instance:


# lspci -vvv | grep PowerLimit
                Slot #6, PowerLimit 75.000W; Interlock- NoCompl-
                Slot #1, PowerLimit 75.000W; Interlock- NoCompl-
                Slot #4, PowerLimit 75.000W; Interlock- NoCompl-

Note

Not all host vendors and server models report slot power limits, so this command may produce no output on some systems.


Obtaining the Complete PCIe Device Description

The lspci command may not display the complete descriptions of NVIDIA PCIe devices connected to the host system. For example:


# lspci | grep -i Mellanox
a3:00.0 Infiniband controller: Mellanox Technologies Device a2d6 (rev 01)
a3:00.1 Infiniband controller: Mellanox Technologies Device a2d6 (rev 01)
a3:00.2 DMA controller: Mellanox Technologies Device c2d3 (rev 01)

To obtain the full descriptions of these devices, run:


# update-pciids

Once the PCIe device ID database has been updated, the lspci command should display detailed information for each device. For example:


# lspci | grep -i Mellanox
a3:00.0 Infiniband controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
a3:00.1 Infiniband controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
a3:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)
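If the database cannot be updated (e.g., the host has no Internet access), lspci -nn prints the numeric vendor and device IDs alongside whatever names are already known, and these can be looked up manually in the public PCI ID repository. A sketch, reusing the first device above:

# lspci -nn -s a3:00.0
a3:00.0 Infiniband controller [0207]: Mellanox Technologies Device [15b3:a2d6] (rev 01)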


Managing Two BlueField Platforms in the Same Server

This example demonstrates the procedure for managing two BlueField platforms installed in the same server. The process is similar when managing additional platforms.

This example assumes that the RShim package is already installed on the host server.

Configuring Management Interface on Host

Note

This example applies only to CentOS and RHEL operating systems.

  1. Create a br_tmfifo interface configuration file. Run:


    vim /etc/sysconfig/network-scripts/ifcfg-br_tmfifo

    Add the following content to the file:


    DEVICE="br_tmfifo" BOOTPROTO="static" IPADDR="192.168.100.1" NETMASK="255.255.255.0" ONBOOT="yes" TYPE="Bridge"

  2. Create a configuration file for the first BlueField platform (tmfifo_net0). Run:


    vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net0

    Add the following content to the file:


    DEVICE=tmfifo_net0
    BOOTPROTO=none
    ONBOOT=yes
    NM_CONTROLLED=no
    BRIDGE=br_tmfifo

  3. Create a configuration file for the second BlueField platform (tmfifo_net1). Run:


    vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net1

    Add the following content to the file:


    DEVICE=tmfifo_net1
    BOOTPROTO=none
    ONBOOT=yes
    NM_CONTROLLED=no
    BRIDGE=br_tmfifo

  4. Define udev rules for the tmfifo_net interfaces. Run:


    vim /etc/udev/rules.d/91-tmfifo_net.rules
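
    The rule content itself is site-specific and not reproduced here. As a purely hypothetical sketch of what such rules can look like, udev can pin each interface name to its parent rshim instance so that the names stay stable across reboots (the DEVPATH patterns below are illustrative placeholders, not verified paths):

    # Hypothetical sketch: keep tmfifo interface naming stable per rshim instance
    SUBSYSTEM=="net", ACTION=="add", DEVPATH=="*rshim0*", NAME="tmfifo_net0"
    SUBSYSTEM=="net", ACTION=="add", DEVPATH=="*rshim1*", NAME="tmfifo_net1"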

  5. Restart the network to apply the changes. Run:


    # /etc/init.d/network restart

    Expected output:


    Restarting network (via systemctl): [ OK ]
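
    Once the network is up, the bridge can be sanity-checked (ip is part of iproute2; brctl, from bridge-utils, may be absent on newer systems):

    # ip addr show br_tmfifo
    # brctl show br_tmfifo

    Both tmfifo_net0 and tmfifo_net1 should be listed as ports of br_tmfifo, and br_tmfifo should carry the 192.168.100.1 address.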

Configuring BlueField Platform Side

BlueField platforms are shipped with the following factory default configuration for tmfifo_net0:

Address | Value
--------|------------------
MAC     | 00:1a:ca:ff:ff:01
IP      | 192.168.100.2

If more than one BlueField platform is in use, the default MAC and IP addresses must be modified.

Updating the RShim Network MAC Address

Note

This procedure applies to Ubuntu/Debian (with sudo) and CentOS BFB installations. It only affects tmfifo_net0 on the Arm side.

  1. Use a Linux console application (e.g., screen or minicom) to log into each BlueField platform. For example:


    # sudo screen /dev/rshim<0|1>/console 115200

  2. Create a configuration file for the tmfifo_net0 MAC address:


    # sudo vi /etc/bf.cfg

  3. Insert the new MAC address into the bf.cfg file:


    NET_RSHIM_MAC=00:1a:ca:ff:ff:03

  4. Apply the new MAC address:


    sudo bfcfg

  5. Repeat this process for the second BlueField platform, ensuring each one uses a unique MAC address.

    Info

    The Arm processor must be rebooted for the changes to take effect. To avoid unnecessary reboots, it is recommended to update the IP address before restarting the Arm.
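
    After the reboot, the new MAC address can be confirmed on the Arm side (the sysfs path is standard for Linux network interfaces):

    # cat /sys/class/net/tmfifo_net0/address
    00:1a:ca:ff:ff:03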

Note

For a comprehensive list of the supported parameters to customize bf.cfg during BFB installation, refer to the "bf.cfg Parameters" section in the "Customizing BlueField Software Deployment Using bf.cfg" page.


Updating an IP Address

  • For Ubuntu:

    1. Edit the 50-cloud-init.yaml file to update the tmfifo_net0 IP address:


      sudo vim /etc/netplan/50-cloud-init.yaml

      Modify the entry as follows:


      tmfifo_net0:
        addresses:
        - 192.168.100.2/30    # change to: - 192.168.100.3/30

    2. Reboot the Arm. Run:


      sudo reboot

    3. Repeat this process for the second BlueField platform, ensuring each one has a unique IP address.

      Info

      The Arm processor must be rebooted for the changes to take effect. It is recommended to update the MAC address before restarting the Arm to minimize reboots.

  • For CentOS:

    1. Edit the ifcfg-tmfifo_net0 file:


      # vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net0

    2. Update the IPADDR field:


      IPADDR=192.168.100.3

    3. Reboot the Arm processor or apply the changes:


      reboot

      Alternatively, restart the network service (for example, with /etc/init.d/network restart, as shown earlier) to apply the change without a full reboot.

    4. Repeat this process for the second BlueField platform, ensuring a unique IP address is assigned.

      Info

      The Arm processor must be rebooted for the changes to take effect. It is recommended to update the MAC address before restarting the Arm to minimize reboots.
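
In either case, the updated address can be verified from the Arm console once the interface is back up:

# ip addr show tmfifo_net0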

© Copyright 2025, NVIDIA. Last updated on Jul 16, 2025.