PCIe Troubleshooting and How-Tos
If the error "insufficient power on the PCIe slot" is printed in dmesg, refer to the Specifications section of your hardware user guide and make sure that the DPU is receiving the correct amount of power.
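One way to check for this message is to search the kernel log for it. The following is a minimal sketch, not a definitive procedure: the function name has_power_error is hypothetical, and the match string is copied from the error text quoted above, so adjust it if your kernel prints a different wording.

```shell
# Hypothetical helper: reads kernel log text on stdin and succeeds (exit 0)
# only if the power error quoted in this section is present.
has_power_error() {
    # -q: no output, -i: ignore case, -F: match the text literally
    grep -qiF "insufficient power on the PCIe slot"
}

# Example usage on a live host (reading dmesg may require root):
#   dmesg | has_power_error && echo "PCIe power error found"
```

Reading from standard input keeps the helper usable on a saved log file as well as on live dmesg output.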
To verify how much power is supported on your host's PCIe slots, run the command lspci -vvv | grep PowerLimit. For example:
# lspci -vvv | grep PowerLimit
Slot #6, PowerLimit 75.000W; Interlock- NoCompl-
Slot #1, PowerLimit 75.000W; Interlock- NoCompl-
Slot #4, PowerLimit 75.000W; Interlock- NoCompl-
Be aware that this command is not supported by all host vendors/types.
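The PowerLimit values can also be compared against the power your DPU needs. The sketch below assumes the output format shown in the example above; the function name slots_below and the 100 W threshold in the usage line are hypothetical placeholders, and the real requirement must come from your hardware user guide.

```shell
# Hypothetical helper: reads `lspci -vvv` output on stdin and prints every
# slot whose PowerLimit is below the wattage given as the first argument.
slots_below() {
    awk -v need="$1" '
        /PowerLimit/ {
            slot = ""; watts = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "Slot")       slot  = $(i + 1)   # e.g. "#6,"
                if ($i == "PowerLimit") watts = $(i + 1)   # e.g. "75.000W;"
            }
            sub(/,$/, "", slot)      # drop the trailing comma
            sub(/W;?$/, "", watts)   # drop the unit and separator
            if (watts + 0 < need + 0) print slot, watts "W"
        }'
}

# Example usage (100 is an assumed requirement, not a documented value):
#   lspci -vvv | slots_below 100
```

If the host does not report PowerLimit at all, the helper prints nothing, so an empty result does not by itself confirm that the slot power is sufficient.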
lspci may not present the full description for the NVIDIA PCIe devices connected to your host. For example:
# lspci | grep -i Mellanox
a3:00.0 Infiniband controller: Mellanox Technologies Device a2d6 (rev 01)
a3:00.1 Infiniband controller: Mellanox Technologies Device a2d6 (rev 01)
a3:00.2 DMA controller: Mellanox Technologies Device c2d3 (rev 01)
To refresh the local PCI ID database, run the following command:
# update-pciids
Now you should be able to see the full description for those devices. For example:
# lspci | grep -i Mellanox
a3:00.0 Infiniband controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
a3:00.1 Infiniband controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
a3:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)
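The before-and-after check above can be sketched as a small filter that flags devices still shown by raw ID. This is an illustrative sketch only: the function name unnamed_devices is hypothetical, and the pattern assumes unnamed entries look like "Device a2d6" as in the first example.

```shell
# Hypothetical helper: reads `lspci` output on stdin and prints only the
# lines whose description is a raw ID such as "Device a2d6".
unnamed_devices() {
    # "|| true" keeps the exit status 0 when nothing matches
    grep -E 'Device [0-9a-f]{4}( |$)' || true
}

# Example usage: refresh the ID database only if a name is missing.
# update-pciids typically needs root privileges and network access.
#   if [ -n "$(lspci | grep -i Mellanox | unnamed_devices)" ]; then
#       update-pciids
#   fi
```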
Please refer to section "Multi-board Management Example".