RShim Troubleshooting and How-Tos
Several generations of BlueField DPUs are equipped with a USB interface through which RShim can be routed, via USB cable, to an external host running Linux and the RShim driver.
In this case, typically following a system reboot, the RShim over USB takes precedence and the DPU host reports the RShim status as "another backend already attached". This is correct behavior, since only one RShim backend can be active at any given time. However, it means that the DPU host does not own RShim access.
To reclaim RShim ownership safely:
Stop the RShim driver on the remote Linux. Run:
systemctl stop rshim
systemctl disable rshim
Restart RShim on the DPU host. Run:
systemctl enable rshim
systemctl start rshim
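The hand-off above always pairs a release on one machine (stop, then disable) with an acquire on the other (enable, then start). A minimal sketch of that pairing follows. The function name `rshim_handoff` and the `DRY_RUN` variable are hypothetical and not part of the RShim package. The sketch assumes it is run as root on the relevant machine.

```shell
#!/bin/sh
# Hypothetical helper: rshim_handoff release|acquire
#   release: run on the side giving up RShim (stop, then disable)
#   acquire: run on the side taking RShim over (enable, then start)
# With DRY_RUN=1 the systemctl commands are printed instead of executed.
rshim_handoff() {
    case "$1" in
        release) set -- stop disable ;;
        acquire) set -- enable start ;;
        *) echo "usage: rshim_handoff release|acquire" >&2; return 2 ;;
    esac
    for action in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "systemctl $action rshim"
        else
            systemctl "$action" rshim || return 1
        fi
    done
}
```

Running `DRY_RUN=1 rshim_handoff release` on the remote Linux machine previews the two commands before anything is changed. The release side should finish before the acquire side starts, since only one backend can be attached at a time.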
The "another backend already attached" scenario can also be attributed to the RShim backend being owned by the BMC in DPUs with integrated BMC. This is elaborated on further down on this page.
Verify whether your DPU features an integrated BMC or not. Run:
sudo lspci -s $(sudo lspci -d 15b3: | head -1 | awk '{print $1}') -vvv | grep "Product Name"
Example output for DPU with integrated BMC:
Product Name: BlueField-2 DPU 25GbE Dual-Port SFP56, integrated BMC, Crypto and Secure Boot Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket, FHHL
If your DPU has an integrated BMC, refer to "RShim driver not loading on DPU with integrated BMC".
If your DPU does not have an integrated BMC, refer to "RShim driver not loading on host on DPU without integrated BMC".
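The integrated-BMC check can be scripted so that the result selects which section to follow. This is a minimal sketch, assuming the "Product Name" line has the form shown in the example output. `has_integrated_bmc` is a hypothetical helper name, and `<BDF>` stands for the PCI address of the DPU.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -vvv` text on stdin and succeeds (exit 0)
# only if a "Product Name" line mentions an integrated BMC.
has_integrated_bmc() {
    grep "Product Name" | grep -q "integrated BMC"
}

# Usage (not executed here):
#   if sudo lspci -s <BDF> -vvv | has_integrated_bmc; then
#       echo "DPU has an integrated BMC"
#   else
#       echo "DPU has no integrated BMC"
#   fi
```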
RShim driver not loading on DPU with integrated BMC
RShim driver not loading on host
Access the BMC via the RJ45 management port of the DPU.
Stop and disable RShim on the BMC. Run:
systemctl stop rshim
systemctl disable rshim
Enable RShim on the host. Run:
systemctl enable rshim
systemctl start rshim
Restart RShim service. Run:
sudo systemctl restart rshim
If RShim service does not launch automatically, run:
sudo systemctl status rshim
This command is expected to display "active (running)".
Display the current setting. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
DEV_NAME pcie-04:00.2 (ro)
This output indicates that the RShim service is ready to use.
RShim driver not loading on BMC
Verify that the RShim service is not running on host. Run:
systemctl status rshim
If the output shows "active", the host presumably has ownership of the RShim.
Stop and disable RShim on the host. Run:
systemctl stop rshim
systemctl disable rshim
Enable RShim on the BMC. Run:
systemctl enable rshim
systemctl start rshim
Display the current setting. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
DEV_NAME usb-1.0
This output indicates that the RShim service is ready to use.
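The DEV_NAME values in this section differ by owner: the host example shows a `pcie-` prefix and the BMC example shows a `usb-` prefix. A minimal sketch that classifies the backend from that prefix is given below. `rshim_backend` is a hypothetical helper, and the mapping relies only on the two example outputs in this section, so other prefixes are reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper: reads the contents of /dev/rshim<N>/misc on stdin and
# prints "pcie", "usb", or "unknown" based on the DEV_NAME prefix.
rshim_backend() {
    # Second field of the DEV_NAME line, e.g. "pcie-04:00.2" or "usb-1.0".
    dev=$(awk '$1 == "DEV_NAME" { print $2; exit }')
    case "$dev" in
        pcie-*) echo "pcie" ;;
        usb-*)  echo "usb" ;;
        *)      echo "unknown" ;;
    esac
}

# Usage (not executed here; replace <N> with the RShim device index):
#   cat /dev/rshim<N>/misc | rshim_backend
```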
RShim driver not loading on host on DPU without integrated BMC
Download the suitable DEB/RPM package for the RShim driver (the management interface for the DPU from the host).
Reinstall RShim package on the host.
For Ubuntu/Debian, run:
sudo dpkg --force-all -i rshim-<version>.deb
For RHEL/CentOS, run:
sudo rpm -Uhv rshim-<version>.rpm
Restart RShim service. Run:
sudo systemctl restart rshim
If RShim service does not launch automatically, run:
sudo systemctl status rshim
This command is expected to display "active (running)".
Display the current setting. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
DEV_NAME pcie-04:00.2 (ro)
This output indicates that the RShim service is ready to use.
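The reinstall step in the procedure above depends on the package format. The sketch below picks the matching command from the file extension and prints it for review rather than running it. `rshim_install_cmd` is a hypothetical helper, and the two commands are the ones given in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: prints the reinstall command for a given RShim package
# file, chosen by extension. Nothing is installed by this function.
rshim_install_cmd() {
    case "$1" in
        *.deb) echo "sudo dpkg --force-all -i $1" ;;   # Ubuntu/Debian
        *.rpm) echo "sudo rpm -Uhv $1" ;;              # RHEL/CentOS
        *) echo "unsupported package: $1" >&2; return 1 ;;
    esac
}

# Usage (not executed here):
#   rshim_install_cmd rshim-<version>.deb
```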
Verify that your card has a BMC. Run the following on the host:
sudo lspci -s $(sudo lspci -d 15b3: | head -1 | awk '{print $1}') -vvv | grep "Product Name"
Product Name: BlueField-2 DPU 25GbE Dual-Port SFP56, integrated BMC, Crypto and Secure Boot Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket, FHHL
The product name is supposed to show "integrated BMC".
Access the BMC via the RJ45 management port of the DPU.
Stop and disable RShim on the BMC. Run:
systemctl stop rshim
systemctl disable rshim
Enable RShim on the host. Run:
systemctl enable rshim
systemctl start rshim
Restart RShim service. Run:
sudo systemctl restart rshim
If RShim service does not launch automatically, run:
sudo systemctl status rshim
This command is expected to display "active (running)".
Display the current setting. Run:
# cat /dev/rshim<N>/misc | grep DEV_NAME
DEV_NAME pcie-04:00.2 (ro)
This output indicates that the RShim service is ready to use.
For more information, refer to section "RShim Multiple Board Support".
The BFB installation flow can be traced using various interfaces:
From the host:
RShim console (/dev/rshim0/console)
RShim log buffer (/dev/rshim0/misc); also included in bfb-install's output
UART console (/dev/ttyUSB0)
From the BMC console:
SSH to the BMC and run obmc-console-client
Note: Additional information about BMC interfaces is available in the BMC software documentation.
From the DPU:
/root/<OS>.installation.log available on the DPU OS after installation
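Before tracing an installation from the host, it can help to confirm which of the host-side interfaces listed above actually exist on the machine. The following is a minimal sketch. `check_trace_sources` is a hypothetical helper, and the default paths assume the first RShim device (`rshim0`) and the first USB serial adapter (`ttyUSB0`), as in the list above.

```shell
#!/bin/sh
# Hypothetical helper: reports whether each trace source path exists.
# With no arguments, it checks the host-side paths listed above.
check_trace_sources() {
    [ "$#" -gt 0 ] || set -- /dev/rshim0/console /dev/rshim0/misc /dev/ttyUSB0
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "present: $path"
        else
            echo "missing: $path"
        fi
    done
}

# Usage (not executed here):
#   check_trace_sources
#   check_trace_sources /dev/rshim1/console /dev/rshim1/misc
```

A "missing" result for the RShim paths usually means the RShim service is not running on this machine or another backend owns it, in which case the earlier sections of this page apply.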