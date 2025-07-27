BlueField devices are equipped with a USB interface through which RShim can be routed, via USB cable, to an external host running Linux and the RShim driver. In this case, typically following a system reboot, RShim over USB prevails and the BlueField host reports the RShim status as "another backend already attached". This is correct behavior, as only one RShim backend can be active at any given time. However, it means that the BlueField host does not own RShim access. To debug an issue, the user may need to access RShim from the BlueField BMC or host while RShim is attached to the other side (host or BMC, respectively).

The user is able to reclaim RShim ownership safely without logging into the other side:

1. Stop the RShim driver on the remote Linux. Run:

    systemctl stop rshim
    systemctl disable rshim

2. Restart RShim on the BlueField host. Run:

    systemctl enable rshim
    systemctl start rshim

The "another backend already attached" error can also occur when the RShim backend is owned by the BMC on BlueField devices with an integrated BMC. This case is described in more detail further down this page.

Verify whether your BlueField features an integrated BMC or not. Run:

    sudo lspci -s $(sudo lspci -d 15b3: | head -1 | awk '{print $1}') -vvv | grep "Product Name"

Example output for a BlueField with an integrated BMC:

    Product Name: BlueField-2 DPU 25GbE Dual-Port SFP56, integrated BMC, Crypto and Secure Boot Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket, FHHL

If your BlueField has an integrated BMC, refer to RShim driver not loading on host with integrated BMC.

If your BlueField does not have an integrated BMC, refer to RShim driver not loading on host on DPU without integrated BMC.
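The choice between the two procedures can be sketched as a small shell helper. This is a minimal sketch, not part of the RShim tooling: the function names `has_integrated_bmc` and `bmc_procedure` are hypothetical, and the sketch assumes the phrase "integrated BMC" appears in the product name exactly as in the example output above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RShim).
# Assumption: a BlueField with an integrated BMC reports the phrase
# "integrated BMC" in its "Product Name" line, as in the example output.

# Return 0 if the given product name mentions an integrated BMC.
has_integrated_bmc() {
    case "$1" in
        *"integrated BMC"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print which of the two troubleshooting procedures applies.
bmc_procedure() {
    if has_integrated_bmc "$1"; then
        echo "integrated BMC"
    else
        echo "no integrated BMC"
    fi
}
```

To use it, pass the line printed by the lspci command above as the single argument to `bmc_procedure`.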

RShim driver not loading on host with integrated BMC

1. Access the BMC via the RJ45 management port of the BlueField.

2. Stop and disable RShim on the BMC:

    systemctl stop rshim
    systemctl disable rshim

3. Enable RShim on the host:

    systemctl enable rshim
    systemctl start rshim

4. Restart the RShim service. Run:

    sudo systemctl restart rshim

5. To check whether the RShim service launched, run:

    sudo systemctl status rshim

   This command is expected to display "active (running)".

6. Display the current setting. Run:

    # cat /dev/rshim<N>/misc | grep DEV_NAME
    DEV_NAME pcie-04:00.2 (ro)

   This output indicates that the RShim service is ready to use.

To move RShim ownership from the host to the BMC:

1. Check whether the RShim service is running on the host. Run:

    systemctl status rshim

   If the output is "active", the host can be presumed to own the RShim.

2. Stop and disable RShim on the host. Run:

    systemctl stop rshim
    systemctl disable rshim

3. Enable RShim on the BMC. Run:

    systemctl enable rshim
    systemctl start rshim

4. Display the current setting. Run:

    # cat /dev/rshim<N>/misc | grep DEV_NAME
    DEV_NAME usb-1.0

   This output indicates that the RShim service is ready to use.

RShim driver not loading on host on DPU without integrated BMC

1. Download the suitable DEB or RPM package for the RShim driver (the management interface for the DPU from the host).

2. Reinstall the RShim package on the host.

   For Ubuntu/Debian, run:

    sudo dpkg --force-all -i rshim-<version>.deb

   For RHEL/CentOS, run:

    sudo rpm -Uhv rshim-<version>.rpm

3. Restart the RShim service. Run:

    sudo systemctl restart rshim

4. To check whether the RShim service launched, run:

    sudo systemctl status rshim

   This command is expected to display "active (running)".

5. Display the current setting. Run:

    # cat /dev/rshim<N>/misc | grep DEV_NAME
    DEV_NAME pcie-04:00.2 (ro)

   This output indicates that the RShim service is ready to use.
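The DEV_NAME check used in the procedures above can be scripted. The following is a minimal sketch, assuming the misc file prints the field as `DEV_NAME` followed by a value such as `pcie-04:00.2` or `usb-1.0`, as shown in the example outputs. The function name `rshim_backend` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of RShim): read the contents of an RShim
# misc file on stdin and print the backend type, i.e. the part of the
# DEV_NAME value before the first "-" ("pcie" or "usb" in the examples).
rshim_backend() {
    awk '$1 == "DEV_NAME" { split($2, parts, "-"); print parts[1]; exit }'
}
```

For example, `rshim_backend < /dev/rshim0/misc` reads the first RShim device. The index 0 is only an example; substitute the `<N>` that applies to your system.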

When starting the rshim service, the systemd journal may show an error similar to:

    $ sudo systemctl status rshim
    ...
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com systemd[1]: Starting rshim driver for BlueField SoC...
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com systemd[1]: Started rshim driver for BlueField SoC.
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com rshim[13899]: Created PID file: /var/run/rshim.pid
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com rshim[13899]: Probing pcie-0000:b1:00.2(uio)
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com rshim[13899]: Create rshim pcie-0000:b1:00.2
    Apr 30 14:08:20 bu-lab105.wes-a.nbulabs.nvidia.com rshim[13899]: pcie-0000:b1:00.2 enable
    Apr 30 14:08:21 bu-lab105.wes-a.nbulabs.nvidia.com rshim[13899]: rshim1 failed to setup CUSE rshim
    ...

The rshim driver depends on the cuse.ko kernel module, which is typically provided by the kernel-modules-extra package. This package is usually installed as a dependency during the RShim RPM or DEB installation.

However, on some RHEL- or Rocky Linux-based systems, this dependency may not be enforced, resulting in a missing cuse.ko module and a failure during RShim initialization.
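Before installing anything, it can help to confirm that the module file is actually missing for the running kernel. The following is a minimal sketch, assuming modules are installed under `/lib/modules/<kernel release>` and that the file name starts with `cuse.ko` (possibly with a compression suffix). The function name `cuse_module_present` is hypothetical. A kernel with CUSE built in would have no module file, so a negative result is only a hint.

```shell
#!/bin/sh
# Hypothetical helper (not part of RShim).
# $1: kernel release to check (for example, the output of `uname -r`)
# $2: modules root directory (optional, defaults to /lib/modules)
# Returns 0 if a file named cuse.ko* exists under $2/$1, 1 otherwise.
cuse_module_present() {
    release=$1
    root=${2:-/lib/modules}
    [ -d "$root/$release" ] || return 1
    [ -n "$(find "$root/$release" -name 'cuse.ko*' -print 2>/dev/null | head -n 1)" ]
}
```

A typical call would be `cuse_module_present "$(uname -r)" || echo "cuse module not found"`.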

Note: Installing kernel-modules-extra may trigger a kernel upgrade if your current kernel version is not available in the configured repositories. For example, installing this package may update the kernel from 5.14.0-570.4.1 to 5.14.0-570.12.1. This may also pull in related packages such as kernel, kernel-core, and kernel-modules.

1. Install kernel-modules-extra. For RHEL/Rocky Linux systems, install the package using:

    sudo dnf install kernel-modules-extra

2. Load the cuse module. If the installed kernel-modules-extra matches the currently running kernel, you can load the cuse.ko module:

    sudo modprobe cuse

   If no errors are reported, the cuse module is now available for RShim.

3. Restart the RShim service. Once cuse is loaded, restart the RShim service:

    sudo systemctl restart rshim

   You should no longer see the "failed to setup CUSE rshim" error.

If modprobe cuse fails with a message about a missing module, it likely means the newly installed kernel-modules-extra version does not match the currently running kernel.

In this case, reboot the system to use the updated kernel:

    sudo reboot

After the reboot, verify the running kernel version:

    uname -r

Ensure it matches the version of kernel-modules-extra that was installed.
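The comparison can be sketched as a small shell function. This is a sketch under stated assumptions: the function name `kernel_matches_modules_extra` is hypothetical, and it assumes that on RHEL-family systems the `uname -r` string has the same `VERSION-RELEASE.ARCH` form that `rpm` can print for the installed package.

```shell
#!/bin/sh
# Hypothetical helper (not part of RShim).
# $1: running kernel release (for example, the output of `uname -r`)
# $2: newline-separated list of installed kernel-modules-extra versions
# Returns 0 if $1 appears as an exact line in $2, 1 otherwise.
kernel_matches_modules_extra() {
    [ -n "$1" ] || return 1
    printf '%s\n' "$2" | grep -Fxq -- "$1"
}
```

One possible way to call it is `kernel_matches_modules_extra "$(uname -r)" "$(rpm -q kernel-modules-extra --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n')"`. A non-zero exit status suggests that the running kernel and the installed package differ.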

In rare cases, you may need to adjust the GRUB configuration to ensure the system boots into the new kernel automatically:

    sudo grub2-set-default 0
    sudo grub2-mkconfig -o /boot/grub2/grub.cfg

The following is an informational message printed by RShim driver when trying to access via IOMMU:

    rshim service: /sys/bus/pci/devices/0000:01:00.2/iommu_group: failed to read iommu link

The RShim driver probes RShim in the following order: IOMMU, UIO, Direct Map. It tries each mechanism in turn until one succeeds, so the failure of a single mechanism does not mean that the RShim driver has failed. The exception is when a specific mechanism is required, such as IOMMU when Linux kernel lockdown is enabled.
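The try-in-order behavior can be illustrated with a short shell sketch. This is not the driver's code: `first_working_probe` and the three `probe_*` functions are hypothetical stand-ins with hard-coded results, used only to show that an early failure is harmless as long as a later mechanism succeeds.

```shell
#!/bin/sh
# Illustration only: try each probe command in the order given and print
# the name of the first one that succeeds. Returns 1 if all of them fail.
first_working_probe() {
    for probe in "$@"; do
        if "$probe"; then
            echo "$probe"
            return 0
        fi
    done
    return 1
}

# Hypothetical stand-in probes with fixed results (not real RShim calls).
probe_iommu()      { return 1; }  # simulate: IOMMU access not available
probe_uio()        { return 0; }  # simulate: UIO access works
probe_direct_map() { return 0; }  # never reached in this example
```

Calling `first_working_probe probe_iommu probe_uio probe_direct_map` prints `probe_uio`: the IOMMU failure corresponds to the informational message above, and probing continues to the next mechanism. If only `probe_iommu` is passed, which models the case where IOMMU is required, the function returns a non-zero status.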