NVIDIA BlueField DPU BSP v3.9.8 LTS

Bug Fixes History

Ref #

Issue Description


Description: Virtio-net is intermittently unable to configure the MQ correctly and shows an error message similar to virtnet_handle_mq: failed to set mq 16 in the controller message.

Keywords: Virtio-net

Fixed in version: 3.9.7


Description: While running iperf on virtio hotplug devices with a guest OS running CentOS kernel 3.10, performing an unplug may cause the guest kernel to get stuck and the unplug to fail due to a timeout.

Keywords: Virtio-net; hotplug; iperf

Fixed in version: 3.9.7


Description: Fixed CVE-2022-47630.

Keywords: Security

Fixed in version: 3.9.6


Description: To install DOCA on an Ubuntu 22.04 host, use the command apt-get install doca-runtime doca-sdk doca-tools openvswitch-switch -y.

Keywords: Installation; Ubuntu 22.04; OVS; openvswitch-switch

Fixed in version: 3.9.6


Description: When HIDE_PORT2_PF=True and NUM_OF_PF=1, running cat /sys/class/net/p1/smart_nic/pf/config causes a kernel crash.

Keywords: Kernel; crash

Fixed in version: 3.9.6


Description: The kernel crashes when using a filtering rule with nftables.

Keywords: Netfilter; iptables; nftables

Fixed in version: 3.9.6


Description: The l2_reflector reference application fails to start due to a missing libflexio.so library.

Keywords: FlexIO; DOCA applications; l2_reflector

Fixed in version: 3.9.6


Description: Assert errors may be observed in the RShim log after reset/reboot. These errors are harmless and may be ignored.

Keywords: RShim; log; error

Fixed in version: 3.9.6


Description: Hotplug of a modern virtio-net device is not supported when VIRTIO_EMULATION_HOTPLUG_TRANS is set to TRUE in mlxconfig.

Keywords: Virtio-net; hotplug; legacy

Fixed in version: 3.9.6


Description: Virtio-net full emulation is not supported in CentOS 8.2 with inbox-kernel 4.18.0-193.el8.aarch64.

Keywords: Virtio-net; CentOS

Fixed in version: 3.9.6


Description: It is the Arm side's responsibility to set the Arm boot progress GPIOs to 5 to indicate that Linux is up.

Keywords: Boot; GPIO

Fixed in version: 3.9.6


Description: Fixed an issue where virtio-net-controller crashes after performing live migration back and forth between two servers a few times.

Keywords: Live migration; virtio-net

Fixed in version: 3.9.6


Description: Fixed a minor memory leak that occurs upon attaching and detaching ports to OVS when HW offloading is enabled.

Keywords: Memory leak; OVS

Fixed in version: 3.9.6


Description: ACS is now enabled in single-port PF devices.

Keywords: Single-port DPU

Fixed in version: 3.9.6


Description: Fixed an issue where SNAP crashes when detaching a virtio-blk device that has fio running on a file system.

Keywords: SNAP; virtio-blk

Fixed in version: 3.9.6


Description: Disabled some HW optimization to prevent a HW race that caused an SQ to get stuck.

Keywords: HW race; hang

Fixed in version: 3.9.6

Description: Fixed an issue where ct_state(-rpl,+trk) is recognized as ct_state(+rpl+trk).

Keywords: Connection tracking

Fixed in version: 3.9.6


Description: Fixed an issue where SNAP queries BDF regardless of controller creation.

Keywords: SNAP

Fixed in version: 3.9.6


Description: For vDPA over VFE, fixed an issue where closing the vDPA application with SIGHUP triggers an error flow in the device, possibly causing the virtio offload accelerator in the device to hang.

Keywords: vDPA; virtio

Fixed in version: 3.9.6


Description: Fixed an issue where fio over virtio-blk performance drops with increased virtio-net traffic.


Fixed in version: 3.9.6


Description: Fixed an issue where SNAP crashes due to a missing check for io_ctx->spdk_channel=NULL.

Keywords: SNAP

Fixed in version: 3.9.6


Description: When reloading (ifreload) an empty /etc/network/interfaces file, the previously created interfaces are not deleted.

Keywords: HBN; unsupported NVUE commands

Fixed in version: 3.9.6


Description: Hotplug/unplug of virtio-net devices during host shutdown/bootup may result in failure to do plug/unplug.

Keywords: Virtio-net; hotplug

Fixed in version: 3.9.6


Description: If secure boot is enabled, the following error message is observed while installing Ubuntu on the DPU: ERROR: need to use capsule in secure boot mode. This message is harmless and may be safely ignored.

Keywords: Error message; installation

Fixed in version: 3.9.3


Description: When Arm reboots or crashes after sending a virtio-net unplug request, the hotplugged devices may still be present after Arm recovers. The host, however, will not see those devices.

Keywords: Virtio-net; hotplug

Fixed in version: 3.9.3


Description: BlueField with secured BFB fails to boot up if the PART_SCHEME field is set in bf.cfg during installation.

Keywords: Installation; bf.cfg

Fixed in version: 3.9.2


Description: If the RShim service is running on an external host over the PCIe interface then, in very rare cases, a soft reset of the BlueField can cause a poisoned completion to be returned to the host. The host may treat this as a fatal error and crash.

Keywords: RShim; ATF

Fixed in version: 3.9.2


Description: Virtio-net-controller recovery may not work for a hot-plugged device because the system assigns a BDF (string identifier) of 0 for the hot-plugged device, which is an invalid value.

Keywords: Virtio-net; hotplug; recovery

Fixed in version: 3.9.0


Description: Eye-opening is not supported on 25GbE integrated-BMC BlueField-2 DPU.

Keywords: Firmware; eye-opening

Fixed in version: 3.9.0


Description: Virtio full emulation is not supported by NVIDIA® BlueField®-2 multi-host cards.

Keywords: Virtio full emulation; multi-host

Fixed in version: 3.9.0


Description: After BFB installation, a Linux crash may occur, with efi_call_rts messages in the call trace visible on the UART console.

Keywords: Linux crash; efi_call_rts

Fixed in version: 3.9.0


Description: Relaxed ordering is not working properly on virtual functions.

Keywords: MLNX_OFED; relaxed ordering; VF

Fixed in version: 3.9.0


Description: On rare occasions, the UEFI variables in UVPS EEPROM are wiped out, which hangs the boot process at the UEFI menu.

Keywords: UEFI; hang

Fixed in version: 3.9.0


Description: PCIe device address to RDMA device name mapping on x86 host may change after the driver restarts in Arm.

Keywords: RDMA; Arm; driver

Fixed in version: 3.9.0


Description: RShim driver does not work when the host is in secure boot mode.

Keywords: RShim; Secure Boot

Fixed in version: 3.9.0


Description: On rare occasions during Arm reset on BMC-integrated DPUs, the DPU sends a "PCIe Completion" marked as poisoned. Some servers treat that as fatal and may hang.

Keywords: Arm reset; BMC integrated

Fixed in version: 3.9.0


Description: Pushing the BFB image fails occasionally with a "bad magic number" error message showing up in the console.

Keywords: BFB push; installation

Fixed in version: 3.9.0


Description: SLD detection may not function properly.

Keywords: Firmware

Fixed in version: 3.9.0


Description: External host reboot may also reboot the Arm cores if the DPU was configured using mlxconfig.

Keywords: Non-volatile configuration; Arm; reboot

Fixed in version: 3.9.0


Description: BlueField-2 may sometimes go to PXE boot instead of Linux after installation.

Keywords: Installation; PXE

Fixed in version: 3.8.5


Description: Some DPUs may get stuck at the GRUB menu when booting because the GRUB configuration gets corrupted if the board is powered down before the configuration is synced to memory.

Keywords: GRUB; memory

Fixed in version: 3.8.5


Description: The available RShim logging buffer may not have enough space to hold the whole register dump, which may cause buffer wraparound.

Keywords: RShim; logging

Fixed in version: 3.8.5


Description: IPMI EMU service reports cable link as down when it is actually up.

Keywords: IPMI EMU

Fixed in version: 3.8.0


Description: Virtio-net controller does not work with devices other than mlx5_0/1.

Keywords: Virtio-net controller

Fixed in version: 3.8.0


Description: No parameter validation is done for feature bits when performing hotplug.

Keywords: Virtio-net; hotplug

Fixed in version: 3.8.0


Description: When secure boot is enabled, PXE boot may not work.

Keywords: Secure boot; PXE

Fixed in version: 3.8.0


Description: Updating a BFB could fail due to congestion.

Keywords: Installation; congestion

Fixed in version: 3.8.0


Description: For virtio-net device, modifying the number of queues does not update the number of MSIX.

Keywords: Virtio-net; queues

Fixed in version: 3.8.0


Description: A "double free" error is seen when using the "curl" utility. This happens only when OpenSSL is configured to use a dynamic engine (e.g., the BlueField PKA engine).

Keywords: OpenSSL; curl

Fixed in version: 3.8.0


Description: UEFI secure boot enables the kernel lockdown feature which blocks access by mstmcra.

Keywords: Secure boot

Fixed in version: 3.8.0


Description: Virtio-net controller may fail to start after power cycle.

Keywords: Virtio-net controller

Fixed in version: 3.8.0


Description: The memory consumed by a representor exceeds what is necessary, making it impossible to scale to 504 SFs.

Keywords: Memory

Fixed in version: 3.8.0


Description: Modifying VF bits yields an error.

Keywords: Virtio-net controller

Fixed in version: 3.8.0


Description: Arm hangs when the user is thrown into livefish mode by the FW (e.g., secure boot).

Keywords: Arm; livefish

Fixed in version: 3.8.0


Description: The current installation flow requires multiple resets after booting the self-install BFB due to the watchdog being armed after capsule update.

Keywords: Reset; installation

Fixed in version: 3.8.0


Description: Powering off BlueField shows up as a panic, which is then stored in the RShim log and carried into the BERT table on the next boot. This is misleading to the user.

Keywords: RShim; log; panic

Fixed in version: 3.8.0


Description: Various errors related to the UPVS store running out of space are observed.

Keywords: UPVS; errors

Fixed in version: 3.8.0


Description: oob_net0 cannot receive traffic after a network restart.

Keywords: oob_net0

Fixed in version: 3.8.0


Description: Up to 31 hot-plugged virtio-net devices are supported even if PCI_SWITCH_EMULATION_NUM_PORT=32. The host may hang if it hot-plugs 32 devices.

Keywords: Virtio-net; hotplug

Fixed in version: 3.8.0


Description: Working with CentOS 7.6, if SF network interfaces are statically configured, the following parameters should be set.



For example:


# cat /etc/sysconfig/network-scripts/ifcfg-p0m0
NAME=p0m0
DEVICE=p0m0
NM_CONTROLLED="no"
PEERDNS="yes"
ONBOOT="yes"
BOOTPROTO="static"
IPADDR=
BROADCAST=
NETMASK=
NETWORK=
TYPE=Ethernet
DEVTIMEOUT=30

Keywords: CentOS; subfunctions; static configuration

Fixed in version: 3.7.0


Description: When shared RQ mode is enabled and offloads are disabled, running multiple UDP connections from multiple interfaces can lead to packet drops.

Keywords: Offload; shared RQ

Fixed in version: 3.7.0


Description: When OVS-DPDK and LAG are configured, the kernel driver drops the LACP packet when working in shared RQ mode.

Keywords: OVS-DPDK; LAG; LACP; shared RQ

Fixed in version: 3.7.0


Description: The gpio-mlxbf2 and mlxbf-gige drivers are not supported on 4.14 kernel.

Keywords: Drivers; kernel

Fixed in version: 3.7.0


Description: Virtio-net-controller does not function properly after changing uplink representor MTU.

Keywords: Virtio-net controller; MTU

Fixed in version: 3.7.0


Description: VXLAN with IPsec crypto offload does not work.

Keywords: VXLAN; IPsec crypto

Fixed in version: 3.7.0


Description: Address Translation Services (ATS) is not supported in BlueField-2 step A1 devices. Enabling ATS can cause the server to hang.

Keywords: ATS

Fixed in version: 3.7.0


Description: PHYless reset on BlueField-2 devices may cause the device to disappear.

Keywords: PHY; firmware reset

Fixed in version: 3.7.0


Description: When working with strongSwan 5.9.0bf, running ip xfrm state show returns partial information as to the offload parameters, not showing "mode full".

Keywords: strongSwan; ip xfrm; IPsec

Fixed in version: 3.7.0


Description: Server crashes after configuring PCI_SWITCH_EMULATION_NUM_PORT to a value higher than the number of PCIe lanes the server supports.

Keywords: Server; hang

Fixed in version: 3.7.0


Description: Loading/reloading NVMe after enabling VirtIO fails with a PCI BAR memory mapping error.

Keywords: VirtIO; NVMe

Fixed in version: 3.7.0


Description: When working with OVS in the kernel and using Connection Tracking, up to 500,000 flows may be offloaded.

Keywords: DPU; Connection Tracking

Fixed in version: 3.7.0


Description: If the Linux OS running on the host connected to the BlueField DPU has a kernel version lower than 4.14, the MLNX_OFED package should be installed on the host.

Keywords: Host OS

Fixed in version: 3.7.0
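
As a rough illustration of the kernel-version condition in the entry above, the following shell sketch compares a kernel release string against 4.14. The function name kernel_older_than is hypothetical and not part of any NVIDIA tool, and the comparison assumes GNU sort with the -V (version sort) option is available on the host.

```shell
# Hypothetical helper (not an NVIDIA tool): succeeds when the kernel
# release in $1 is lower than the version threshold in $2.
kernel_older_than() {
    kver=${1%%-*}                      # drop the distro suffix, e.g. "-1160.el7.x86_64"
    [ "$kver" = "$2" ] && return 1     # equal versions are not "lower"
    lowest=$(printf '%s\n%s\n' "$kver" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$kver" ]
}

# Example check on the host; the message text is illustrative only.
if kernel_older_than "$(uname -r)" 4.14; then
    echo "Host kernel is lower than 4.14: install MLNX_OFED on the host"
fi
```

The check only reports the condition; installing MLNX_OFED itself is outside the scope of this sketch.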


Description: During heavy traffic, ARP reply from the other tunnel endpoint may be dropped. If no ARP entry exists when flows are offloaded, they remain stuck on the slow path.

Workaround: Set a static ARP entry at the BlueField Arm to VXLAN tunnel endpoints.

Keywords: ARP; Static; VXLAN; Tunnel; Endpoint

Fixed in version: 3.7.0
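
The workaround in the entry above could be sketched with iproute2 as follows. This is only a minimal sketch: the function name add_static_arp and the address, MAC, and interface values in the example are placeholders chosen for illustration, not values taken from these release notes.

```shell
# Sketch of the workaround: pin a permanent (static) neighbor entry for a
# VXLAN tunnel endpoint on the BlueField Arm. The function name and all
# example values are hypothetical placeholders.
add_static_arp() {
    # $1 = remote tunnel endpoint IP, $2 = its MAC address, $3 = local interface
    ip neigh replace "$1" lladdr "$2" dev "$3" nud permanent
}

# Example (requires root on the Arm):
#   add_static_arp 192.0.2.10 02:00:00:00:00:01 p0
```

Using "replace" rather than "add" means an existing dynamic entry for the same endpoint is overwritten instead of causing an error.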


Description: During boot, the system enters systemctl emergency mode due to a corrupt root file system.

Keywords: Boot

Fixed in version:


Description: Creating a bond via NetworkManager and restarting the driver (openibd restart) results in no pf0hpf and bond creation failure.

Keywords: Bond; LAG; network manager; driver reload

Fixed in version:


Description: Only up to 62 host virtual functions are currently supported.

Keywords: DPU; SR-IOV

Fixed in version:


Description: Before changing SR-IOV mode or reloading the mlx5 drivers on IPsec-enabled systems, make sure all IPsec configurations are cleared by issuing the command ip x s f && ip x p f.

Keywords: IPsec; SR-IOV; driver

Fixed in version:
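
For readability, the abbreviated command in the entry above can be written in its long form. The sketch below wraps it in a function whose name, flush_ipsec_config, is purely illustrative; the underlying ip xfrm subcommands are standard iproute2.

```shell
# Long-form equivalent of "ip x s f && ip x p f": flush all IPsec (xfrm)
# states, then all xfrm policies. The function name is illustrative.
flush_ipsec_config() {
    ip xfrm state flush && ip xfrm policy flush
}

# Example (requires root), run before changing SR-IOV mode or reloading
# the mlx5 drivers:
#   flush_ipsec_config
```

The policy flush runs only if the state flush succeeds, matching the && in the original command.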


Description: In Ubuntu, during or after a reboot of the Arm (performed manually or as part of a firmware reset), the network devices may not transition to switchdev mode. In that case, no device representors are created (pf0hpf, pf1hpf, etc.), and driver loading on the host times out after 120 seconds.

Keywords: Ubuntu; reboot; representors; switchdev

Fixed in version:


Description: EEPROM storage for UEFI variables may run out of space and cause various issues, such as an inability to push a new BFB (due to a timeout) or an exception when trying to enter the UEFI boot menu.

Keywords: BFB install; timeout; EEPROM UEFI Variable; UVPS

Fixed in version:


Description: When using OpenSSL on BlueField platforms where Crypto support is disabled, the following errors may be encountered:

PKA_ENGINE: PKA instance is invalid

PKA_ENGINE: failed to retrieve valid instance

This happens because the OpenSSL configuration is linked to use PKA hardware, but that hardware is not available since crypto support is disabled on these platforms.

Keywords: PKA; Crypto

Fixed in version:


Description: All NVMe emulation counters (Ctrl, SQ, Namespace) return "0" when queried.

Keywords: Emulated devices; NVMe

Fixed in version:


Description: Multi-APP QoS is not supported when LAG is configured.

Keywords: Multi-APP QoS; LAG

Fixed in version:


Description: When creating a large number of VirtIO VFs, hung task call traces may be seen in dmesg.

Keywords: VirtIO; call traces; hang

Fixed in version:


Description: Only up to 60 virtio-net emulated virtual functions are supported if LAG is enabled.

Keywords: Virtio-net; LAG

Fixed in version:


Description: On rare occasions, rebooting the BlueField DPU may result in traffic failure from the x86 host.

Keywords: Host; Arm

Fixed in version:


Description: When emulated PCIe switch is enabled, and more than 8 PFs are enabled, the BIOS boot process might halt.

Keywords: Emulated PCIe switch

Fixed in version:


Description: During boot, the system enters systemctl emergency mode due to a corrupt root file system.

Keywords: Boot

Fixed in version:


Description: With the OCP card connecting to multiple hosts, one of the hosts could have the RShim PF exposed and probed by the RShim driver.

Keywords: RShim; multi-host

Fixed in version:


Description: When moving to separate mode on the DPU, the OVS bridge remains and no ping is transmitted between the Arm cores and the remote server.

Keywords: SmartNIC; operation modes

Fixed in version:


Description: Pushing the BFB image v3.5 with a WinOF-2 version older than 2.60 can cause a crash on the host side.

Keywords: Windows; RShim

Fixed in version:

© Copyright 2024, NVIDIA. Last updated on Jul 4, 2024.