Known Issues
The following are known limitations of this NVMe/virtio-blk SNAP software version.
| Ref # | Issue | 
| – | Description: NVMeTCP XLIO is currently not supported when running 64K page size kernels on the DPU Arm cores (as is the case for CentOS 8.x, Rocky 8.x, or openEuler 20.x). | 
| Workaround: N/A | |
| Keywords: 64K page size; NVMeTCP XLIO | |
| Discovered in version: 3.6.0 | |
| – | Description: When running with virtio-blk and virtio-net protocols in parallel, performance may be negatively impacted. | 
| Workaround: N/A | |
| Keywords: Performance | |
| Discovered in version: 3.7.2 | |
| 2957317 | Description: Due to an upstream kernel bug that exists in some Linux kernel distributions, the command emulation_device_detach times out, which causes any inflight traffic to hang. | 
| Workaround: It is recommended to stop all inflight traffic on the device before performing a hot-unplug. | |
| Keywords: PCIe Hotplug | |
| Discovered in version: 3.6.0 | |
| 3046440 | Description: NVMe full-offload mode does not work properly over the first generation of BlueField SoCs. | 
| Workaround: N/A | |
| Keywords: NVMe full-offload mode | |
| Discovered in version: 3.6.0 | |
| 2879262 | Description: Due to a kernel bug that exists in some Linux kernel distributions, configuring a large number of virtio queues along with a small number of MSIX may cause a kernel soft lockup (in addition to significant performance degradation). | 
| Workaround: It is recommended to keep the virtio-blk controller's --num_queues value (set in snap_rpc.py controller_virtio_blk_create) smaller than the value of VIRTIO_BLK_EMULATION_NUM_MSIX (which is configured through mlxconfig). | |
| Keywords: Virtio-blk; kernel hang | |
| Discovered in version: 3.6.0 | |
| – | Description: SPDK multipath is supported only with NVMe over RDMA (and not with NVMe over TCP). | 
| Workaround: N/A | |
| Keywords: SPDK; NVMe | |
| Discovered in version: 3.6.0 | |
| 3055119 | Description: Windows driver does not work with Virtio-blk SNAP-Direct feature. | 
| Workaround: To disable the feature when working with Windows OS, set VIRTIO_BLK_SNAP_ZCOPY=0 in /etc/default/mlnx_snap. | |
| Keywords: Windows | |
| Discovered in version: 3.5.0 | |
| – | Description: NVMe multipath features are not available when SNAP is used in full-offload mode configuration. | 
| Workaround: N/A | |
| Keywords: NVMe full-offload mode; multipath | |
| Discovered in version: 3.4.0 | |
| – | Description: After each PCIe device hot-plug, a matching controller must be immediately opened. Specifically, hot-unplugging the device before a controller is created may cause the host kernel driver to malfunction on some Linux distributions. | 
| Workaround: N/A | |
| Keywords: Hot-plug; controller | |
| Discovered in version: 3.3.0 | |
| – | Description: SR-IOV on hot-plugged PFs is not supported. | 
| Workaround: N/A | |
| Keywords: PCIe Hotplug | |
| Discovered in version: 3.2.0 | |
| – | Description: Any PCIe emulated device exposed to the host must have a matching controller opened on it in mlnx_snap service prior to loading its kernel driver. This includes virtio-net devices too. | 
| Workaround: N/A | |
| Keywords: VF; PF; virtio-net; kernel driver | |
| Discovered in version: 3.1.0 | |
| – | Description: It is not possible to attach block devices using the same NSID to different NVMe controllers which are linked to the same NVMe subsystem. For example, the following commands will result in an error because both controllers are attached with NSID 1: 
            
            
 | 
| Workaround: N/A | |
| Keywords: Block device; controller | |
| Discovered in version: 3.0.0 | |
| – | Description: mlnx_snap NVMe controller supports an admin queue with a maximum size of 1024 towards the host. | 
| Workaround: N/A | |
| Keywords: Admin queue; controller | |
| Discovered in version: 3.0.0 | |
| – | Description: The DPU expansion ROM includes NVMe and virtio-blk UEFI drivers certified by NVIDIA, which should be used by the BIOS. Any other BIOS drivers are not guaranteed to work properly. | 
| Workaround: N/A | |
| Keywords: BIOS; certified drivers | |
| Discovered in version: 3.0.0 | |
| – | Description: Legacy interrupts are not supported. | 
| Workaround: N/A | |
| Keywords: Block device; controller | |
| Discovered in version: 3.0.0 | 
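The NSID limitation listed above (the same NSID cannot be attached to different NVMe controllers linked to the same NVMe subsystem) can be checked before any attach commands are issued. The following is a minimal sketch, not part of SNAP: the function `find_nsid_conflicts` and the subsystem and controller names are hypothetical, and the check only models the rule as stated in the table.

```python
from collections import defaultdict


def find_nsid_conflicts(attachments):
    """Find NSIDs planned for more than one controller of the same subsystem.

    `attachments` is an iterable of (subsystem, controller, nsid) tuples.
    Returns a dict mapping (subsystem, nsid) to the sorted controller names
    that share that NSID. An empty dict means no conflict was found.
    """
    seen = defaultdict(set)
    for subsystem, controller, nsid in attachments:
        seen[(subsystem, nsid)].add(controller)
    return {key: sorted(ctrls) for key, ctrls in seen.items() if len(ctrls) > 1}


# Hypothetical plan: names are placeholders, not real SNAP identifiers.
planned = [
    ("subsys0", "ctrl0", 1),
    ("subsys0", "ctrl1", 1),  # same subsystem and NSID, different controller
    ("subsys1", "ctrl2", 1),  # different subsystem: not covered by this limitation
]

print(find_nsid_conflicts(planned))
```

Running the check on a planned layout before attaching namespaces makes the conflict visible up front instead of as an error from the second attach command.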
The following are not BlueField SNAP limitations.
| Ref # | Issue | 
| 3543249 | Description: When using hotplugged PCIe devices, after all devices are plugged, the host must be rebooted for Windows to detect all devices. | 
| Workaround: N/A | |
| Keywords: Hotplug | |
| Discovered in version: 3.7.4 | |
| 3521378 | Description: For a successful emulation_device_detach RPC command, it is recommended to use directio=1 (O_DIRECT) with a virtio-blk controller created on hot-plugged emulation. Note: If directio=0 is used, the I/O must be stopped manually; otherwise, emulation_device_detach may fail. 
 | 
| Workaround: N/A | |
| Keywords: Virtio-blk; RPC | |
| Discovered in version: 3.7.4 | |
| 2957317 | Description: Setting virtio-blk emulation on bare metal results in a server crash. | 
| Workaround: Set the seg_max flag of the virtio-blk controller to at least 16 (default is 1) using the following RPC: 
            
            
 | |
| Keywords: Virtio-blk; bare-metal; seg_max | |
| Discovered in version: 3.7.2 | |
| 3056533 | Description: When using NVMe driver in Windows, if I/O is not completed for more than 120 seconds, Windows starts ignoring the NVMe device and its disks disappear. | 
| Workaround: N/A | |
| Keywords: NVMe device disappears | |
| Discovered in version: 3.6.1 | |
| N/A | Description: There is a known Windows driver issue in which the driver may crash when multiple namespaces are attached simultaneously. Users must attach namespaces one by one and verify that each namespace is discovered by the OS before attaching the next one. | 
| Workaround: N/A | |
| Keywords: Attaching multiple namespaces simultaneously | |
| Discovered in version: 3.4.0 | |
| N/A | Description: There is a known Windows NVMe driver bug which causes Windows initiators to crash if the NVMe driver is started and no target is up and ready. Therefore, if users work with Windows OS on top of the emulated NVMe device, they must make sure that mlnx_snap NVMe controller is connected to the remote target before running the driver on the host side. | 
| Workaround: N/A | |
| Keywords: Windows initiators crash | |
| Discovered in version: 3.1.0 | |
| N/A | Description: There is a known Windows driver bug due to which namespace hotplug is not supported. On newer Windows builds, NVMe controller quirks must be set to 0x5. For more information, please see section "Controller Parameters". | 
| Workaround: N/A | |
| Keywords: Namespaces hotplug | |
| Discovered in version: 3.1.0 | 
| Ref # | Issue | 
| 3066750 | Description: Driver does not support PCIe function level reset (FLR). Running FLR during IO causes the IO (and kernel) to hang. | 
| Workaround: N/A | |
| Keywords: PCIe function; hang | |
| Discovered in version: 3.6.1 | |
| 2879262 | Description: When working with a large number of virtqueues (≥ 64) over a single MSIX, the host kernel might experience soft lockup. Specifically, setting --num_queues to a high number, which is also higher than the configured --num_msix value, might cause this issue. | 
| Workaround: | |
| Keywords: Kernel; hang; virtqueues | |
| Discovered in version: 3.6.1 | |
| 2957317 | Description: In Linux kernel version 5.4.0-91-generic and above, the command emulation_device_detach times out if I/O traffic is running. | 
| Workaround: N/A | |
| Keywords: Command time out | |
| Discovered in version: 3.6.1 | 
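The soft-lockup issue 2879262 above depends on the relation between the number of virtio queues and the number of MSIX. The following is a minimal sketch of a pre-flight check, assuming the stricter recommendation from the limitations list (keep `--num_queues` smaller than the configured MSIX value). The function name `queues_within_msix` is hypothetical and is not part of snap_rpc.py or mlxconfig.

```python
def queues_within_msix(num_queues: int, num_msix: int) -> bool:
    """Return True when num_queues is strictly smaller than num_msix.

    Assumption: equality is treated as not recommended, following the
    "smaller than" wording of the workaround for issue 2879262.
    """
    if num_queues < 1 or num_msix < 1:
        raise ValueError("num_queues and num_msix must be positive integers")
    return num_queues < num_msix


# Example values are illustrative only.
for queues, msix in [(32, 64), (64, 64), (128, 2)]:
    status = "ok" if queues_within_msix(queues, msix) else "not recommended"
    print(f"num_queues={queues} num_msix={msix}: {status}")
```

A check of this kind could run in a provisioning script before the controller is created, so that a risky queue count is rejected before it reaches the host kernel.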
| Ref # | Issue | 
| 3231721 | Description: When using the emulation_device_attach RPC to hot-plug a virtio-blk transitional device, the capacity and block size attributes must be provided for that device. | 
| Workaround: Use the --bdev_type spdk and --bdev spdk_bdev options to provide a bdev to the hot-plugged virtio-blk transitional device when using emulation_device_attach RPC. | |
| Keywords: Hot plugging virtio-blk transitional device | |
| Discovered in version: 3.7.0 | |
| – | Description: Legacy/transitional drivers do not require syncing with the device upon driver initialization. Therefore, it is highly recommended that the SNAP controller is opened on the PCIe function before the driver becomes operational. If the driver becomes operational before the controller, controller configuration options would be very limited. | 
| Workaround: N/A | |
| Keywords: Legacy; SNAP controller; SNAP driver | |
| Discovered in version: 3.7.0 | |
| – | Description: Legacy/transitional device support naturally includes backends with 512B block size. Using backends with any other block size (e.g., 4K) can only be achieved when the SNAP controller is opened before the driver is activated. | 
| Workaround: N/A | |
| Keywords: Legacy; backend block size | |
| Discovered in version: 3.7.0 |