NVIDIA BlueField-2 SNAP for NVMe and Virtio-blk v3.8.0-5

Appendix – Frequently Asked Questions

Please refer to chapter "mlnx_snap Installation".

Assumptions:

  • The remote target is configured with NQN "Test" and a single namespace, which it exposes through the two RDMA interfaces 1.1.1.1/24 and 2.2.2.1/24 (a target-side sketch follows these assumptions)

  • The local RDMA interfaces on the DPU are 1.1.1.2/24 and 2.2.2.2/24
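
For reference, a remote target matching these assumptions could be set up with SPDK's standard NVMe-oF target RPCs, as in the following sketch. The bdev name, serial number, and "nqn."-formatted subsystem name are illustrative assumptions; "Test" above stands for whatever subsystem NQN the target actually exposes, and the same string must be passed to -n when attaching from the DPU.

# Sketch of an SPDK NVMe-oF target exposing one namespace over both RDMA interfaces
spdk_rpc.py nvmf_create_transport -t RDMA
spdk_rpc.py bdev_malloc_create -b Malloc0 1024 512    # illustrative 1 GiB RAM-backed bdev
spdk_rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:Test -a -s SPDK00000000000001
spdk_rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:Test Malloc0
spdk_rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:Test -t rdma -f ipv4 -a 1.1.1.1 -s 4420
spdk_rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:Test -t rdma -f ipv4 -a 2.2.2.1 -s 4420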

Non-offload mode configuration:

  1. Create the SPDK block devices (bdevs). Run:

    spdk_rpc.py bdev_nvme_attach_controller -b Nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n Test
    spdk_rpc.py bdev_nvme_attach_controller -b Nvme1 -t rdma -a 2.2.2.1 -f ipv4 -s 4420 -n Test

  2. Create NVMe controller. Run:

    snap_rpc.py controller_nvme_create mlx5_0 --subsys_id 0 -c /etc/mlnx_snap/mlnx_snap.json --rdma_device mlx5_2

  3. Attach the namespace twice, once through each port. Run:

    snap_rpc.py controller_nvme_namespace_attach -c NvmeEmu0pf0 spdk Nvme0n1 1
    snap_rpc.py controller_nvme_namespace_attach -c NvmeEmu0pf0 spdk Nvme1n1 2

At this stage, you should see /dev/nvme0n1 and /dev/nvme0n2 in the host's "nvme list" output, both mapped to the same remote disk via two different ports.
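
To confirm on the host that the two block devices are indeed two paths to the same remote namespace, their namespace identifiers can be compared with nvme-cli (a sketch, assuming nvme-cli is installed and the target reports NGUID/EUI-64 identifiers):

# Identifiers reported through the first path
nvme id-ns /dev/nvme0n1 | grep -iE 'nguid|eui64'
# Should report the same values as the first path
nvme id-ns /dev/nvme0n2 | grep -iE 'nguid|eui64'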

Full-offload mode configuration:

Note

Full-offload mode currently allows users to connect to multiple remote targets in parallel (but not to the same remote target through different paths).

  1. Create 2 separate JSON full-offload configuration files (see section "Full Offload Mode"). Each describes a connection to a remote target via a different RDMA interface.

  2. Configure 2 separate NVMe device entries to be exposed to the host, either as hot-plugged PCIe functions (see section "Runtime Configuration") or as “static” ones (see section "Firmware Configuration"). A hot-plug sketch follows this procedure.

  3. Create 2 NVMe controllers, one per RDMA interface. Run:

    snap_rpc.py subsystem_nvme_create Mellanox_NVMe_SNAP "Mellanox NVMe SNAP Controller"
    snap_rpc.py controller_nvme_create mlx5_0 --subsys_id 0 --pf_id 0 -c /etc/mlnx_snap/mlnx_snap_p0.json --rdma_device mlx5_2
    snap_rpc.py subsystem_nvme_create Mellanox_NVMe_SNAP "Mellanox NVMe SNAP Controller"
    snap_rpc.py controller_nvme_create mlx5_0 --subsys_id 1 --pf_id 1 -c /etc/mlnx_snap/mlnx_snap_p1.json --rdma_device mlx5_3

    Note

    NVMe controllers may also share the same NVMe subsystem. In this case, users must make sure all namespaces in all remote targets have distinct NSIDs.

At this stage, you should see /dev/nvme0n1 and /dev/nvme1n1 in the host's "nvme list" output.
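
For step 2, hot-plugging the two NVMe PCIe functions could look like the sketch below. The "nvme" device type for emulation_device_attach mirrors the virtio_blk usage shown later in this FAQ and is an assumption here; see section "Runtime Configuration" for the authoritative parameters.

# Hot-plug the first NVMe PCIe function on the emulation manager
snap_rpc.py emulation_device_attach mlx5_0 nvme
# Hot-plug the second NVMe PCIe function
snap_rpc.py emulation_device_attach mlx5_0 nvme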

Please refer to section "Full Offload Configuration".

For more information on full offload, please refer to section "Full Offload Mode".

MLNX SNAP is natively compiled against NVIDIA's internal branch of SPDK. It is possible to work with different SPDK versions, under the following conditions:

  • mlnx-snap sources must be recompiled against the new SPDK sources

  • The changes in the new SPDK version do not break any external SPDK APIs

Integration process:

  1. Build SPDK (and DPDK) with shared libraries.

    [spdk.git] ./configure --prefix=/opt/mellanox/spdk-custom --disable-tests --disable-unit-tests --without-crypto --without-fio --with-vhost --without-pmdk --without-rbd --with-rdma --with-shared --with-iscsi-initiator --without-vtune --without-isal
    [spdk.git] make && sudo make install
    [spdk.git] cp -r dpdk/build/lib/* /opt/mellanox/spdk-custom/lib/
    [spdk.git] cp -r dpdk/build/include/* /opt/mellanox/spdk-custom/include/

    Note

    It is also possible to install DPDK into that directory, but copying the files suffices.

    Note

    Only the --with-shared flag is mandatory.

  2. Build SNAP against the new SPDK.

    [mlnx-snap.src] ./configure --with-snap --with-spdk=/opt/mellanox/spdk-custom --without-gtest --prefix=/usr
    [mlnx-snap.src] make -j8 && sudo make install

  3. Append additional custom libraries to the mlnx-snap application. Set LD_PRELOAD="/opt/mellanox/spdk/lib/libspdk_custom_library.so".

    Note

    Additional SPDK/DPDK libraries required by libspdk_custom_library.so might also need to be added to LD_PRELOAD.

    Note

    The LD_PRELOAD setting can be added to /etc/default/mlnx_snap for persistent work with the mlnx_snap system service (see the sketch after this procedure).

  4. Run the application.
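
For example, making the LD_PRELOAD setting from step 3 persistent for the mlnx_snap system service could look like the sketch below (assuming the service is managed by systemd under the name mlnx_snap):

# Persist the preload for the mlnx_snap service (library path as in step 3)
echo 'LD_PRELOAD="/opt/mellanox/spdk/lib/libspdk_custom_library.so"' >> /etc/default/mlnx_snap
# Restart the service so the new environment takes effect
systemctl restart mlnx_snap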

The NVMe protocol has built-in support for attaching and detaching backends (namespaces) at runtime.

To change backend storage during runtime for NVMe, run:

snap_rpc.py controller_nvme_namespace_detach -c NvmeEmu0pf0 1
snap_rpc.py controller_nvme_namespace_attach -c NvmeEmu0pf0 spdk nvme0n1 1

VirtIO-blk does not have similar support in its protocol specification. Therefore, detaching the backend while I/O is running results in errors for any I/O received between the detach request and the subsequent attach.

To change backend storage at runtime for virtio-blk, run:

snap_rpc.py controller_virtio_blk_bdev_detach VblkEmu0pf0
snap_rpc.py controller_virtio_blk_bdev_attach VblkEmu0pf0 spdk nvme0n1

With the option to work with a large number of controllers, resource usage must be taken into account. Special attention must be paid to the MSIX resource, which is limited to ~1K across the whole BlueField-2 card (for example, at 63 MSIX per function, only around 16 such functions fit within that budget). Therefore, new PCIe functions are now opened with limited resources by default (specifically, MSIX is set to 2).

Users may choose to assign more resources to a specific function, as detailed in the following:

  1. Increase the maximum number of MSIX vectors allowed to be assigned to a function (a power cycle may be required for changes to take effect; a verification sketch follows this procedure):

    [dpu] mlxconfig -d /dev/mst/mt41686_pciconf0 s VIRTIO_BLK_EMULATION_NUM_MSIX=63

  2. Hot-plug the virtio-blk PF with the increased MSIX value.

    [dpu] snap_rpc.py emulation_device_attach mlx5_0 virtio_blk --num_msix=63

  3. Open the controller with an increased number of queues (one queue per MSIX vector, leaving one free MSIX vector for configuration interrupts):

    [dpu] snap_rpc.py controller_virtio_blk_create mlx5_0 --pf_id 0 --bdev_type spdk --bdev Null0 --num_queues=62
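
To verify the firmware setting from step 1 before and after the power cycle, the configured value can be queried with mlxconfig (same MST device path as above):

    # Query the VIRTIO_BLK_EMULATION_NUM_MSIX setting
    [dpu] mlxconfig -d /dev/mst/mt41686_pciconf0 q | grep VIRTIO_BLK_EMULATION_NUM_MSIX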

For more information, please refer to section "Performance Optimization".

© Copyright 2024, NVIDIA. Last updated on Aug 14, 2024.