The remote procedure call (RPC) protocol defines a small set of data types and commands. Like other standard SPDK applications, NVMe/virtio-blk SNAP supports JSON-based RPC commands that create, delete, query, or modify resources from the CLI.

SNAP 4.x supports all standard SPDK RPC commands in addition to an extended SNAP-specific command set. SPDK standard commands are executed by the standard spdk_rpc.py tool while the SNAP-specific command set extension is executed by an equivalent snap_rpc.py tool.

Full spdk_rpc.py command set documentation can be found in the SPDK official documentation site.

Full snap_rpc.py extended commands are detailed further down in this chapter.

Using JSON-based RPC Protocol

The JSON-based RPC protocol can be used through the snap_rpc.py script, which resides inside the SNAP container and is invoked with the crictl tool:

  1. Query the active container ID using:

    crictl ps

    Or using:

    crictl ps -s running -q --name snap
  2. To post RPCs to the container using crictl:

    crictl exec -it <container-id> snap_rpc.py <RPC-method>

    For example:

    crictl exec -it 0379ac2c4f34c snap_rpc.py emulation_function_list
  3. Alternatively, if only one SNAP container is running, aliases can be defined:

    alias snap_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} snap_rpc.py " 
    alias spdk_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} spdk_rpc.py " 
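
As a rough illustration of what such a call carries, the sketch below builds a JSON-RPC 2.0 request body in the same shape as the example requests shown later in this chapter. The helper name build_rpc_request is hypothetical and not part of snap_rpc.py, and the transport used to deliver the request to the SNAP process is not shown.

```python
import json


def build_rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body (hypothetical helper, not part of snap_rpc.py)."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params:
        # Leave out "params" entirely when the method takes no arguments.
        request["params"] = params
    return json.dumps(request)


# A request for a method without arguments.
list_request = build_rpc_request("emulation_function_list")
```

In practice the snap_rpc.py script assembles and sends the request itself, so a helper like this is only useful for understanding or logging what goes over the wire.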

PCIe Function Management

Emulated PCIe functions are managed through IB devices called emulation managers. Emulation managers are ordinary IB devices with special privileges to control PCIe communication and device emulations towards the host OS.

SNAP 4.x queries an emulation manager that supports the requested set of capabilities.

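The emulation manager holds a list of the emulated PCIe functions it controls. A PCIe function can later be referenced in 3 ways: by its vuid (recommended, as it is guaranteed to remain constant), by its vhca_id, or by its function index (pf_id, vf_id).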

emulation_function_list

emulation_function_list lists all existing functions. 

The following is an example response for the emulation_function_list command:

[
  {
    "hotplugged": false,
    "emulation_type": "VBLK",
    "pf_index": 0,
    "pci_bdf": "27:00.4",
    "vhca_id": 4,
    "vuid": "MT2142X08235VBLKS0D0F4"
  }
]

Use -a or --all to show all inactive VF functions.
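
Since later commands identify a function by its vuid, vhca_id, or index, it can help to look these up from the list response. The following is a minimal sketch that parses the example response above; the helper name find_function is hypothetical.

```python
import json

# Example response copied from the text above.
RESPONSE = """
[
  {
    "hotplugged": false,
    "emulation_type": "VBLK",
    "pf_index": 0,
    "pci_bdf": "27:00.4",
    "vhca_id": 4,
    "vuid": "MT2142X08235VBLKS0D0F4"
  }
]
"""


def find_function(functions, **criteria):
    """Return the first function whose fields match all criteria, or None."""
    for func in functions:
        if all(func.get(key) == value for key, value in criteria.items()):
            return func
    return None


functions = json.loads(RESPONSE)
match = find_function(functions, pci_bdf="27:00.4")
vuid = match["vuid"] if match else None
```

The same helper can filter on any field of the response, for example emulation_type or hotplugged.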

SNAP supports 2 types of PCIe functions:

  • Static functions – PCIe functions configured at the firmware configuration stage (physical and virtual). Refer to appendix "DPU Firmware Configuration" for additional information.
  • Hotplugged functions – PCIe functions configured dynamically at runtime. Users can add detachable functions. Refer to section "Hotplugged PCIe Functions Management" for additional information.

Hotplugged PCIe Functions Management

Hotplug PCIe functions are configured dynamically at runtime using RPCs. 

The following commands hot plug a new PCIe function to the system:

Command                             Description
virtio_blk_emulation_device_attach  Attach virtio-blk emulation function
nvme_emulation_device_attach        Attach NVMe emulation function

Hotplug PFs do not support SR-IOV.

virtio_blk_emulation_device_attach

Attach virtio-blk emulation function.

Command parameters:

Parameter            Mandatory?  Type     Description
id                   No          Number   Device ID
vid                  No          Number   Vendor ID
ssid                 No          Number   Subsystem device ID
ssvid                No          Number   Subsystem vendor ID
revid                No          Number   Revision ID
class_code           No          Number   Class code
num_msix             No          Number   MSI-X vector size
total_vf             No          Number   Maximal number of VFs allowed
bdev                 No          String   Block device to use as backend
num_queues           No          Number   Number of IO queues (default 1, range 1-62).
                                          The actual number of queues is limited by the number of queues supported by firmware.
                                          If a driver using MSIX interrupts is used, the number of MSIX must be greater than the number of IO queues (1 is used for the config interrupt).
queue_depth          No          Number   Queue depth.
                                          It is only possible to modify the queue depth if the driver is not loaded.
transitional_device  No          Boolean  Transitional device support. See section "VirtIO-blk Transitional Device Support" for more details.
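
The relationship between num_queues and num_msix described in the table can be checked before issuing the RPC. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, and the additional limit imposed by firmware is not modeled because it is not specified here.

```python
VBLK_MAX_QUEUES = 62  # upper bound of the documented range (1-62)


def check_vblk_queue_config(num_queues=1, num_msix=None):
    """Return a list of problems found in a virtio-blk queue/MSI-X configuration."""
    problems = []
    if not 1 <= num_queues <= VBLK_MAX_QUEUES:
        problems.append(f"num_queues={num_queues} is outside 1-{VBLK_MAX_QUEUES}")
    # One MSI-X vector is used for the config interrupt, so num_msix must
    # be strictly greater than the number of IO queues.
    if num_msix is not None and num_msix <= num_queues:
        problems.append(
            f"num_msix={num_msix} must be greater than num_queues={num_queues}"
        )
    return problems
```

For example, 4 IO queues need at least 5 MSI-X vectors, so check_vblk_queue_config(4, 4) reports a problem while check_vblk_queue_config(4, 5) does not.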

nvme_emulation_device_attach

Attach NVMe emulation function.

Command parameters:

Parameter   Mandatory?  Type    Description
id          No          Number  Device ID
vid         No          Number  Vendor ID
ssid        No          Number  Subsystem device ID
ssvid       No          Number  Subsystem vendor ID
revid       No          Number  Revision ID
class_code  No          Number  Class code
num_msix    No          Number  MSI-X vector size
total_vf    No          Number  Maximal number of VFs allowed
num_queues  No          Number  Number of IO queues (default 31, range 1-31).
                                The actual number of queues is limited by the number of queues supported by firmware.
                                If a driver using MSIX interrupts is used, the number of MSIX must be greater than the number of IO queues (1 is used for the admin queue).
version     No          String  Specification version (currently only 1.4 is supported)

Hot Unplug

The following commands hot-unplug a PCIe function from the system in 2 steps:


Step  Command                          Description
1     emulation_device_detach_prepare  Prepare emulation function to be detached
2     emulation_device_detach          Detach emulation function

emulation_device_detach_prepare

This is the first step for detaching an emulation device. It prepares the system to detach a hotplugged emulation function. On success, the host's hotplug device state is POWER_OFF and you may safely proceed to emulation_device_detach.

A controller must be attached to the emulation function before calling this command.

Command parameters:

Parameter  Mandatory?  Type    Description
pci_bdf    No          String  PCIe device BDF
vhca_id    No          Number  VHCA ID of PCIe function
vuid       No          String  PCIe device VUID

At least one identifier must be provided to describe the PCIe function to be detached.

emulation_device_detach

This is the second step which completes detaching of the hotplugged emulation function. If the detach preparation times out, you may perform a surprise unplug using --force with the command. 

The driver must be unprobed, otherwise errors may occur.

Command parameters:

Parameter  Mandatory?  Type     Description
pci_bdf    No          String   PCIe device BDF
vhca_id    No          Number   VHCA ID of PCIe function
vuid       No          String   PCIe device VUID
force      No          Boolean  Detach with failed preparation

At least one identifier must be provided to describe the PCIe function to be detached.
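
The "at least one identifier" rule for both detach commands can be sketched as a small validation step. The helper below is hypothetical, and using the table's parameter names as JSON keys is an assumption. Note that a vhca_id of 0 is treated as a valid identifier.

```python
def detach_params(pci_bdf=None, vhca_id=None, vuid=None, force=False):
    """Build the parameter dict for a detach RPC (hypothetical helper).

    Raises ValueError when no identifier is given, mirroring the rule that
    at least one of pci_bdf, vhca_id, or vuid must describe the function.
    """
    candidates = (("pci_bdf", pci_bdf), ("vhca_id", vhca_id), ("vuid", vuid))
    # Compare against None so that a numeric identifier of 0 is kept.
    params = {name: value for name, value in candidates if value is not None}
    if not params:
        raise ValueError("at least one of pci_bdf, vhca_id, or vuid is required")
    if force:
        # Only meaningful for emulation_device_detach (surprise unplug).
        params["force"] = True
    return params
```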

Virtio-blk Hot Plug/Unplug Example

snap_rpc.py virtio_blk_emulation_device_attach
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1
snap_rpc.py emulation_device_detach_prepare --vuid MT2114X12200VBLKS1D0F0
snap_rpc.py emulation_device_detach --vuid MT2114X12200VBLKS1D0F0

Notes

  • Once a PCIe function is unplugged from the host system, its controller is implicitly deleted as well.
  • After a new PCIe function is plugged, it is shown on the host's PCIe devices list until it is either explicitly unplugged or the system goes through a cold reboot. A hot-plugged PCIe function remains persistent even after SNAP process termination.
  • Some OSs automatically start to communicate with the new function after it is plugged and some continue to communicate with the function (for a certain time) even after it is signaled to be unplugged. Therefore, users must always keep an open controller (of a matching type) over any existing configured PCIe function.
  • Hotplug PFs do not support SR-IOV.

SPDK Bdev Management

SNAP 4.x uses SPDK block device framework as a backend for its NVMe namespaces/VBLK controllers. Therefore, the SPDK bdev should be configured in advance.

For more information about SPDK block devices, see SPDK bdev documentation and Appendix SPDK Configuration.

SNAP 4.x holds additional instances of bdevs, SNAP bdevs, which are managed using RPCs. After creating an SPDK block device, expose the bdevs to SNAP using the SNAP bdevs' management RPCs.

The order in which SNAP should be configured is as follows:

  1. Create SPDK bdev.
  2. Create SNAP bdev.
  3. Create SNAP controllers.

Bdev Management Commands

spdk_bdev_create

Parameter  Mandatory?  Type    Description
bdev       Yes         String  Block device name

spdk_bdev_destroy

Parameter  Mandatory?  Type    Description
bdev       Yes         String  Block device name

bdev_list

Example response:

[
  {
    "name": "nvme0n1",
    "block_size": 512,
    "block_count": 131072,
    "uuid": "dfe468c8-c15d-4ea9-93d3-6b8ef8ed6b36",
    "transport": "rdma_zc"
  }
]
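
The capacity of a listed bdev follows from its block size and block count. The short sketch below works through the arithmetic for the example entry, assuming block_size is expressed in bytes (the response itself does not state the unit).

```python
def bdev_size_bytes(bdev):
    """Total capacity of a bdev entry, assuming block_size is in bytes."""
    return bdev["block_size"] * bdev["block_count"]


# Fields copied from the example response above.
bdev = {"name": "nvme0n1", "block_size": 512, "block_count": 131072}

size_bytes = bdev_size_bytes(bdev)      # 512 * 131072 = 67108864
size_mib = size_bytes / (1024 * 1024)   # 64.0
```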

Notes

  • If spdk_bdev_destroy is called on a bdev that is still attached (i.e., in use), the RPC fails.

  • SNAP 4.x supports bdev remove and resize events:
    • In case of a bdev remove event, SNAP 4.x detaches the bdev from the attached NVMe namespaces/VBLK controllers and deletes the SNAP 4.x bdev
    • In case of a bdev resize event, SNAP 4.x updates the new size of the SNAP 4.x bdevs

Virtio-blk Emulation Management

Virtio-blk emulation is a storage protocol belonging to the virtio family of devices. These devices are found in virtual environments, yet by design they look like physical devices to the user within the virtual machine.

Each virtio-blk device (e.g., virtio-blk PCIe entry) exposed to the host, whether it is PF or VF, must be backed by a virtio-blk controller.

Virtio-blk limitations:

  • Probing a virtio-blk driver on the host without an already functioning virtio-blk controller may cause the host to hang until such controller is opened successfully (no timeout mechanism exists).
  • Upon creation of a virtio-blk controller, a backend device must already exist.

Virtio-Blk Emulation Management Commands

Command                                 Description
virtio_blk_controller_create            Create new virtio-blk SNAP controller
virtio_blk_controller_destroy           Destroy virtio-blk SNAP controller
virtio_blk_controller_suspend           Suspend virtio-blk SNAP controller
virtio_blk_controller_resume            Resume virtio-blk SNAP controller
virtio_blk_controller_bdev_attach       Attach bdev to virtio-blk SNAP controller
virtio_blk_controller_bdev_detach       Detach bdev from virtio-blk SNAP controller
virtio_blk_controller_list              Virtio-blk SNAP controller list
virtio_blk_controller_dbg_stats_get     TBD
virtio_blk_controller_state_save        Save state of the suspended virtio-blk SNAP controller
virtio_blk_controller_state_restore     Restore state of the suspended virtio-blk SNAP controller
virtio_blk_controller_vfs_msix_reclaim  Reclaim virtio-blk SNAP controller VFs MSIX for the free MSIX pool. Valid only for PFs.

virtio_blk_controller_create

Create a new SNAP-based virtio-blk controller over a specific PCIe function on the host. The PCIe function on which to open the controller must be specified using one of the following identifiers, as described in section "PCIe Function Management":

  1. vuid (recommended as it is guaranteed to remain constant).
  2. vhca_id.
  3. Function index – pf_id, vf_id.

The mapping for pci_index can be queried by running emulation_function_list.

Command parameters:

Parameter       Mandatory?  Type    Description
vuid            No          String  PCIe device VUID
vhca_id         No          Number  VHCA ID of PCIe function
pf_id           No          Number  PCIe PF index to start emulation on
vf_id           No          Number  PCIe VF index to start emulation on (if the controller is meant to be opened on a VF)
pci_bdf         No          String  PCIe device BDF
ctrl            No          String  Controller ID
num_queues      No          Number  Number of IO queues (default 1, range 1-62).
                                    The actual number of queues is limited by the number of queues supported by firmware.
                                    If a driver using MSIX interrupts is used, the number of MSIX must be greater than the number of IO queues (1 is used for the config interrupt).
queue_size      No          Number  Queue depth (default 256, range 1-256)
size_max        No          Number  Maximal SGE data transfer size (default 4096, range 1–MAX_UINT16)
seg_max         No          Number  Maximal SGE list length (default 1, range 1-queue_depth)
bdev            No          String  SNAP SPDK block device to use as backend
vblk_id         No          String  Serial number for the controller
admin_q         No          0/1     Enables live migration and NVIDIA vDPA
dynamic_msix    No          0/1     Dynamic MSIX for SR-IOV VFs on this PF. Only valid for PFs.
vf_num_msix     No          Number  Number of MSIX for this VF. Root PF must have dynamic MSIX configured.
force_in_order  No          0/1     Support virtio-blk crash recovery. Setting this parameter to 1 may impact virtio-blk performance (default is 0).

Example response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": "VblkCtrl1"
}
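
The result field of this response is the controller name that later commands expect as their ctrl parameter. A minimal sketch of extracting it is shown below; the helper name is hypothetical, and the error check relies only on the standard JSON-RPC 2.0 convention that a failed call carries an "error" member instead of "result".

```python
import json


def controller_name(response_text):
    """Extract the controller name from a virtio_blk_controller_create reply."""
    reply = json.loads(response_text)
    if "error" in reply:
        # JSON-RPC 2.0 replies carry either "result" or "error".
        raise RuntimeError(f"RPC failed: {reply['error']}")
    return reply["result"]


# Example response copied from the text above.
EXAMPLE = '{"jsonrpc": "2.0", "id": 1, "result": "VblkCtrl1"}'
name = controller_name(EXAMPLE)
```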

virtio_blk_controller_destroy

Destroy a previously created virtio-blk controller. The controller can be uniquely identified by the controller name as acquired from virtio_blk_controller_create().

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

virtio_blk_controller_suspend

While suspended, the controller stops receiving new requests from the host driver and only finishes handling of requests already in flight. All suspended requests (if any) are processed after resume.

The controller can be suspended only if the host driver is up.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

virtio_blk_controller_resume

Resume a previously suspended controller. While suspended, the controller does not receive new requests from the host driver and only finishes handling requests already in flight; the resume command restarts the controller's handling of IOs.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

virtio_blk_controller_bdev_attach

Attach the specified bdev to the virtio-blk SNAP controller.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
bdev       Yes         String  Block device name

virtio_blk_controller_bdev_detach

The bdev of a virtio-blk controller can be replaced. First, detach the current bdev from the controller. Once the bdev is detached, the controller stops receiving new requests from the host driver (i.e., it is suspended) and only finishes handling requests already in flight.

At this point, you may attach a new bdev or destroy the controller.

When a new bdev is attached, the controller resumes handling all outstanding I/Os.

The block size cannot be changed if the driver is loaded.

bdev may be replaced with a different block size if the driver is not loaded.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

virtio_blk_controller_list

List virtio-blk SNAP controllers.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       No          String  Controller name

Example response:

[
  {
    "ctrl id": "VblkCtrl1",
    "vuid": "MT2114X12200VBLKS1D0F4",
    "vhca id": 4,
    "num queues": 1,
    "queue size": 256,
    "bdev": "Null0"
  }
]

virtio_blk_controller_dbg_stats_get

Debug counters are per-controller I/O statistics that show how I/O is distributed between the controller's queues, as well as the total I/O received on the controller.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

Example response:

{
  "ctrl_id": "VblkCtrl2",
  "queues": [
    {
      "queue_id": 0,
      "core_id": 0,
      "read_io_count": 19987068,
      "write_io_count": 6319931,
      "flush_io_count": 0
    },
    {
      "queue_id": 1,
      "core_id": 1,
      "read_io_count": 9769556,
      "write_io_count": 3180098,
      "flush_io_count": 0
    }
  ],
  "read_io_count": 29756624,
  "write_io_count": 9500029,
  "flush_io_count": 0
}
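
To see how I/O is spread across queues, the per-queue counters can be turned into fractions of the controller total. The sketch below uses the counters from the example response; the function name queue_shares is hypothetical.

```python
def queue_shares(stats, counter="read_io_count"):
    """Return {queue_id: fraction of the given counter handled by that queue}."""
    total = sum(queue[counter] for queue in stats["queues"])
    if total == 0:
        # Avoid dividing by zero when no such I/O was recorded.
        return {queue["queue_id"]: 0.0 for queue in stats["queues"]}
    return {queue["queue_id"]: queue[counter] / total for queue in stats["queues"]}


# Counters copied from the example response above.
stats = {
    "ctrl_id": "VblkCtrl2",
    "queues": [
        {"queue_id": 0, "core_id": 0, "read_io_count": 19987068,
         "write_io_count": 6319931, "flush_io_count": 0},
        {"queue_id": 1, "core_id": 1, "read_io_count": 9769556,
         "write_io_count": 3180098, "flush_io_count": 0},
    ],
    "read_io_count": 29756624,
    "write_io_count": 9500029,
    "flush_io_count": 0,
}

read_shares = queue_shares(stats)
flush_shares = queue_shares(stats, "flush_io_count")
```

In this example the per-queue read and write counts add up to the controller totals, and queue 0 handles roughly two thirds of the reads.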

virtio_blk_controller_state_save

Save the state of the suspended virtio-blk SNAP controller.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
file_name  Yes         String  Filename to save state to

virtio_blk_controller_state_restore

Restore the state of the suspended virtio-blk SNAP controller.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
file_name  Yes         String  Filename to restore state from

virtio_blk_controller_vfs_msix_reclaim

Reclaim virtio-blk SNAP controller VFs MSIX back to the free MSIX pool. Valid only for PFs.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

Virtio-blk Configuration Examples

Virtio-blk Configuration for Single Controller

spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py spdk_bdev_create nvme0n1
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1

Virtio-blk Dynamic Configuration For 125 VFs

  1. Update the firmware configuration as described in section "SR-IOV Firmware Configuration".

  2. Reboot the host.
  3. Run:

    [dpu] spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
    [dpu] snap_rpc.py spdk_bdev_create nvme0n1
    [dpu] snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0
    
    [host] modprobe -v virtio-pci && modprobe -v virtio-blk
    [host] echo 125 > /sys/bus/pci/devices/0000:86:00.3/sriov_numvfs
    
    [dpu] for i in `seq 0 124`; do snap_rpc.py virtio_blk_controller_create --pf_id 0 --vf_id $i --bdev nvme0n1; done;

Virtio-blk Suspend, Resume Example

[host] // Run fio
[dpu] snap_rpc.py virtio_blk_controller_suspend -C VBLKCtrl1
[host] // IOs will get suspended
[dpu] snap_rpc.py virtio_blk_controller_resume -C VBLKCtrl1
[host] // fio will resume sending IOs

Virtio-blk Bdev Attach, Detach Example

[host] // Run fio
[dpu] snap_rpc.py virtio_blk_controller_bdev_detach -c VBLKCtrl1
[host] // Bdev will be detached and IOs will get suspended
[dpu] snap_rpc.py virtio_blk_controller_bdev_attach -c VBLKCtrl1 --bdev null2
[host] // The null2 bdev will be attached into controller and fio will resume sending IOs

Notes

  • Virtio-blk protocol supports one backend device only
  • Virtio-blk protocol does not support administration commands to add backends. Thus, all backend attributes are communicated to the host virtio-blk driver over PCIe BAR and must be accessible during driver probing. Therefore, backends can only be changed once the PCIe function is not in use by any host storage driver.

NVMe Emulation Management

NVMe Subsystem

The NVMe subsystem as described in the NVMe specification is a logical entity which encapsulates sets of NVMe backends (or namespaces) and connections (or controllers). NVMe subsystems are extremely useful when working with multiple NVMe controllers especially when using NVMe VFs. Each NVMe subsystem is defined by its serial number (SN), model number (MN), and qualified name (NQN) after creation.

The RPCs listed in this section control the creation and destruction of NVMe subsystems.

NVMe Namespace

An NVMe namespace represents a contiguous range of LBAs in local or remote storage. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem (e.g., 2 namespaces cannot share the same NSID even if they are linked to different controllers).
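
The NSID uniqueness rule can be pictured with a toy model in which each subsystem tracks the NSIDs it has handed out. This is only an illustrative sketch; the Subsystem class below is hypothetical and does not reflect how SNAP stores namespaces internally.

```python
class Subsystem:
    """Toy model of an NVMe subsystem that enforces NSID uniqueness."""

    def __init__(self, nqn):
        self.nqn = nqn
        self.namespaces = {}  # maps nsid -> backing bdev name

    def add_namespace(self, nsid, bdev_name):
        # The NSID must be unique across the whole subsystem, regardless of
        # which controller the namespace is later attached to.
        if nsid in self.namespaces:
            raise ValueError(f"NSID {nsid} already exists in subsystem {self.nqn}")
        self.namespaces[nsid] = bdev_name


subsys = Subsystem("nqn.2022-10.io.nvda.nvme:0")
subsys.add_namespace(1, "nvme0n1")
```

A second call to subsys.add_namespace(1, ...) raises ValueError, whereas a different subsystem may use NSID 1 independently.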

After creation, NVMe namespaces can be attached to a controller.

The SNAP application uses an SPDK block device framework as a backend for its NVMe namespaces. Therefore, they should be configured in advance. For more information about SPDK block devices, see SPDK bdev documentation and Appendix SPDK Configuration.

NVMe Controller

Each NVMe device (e.g., NVMe PCIe entry) exposed to the host, whether it is a PF or VF, must be backed by an NVMe controller, which is responsible for all protocol communication with the host's driver.

Every new NVMe controller must also be linked to an NVMe subsystem. After creation, NVMe controllers can be addressed using either their name (e.g., "Nvmectrl1") or both their subsystem NQN and controller ID.

Attaching NVMe Namespace to NVMe Controller

After creating an NVMe controller and an NVMe namespace under the same subsystem, use the nvme_controller_attach_ns command to attach the namespace to the controller.

NVMe Emulation Management Commands

Command                           Description
nvme_subsystem_create             Create NVMe subsystem
nvme_subsystem_destroy            Destroy NVMe subsystem
nvme_subsystem_list               NVMe subsystem list
nvme_namespace_create             Create NVMe namespace
nvme_namespace_destroy            Destroy NVMe namespace
nvme_controller_suspend           Suspend NVMe controller
nvme_controller_resume            Resume NVMe controller
nvme_controller_snapshot_get      Take snapshot of NVMe controller to a file
nvme_namespace_list               NVMe namespace list
nvme_controller_create            Create new NVMe controller
nvme_controller_destroy           Destroy NVMe controller
nvme_controller_list              NVMe controller list
nvme_controller_attach_ns         Attach NVMe namespace to controller
nvme_controller_detach_ns         Detach NVMe namespace from controller
nvme_controller_vfs_msix_reclaim  Reclaim NVMe SNAP controller VFs MSIX back to free MSIX pool. Valid only for PFs.
nvme_controller_dbg_stats_get     TBD

nvme_subsystem_create

Create a new NVMe subsystem to be controlled by one or more NVMe SNAP controllers. An NVMe subsystem includes one or more controllers, zero or more namespaces, and one or more ports. An NVMe subsystem may include a non-volatile memory storage medium and an interface between the controller(s) in the NVMe subsystem and non-volatile memory storage medium.

Command parameters:

Parameter      Mandatory?  Type    Description
nqn            Yes         String  Subsystem qualified name
serial_number  No          String  Subsystem serial number
model_number   No          String  Subsystem model number
nn             No          Number  Maximal namespace ID allowed in the subsystem (default 0xFFFFFFFE; range 1-0xFFFFFFFE)
mnan           No          Number  Maximal number of namespaces allowed in the subsystem (default 1024; range 1-0xFFFFFFFE)

Example request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "nvme_subsystem_create",
  "params": {
    "nqn": "nqn.2022-10.io.nvda.nvme:0"
  }
}

nvme_subsystem_destroy

Destroy (previously created) NVMe SNAP subsystem.

Command parameters:

Parameter  Mandatory?  Type     Description
nqn        Yes         String   Subsystem qualified name
force      No          Boolean  Force the deletion of all the controllers and namespaces under the subsystem

nvme_subsystem_list

List NVMe subsystems.

nvme_namespace_create

Create a new NVMe namespace that represents a contiguous range of LBAs in the previously configured bdev. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem.

Command parameters:

Parameter  Mandatory?  Type    Description
nqn        Yes         String  Subsystem qualified name
bdev_name  Yes         String  SPDK block device to use as backend
nsid       Yes         Number  Namespace ID
uuid       No          String  Namespace UUID

To safely detach/attach namespaces, the UUID should be provided to force the UUID to remain persistent.

nvme_namespace_destroy

Destroy a previously created NVMe namespace.

Command parameters:

Parameter  Mandatory?  Type    Description
nqn        Yes         String  Subsystem qualified name
nsid       Yes         Number  Namespace ID

nvme_namespace_list

List NVMe SNAP namespaces.

Command parameters:

Parameter  Mandatory?  Type    Description
nqn        No          String  Subsystem qualified name

nvme_controller_create

Create a new SNAP-based NVMe controller over a specific PCIe function on the host.

To specify the PCIe function to open the controller upon, pci_index must be provided.

The mapping for pci_index can be queried by running emulation_function_list.

Command parameters:

Parameter       Mandatory?  Type    Description
nqn             Yes         String  Subsystem qualified name
vuid            No          String  VUID of PCIe function
pf_id           No          Number  PCIe PF index to start emulation on
vf_id           No          Number  PCIe VF index to start emulation on (if the controller is destined to be opened on a VF)
pci_bdf         No          String  PCIe BDF to start emulation on
vhca_id         No          Number  VHCA ID of PCIe function
ctrl            No          String  Controller ID
num_queues      No          Number  Number of IO queues (default 31, range 1-31).
                                    The actual number of queues is limited by the number of queues supported by firmware.
                                    If a driver using MSIX interrupts is used, the number of MSIX must be greater than the number of IO queues (1 is used for the admin queue).
mdts            No          Number  MDTS (default 7, range 1-7)
fw_slots        No          Number  Maximum number of firmware slots (default 4)
write_zeroes    No          0/1     Enable the write_zeroes optional NVMe command
compare         No          0/1     Set the value of the compare support bit in the controller
compare_write   No          0/1     Set the value of the compare_write support bit in the controller.
                                    During crash recovery, all compare and write commands are expected to fail.
deallocate_dsm  No          0/1     Set the value of the dsm (dataset management) support bit in the controller. The only dsm request currently supported is deallocate.
suspended       No          0/1     Open the controller in suspended state (requires an additional call to nvme_controller_resume before it becomes active)
snapshot        No          String  Create a controller out of a snapshot file path. Snapshot is previously taken using nvme_controller_snapshot_get.
dynamic_msix    No          0/1     Enable dynamic MSIX management for the controller (default 0). Applies only for PFs.
vf_num_msix     No          Number  Control the number of MSIX vectors to associate with this controller. Valid only for VFs and only when their parent PF controller is created using the --dynamic_msix option.
quirks          No          Number  Bitmask to support buggy drivers which are non-compliant per NVMe specification.
                                      • Bit 0 – send "Namespace Attribute Changed" async event, even though it is disabled by the driver during "Set Features" command
                                      • Bit 1 – keep sending "Namespace Attribute Changed" async events, even when "Changed Namespace List" Get Log Page has not arrived from driver
                                      • Bit 2 – reserved
                                      • Bit 3 – force-enable "Namespace Management capability" NVMe OACS even though it is not supported by the controller
                                    For more details, see section "OS Issues".
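
Because quirks is a bitmask, the value passed to the command is the bitwise OR of the individual bits listed above. The sketch below shows one way to compose it; the flag names are hypothetical labels chosen for this example and are not identifiers defined by SNAP.

```python
from enum import IntFlag


class Quirk(IntFlag):
    """Hypothetical names for the documented quirk bits."""
    FORCE_NS_ATTR_CHANGED_EVENT = 1 << 0   # bit 0
    KEEP_SENDING_NS_ATTR_CHANGED = 1 << 1  # bit 1
    # Bit 2 is reserved and intentionally has no flag.
    FORCE_NS_MANAGEMENT_OACS = 1 << 3      # bit 3


# Enable bit 0 and bit 3: 0b1001, i.e. 9 in decimal.
quirks_value = int(Quirk.FORCE_NS_ATTR_CHANGED_EVENT | Quirk.FORCE_NS_MANAGEMENT_OACS)
```

The resulting integer is what would be supplied as the quirks parameter.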

If not set, the SNAP NVMe controller supports an optional NVMe command only if all the namespaces attached to it when loading the driver support it. To bypass this feature, you may explicitly set the NVMe optional command support bit by using its corresponding flag.

For example, a controller created with --compare 0 would not support the optional compare NVMe command regardless of its attached namespaces.
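
The rule for optional command support can be summarized as: an explicit flag wins, otherwise support is the logical AND over the attached namespaces. The following is a minimal sketch of that decision, not SNAP's actual implementation; the function name is hypothetical, and the behavior with no attached namespaces is not described in this chapter.

```python
def controller_supports(flag, namespace_support):
    """Decide whether an optional NVMe command is advertised.

    flag: None when the option was not given, otherwise 0 or 1.
    namespace_support: one boolean per attached namespace, evaluated when
    the driver is loaded.
    """
    if flag is not None:
        # An explicit setting bypasses the namespace-based check.
        return bool(flag)
    # Not set: supported only if every attached namespace supports it.
    # Note: all() of an empty list is True; the real behavior with zero
    # namespaces is an open assumption here.
    return all(namespace_support)
```

For instance, controller_supports(0, [True, True]) is False, matching the --compare 0 example above.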

Example request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "nvme_controller_create",
  "params": {
    "nqn": "nqn.2022-10.io.nvda.nvme:0",
    "pf_id": 0,
    "num_queues": 8
  }
}

nvme_controller_destroy

Destroy a previously created NVMe controller. The controller can be uniquely identified by a controller name as acquired from nvme_controller_create.

Command parameters:

Parameter     Mandatory?  Type    Description
ctrl          Yes         String  Controller name
release_msix  No          0/1     Release MSIX back to free pool. Applies only for VFs.

nvme_controller_suspend

While suspended, the controller stops handling new requests from the host driver. All pending requests (if any) will be processed after resume.

Command parameters:

Parameter   Mandatory?  Type    Description
ctrl        Yes         String  Controller name
timeout_ms  No          Number  Suspend timeout

nvme_controller_resume

The resume command continues the (previously-suspended) controller's handling of new requests sent by the driver. If the controller is created in suspended mode, resume is also used to start initial communication with host driver.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

nvme_controller_snapshot_get

Take a snapshot of the current state of the controller and dump it into a file. This file may be used to create a controller based on this snapshot. For the snapshot to be consistent, users should call this function only when the controller is suspended (see nvme_controller_suspend).

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
filename   Yes         String  File path

nvme_controller_vfs_msix_reclaim

Reclaims all VFs MSIX back to the PF's free MSIX pool.

This function can only be applied on PFs and can only be run when SR-IOV is not set on host side (i.e., sriov_numvfs = 0).

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name

nvme_controller_list

Provide a list of all active (created) NVMe controllers with their characteristics.

Command parameters:

Parameter  Mandatory?  Type    Description
nqn        No          String  Subsystem qualified name
ctrl       No          String  Only search for a specific controller

nvme_controller_attach_ns

Attach a previously created NVMe namespace to a given NVMe controller under the same subsystem.

The result in the response object returns true for success and false for failure.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
nsid       Yes         Number  Namespace ID

nvme_controller_detach_ns

Detach a previously attached namespace with a given NSID from the NVMe controller.

The result in the response object returns true for success and false for failure.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
nsid       Yes         Number  Namespace ID

nvme_controller_dbg_stats_get

Get the debug counters of an NVMe controller. As the response shows, these are per-controller I/O statistics covering the I/O distribution between the controller's queues and the total I/O received on the controller.

Command parameters:

Parameter  Mandatory?  Type    Description
ctrl       Yes         String  Controller name
nsid       Yes         Number  Namespace ID

Example response:

{
  "ctrl_id": "NVMeCtrl2",
  "queues": [
    {
      "queue_id": 0,
      "core_id": 0,
      "read_io_count": 19987068,
      "write_io_count": 6319931,
      "flush_io_count": 0
    },
    {
      "queue_id": 1,
      "core_id": 1,
      "read_io_count": 9769556,
      "write_io_count": 3180098,
      "flush_io_count": 0
    }
  ],
  "read_io_count": 29756624,
  "write_io_count": 9500029,
  "flush_io_count": 0
}

NVMe Configuration Examples

NVMe Configuration for Single Controller

[dpu] spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
[dpu] snap_rpc.py spdk_bdev_create nvme0n1
[dpu] snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
[dpu] snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
[dpu] snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --pf_id 0
[dpu] snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1

NVMe Configuration for 125 VFs

  1. Update the firmware configuration as described in section "SR-IOV Firmware Configuration".

  2. Reboot the host.
  3. Run:

    # Create a dummy controller on the parent PF
    [dpu] snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
    [dpu] snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl1 --pf_id 0
    
    # Create 125 bdevs (remote or local), 125 namespaces, and 125 controllers.
    # VF index i uses bdev null(i+1), NSID i+1, and controller NVMeCtrl(i+2).
    [dpu] for i in `seq 0 124`; do
        spdk_rpc.py bdev_null_create null$((i+1)) 64 512;
        snap_rpc.py spdk_bdev_create null$((i+1));
        snap_rpc.py nvme_namespace_create -b null$((i+1)) -n $((i+1)) --nqn nqn.2022-10.io.nvda.nvme:0;
        snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl$((i+2)) --pf_id 0 --vf_id $i;
        snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl$((i+2)) -n $((i+1));
    done
    
    # Load the driver and configure VFs
    [host] modprobe -v nvme
    [host] echo 125 > /sys/bus/pci/devices/0000\:25\:00.2/sriov_numvfs

NVMe Cleanup

snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl2 -n 1 
snap_rpc.py nvme_controller_destroy -c NVMeCtrl2 
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0 
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0 
snap_rpc.py spdk_bdev_destroy nvme0n1 