SNAP RPC Commands
The JSON-based remote procedure call (RPC) protocol is used to control the SNAP service. NVMe/virtio-blk SNAP, like other standard SPDK applications, supports JSON-based RPC commands to control resources and to easily create, delete, query, or modify them from the CLI.
SNAP supports all standard SPDK RPC commands in addition to an extended SNAP-specific command set. SPDK standard commands are executed via the spdk_rpc.py tool, while the SNAP-specific command set extension is executed via the snap_rpc.py tool.
Full spdk_rpc.py command set documentation can be found on the SPDK official documentation site.
The full snap_rpc.py extended command set is detailed further down in this chapter.
The JSON-based RPC protocol can be used via the snap_rpc.py script, which resides inside the SNAP container, together with the crictl tool.
The SNAP container is CRI-compatible.
To query the active container ID:
crictl ps -s running -q --name snap
To post RPCs to the container using crictl:
crictl exec <container-id> snap_rpc.py <RPC-method>
For example:
crictl exec 0379ac2c4f34c snap_rpc.py emulation_function_list
In addition, an alias can be used:
alias snap_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} snap_rpc.py "
alias spdk_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} spdk_rpc.py "
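With these aliases in place, RPCs can be posted directly from the DPU shell without looking up the container ID each time. For example:
snap_rpc.py emulation_function_list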
To open a bash shell to the container that can be used to post RPCs:
crictl exec -it <container-id> bash
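For example, using the container ID from the earlier query, open a shell and post an RPC from inside it:
crictl exec -it 0379ac2c4f34c bash
snap_rpc.py emulation_function_list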
snap_log_level_set
SNAP allows dynamically changing the log level of the logger backend using the snap_log_level_set command. Any log under the requested level is shown (see the example following the table).
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| | Yes | Number | Log level |
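The following is a minimal invocation sketch. The flag spelling for the log-level argument is an assumption (the table above documents only a single mandatory numeric parameter), so verify it against your snap_rpc.py version:
snap_rpc.py snap_log_level_set -l 2   # -l (flag spelling assumed) requests log level 2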
Emulated PCIe functions are managed through IB devices called emulation managers. Emulation managers are ordinary IB devices with special privileges to control PCIe communication and device emulations towards the host OS.
SNAP queries an emulation manager that supports the requested set of capabilities.
The emulation manager holds a list of the emulated PCIe functions it controls. PCIe functions may later be addressed in one of three ways:
- vuid – recommended, as it is guaranteed to remain constant (see Appendix "PCIe BDF to VUID Translation" for details)
- vhca_id
- Function index (i.e., pf_id or vf_id)
emulation_function_list
emulation_function_list lists all existing functions.
The following is an example response for the emulation_function_list command:
[
{
"hotplugged": true,
"hotplug state": "POWER_ON",
"emulation_type": "VBLK",
"pf_index": 0,
"pci_bdf": "87:00.0",
"vhca_id": 5,
"vuid": "MT2306XZ009TVBLKS1D0F0",
"ctrl_id": "VblkCtrl1",
"num_vfs": 0,
"vfs": []
}
]
Use -a or --all to also show inactive VF functions, as in the example below.
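For example, to list all functions, including inactive VFs:
snap_rpc.py emulation_function_list --all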
SNAP supports 2 types of PCIe functions:
Static functions – PCIe functions configured at the firmware configuration stage (physical and virtual). Refer to appendix "DPU Firmware Configuration" for additional information.
Hot-pluggable functions – PCIe functions configured dynamically at runtime. Users can add detachable functions. Refer to section "Hot-pluggable PCIe Functions Management" for additional information.
Hotplug PCIe functions are configured dynamically at runtime using RPCs. Once a new PCIe function is hot plugged, it appears in the host's PCIe device list and remains persistent until explicitly unplugged or the system undergoes a cold reboot. Importantly, this persistence continues even if the SNAP process terminates. Therefore, it is advised not to include hotplug/hotunplug actions in automatic initialization scripts (e.g., snap_rpc_init.conf).
Hotplug PFs do not support SR-IOV.
Two-step PCIe Hotplug
The following RPC commands are used to dynamically add or remove PCIe PFs (i.e., hot-plugged functions) in the DPU application.
Once a PCIe function is created (via virtio_blk_function_create), it is accessible and manageable within the DPU application but is not immediately visible to the host OS/kernel. This differs from the legacy API, where creation and host exposure occur simultaneously. Instead, exposing or hiding PCIe functions to the host OS is managed by separate RPC commands (virtio_blk_controller_hotplug and virtio_blk_controller_hotunplug). After hot unplugging, the function can be safely removed from the DPU (using virtio_blk_function_destroy).
A key advantage of this approach is the ability to pre-configure a controller on the function, enabling it to serve the host driver as soon as it is exposed. In fact, users must create a controller to use the virtio_blk_controller_hotplug API, which is required to make the function visible to the host OS.
| Command | Description |
|---|---|
| virtio_blk_function_create | Create a new virtio-blk emulation function |
| virtio_blk_controller_hotplug | Exposes (hot plugs) the emulation function to the host OS |
| virtio_blk_controller_hotunplug | Removes (hot unplugs) the emulation function from the host OS |
| virtio_blk_function_destroy | Delete an existing virtio-blk emulation function |
virtio_blk_function_create
Create a new virtio-blk emulation function.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| | No | String | Emulation manager to manage hotplug function (unused) |
virtio_blk_function_destroy
Delete an existing virtio-blk emulation function.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| vuid | Yes | String | Identifier of the hotplugged function to delete |
virtio_blk_controller_hotplug
Exposes (hot plugs) the emulation function to the host OS.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller to expose to the host OS |
| wait_for_done | No | Bool | Block until the host discovers and acknowledges the new function |
| | No | int | Time (in msecs) to wait until giving up. Only valid when wait_for_done is set. |
virtio_blk_controller_hotunplug
Removes (hot unplugs) the emulation function from the host OS.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller to remove from the host OS |
| wait_for_done | No | Bool | Block until the host identifies and removes the function |
The non-legacy API is not yet supported for the NVMe protocol.
When not using the wait_for_done approach, it is the user's responsibility to verify that the host identifies the new hot-plugged function. This can be done by querying the pci_hotplug_state parameter in the emulation_function_list RPC output, as shown below.
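For example, after hot plugging without wait_for_done:
snap_rpc.py emulation_function_list
# verify the hot-plugged function reports "hotplug state": "POWER_ON" (as in the example output above)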
Two-step PCIe Hotplug/Unplug Example
# Bringup
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_function_create
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1
snap_rpc.py virtio_blk_controller_hotplug -c VblkCtrl1
# Cleanup
snap_rpc.py virtio_blk_controller_hotunplug -c VblkCtrl1
snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
snap_rpc.py virtio_blk_function_destroy --vuid MT2114X12200VBLKS1D0F0
spdk_rpc.py bdev_nvme_detach_controller nvme0
(Deprecated) Legacy API
Hotplug Legacy Commands
The following commands hot plug a new PCIe function to the system.
After a new PCIe function is plugged, it is immediately shown in the host's PCIe device list and remains there until it is either explicitly unplugged or the system goes through a cold reboot. It is therefore the user's responsibility to open a controller instance to manage the new function immediately after its creation. Keeping a hot-plugged function without a matching controller to manage it may cause anomalous behavior on the host OS driver.
| Command | Description |
|---|---|
| virtio_blk_emulation_device_attach | Attach virtio-blk emulation function |
| nvme_emulation_device_attach | Attach NVMe emulation function |
virtio_blk_emulation_device_attach
Attach virtio-blk emulation function.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| | No | Number | Device ID |
| | No | Number | Vendor ID |
| | No | Number | Subsystem device ID |
| | No | Number | Subsystem vendor ID |
| | No | Number | Revision ID |
| | No | Number | Class code |
| | No | Number | MSI-X table size |
| | No | Number | Maximal number of VFs allowed |
| | No | String | Block device to use as backend |
| | No | Number | Number of IO queues (default 1, range 1-62). Note: the actual number of queues is limited by the number of queues supported by the hardware. Tip: it is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt). |
| | No | Number | Queue depth (default 256, range 1-256). Note: it is only possible to modify the queue depth if the driver is not loaded. |
| | No | Boolean | Transitional device support. See section "VirtIO-blk Transitional Device Support" for more details. |
| | No | Boolean | N/A – not supported |
nvme_emulation_device_attach
Attach NVMe emulation function.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| | No | Number | Device ID |
| | No | Number | Vendor ID |
| | No | Number | Subsystem device ID |
| | No | Number | Subsystem vendor ID |
| | No | Number | Revision ID |
| | No | Number | Class code |
| | No | Number | MSI-X table size |
| | No | Number | Maximal number of VFs allowed |
| | No | Number | Number of IO queues (default 31, range 1-31). Note: the actual number of queues is limited by the number of queues supported by the hardware. Tip: it is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt). |
| | No | String | Specification version (currently only …) |
Hot Unplug Legacy Commands
The following commands hot-unplug a PCIe function from the system in 2 steps:
| Command | Step | Description |
|---|---|---|
| emulation_device_detach_prepare | 1 | Prepare emulation function to be detached |
| emulation_device_detach | 2 | Detach emulation function |
emulation_device_detach_prepare
This is the first step for detaching an emulation device. It prepares the system to detach a hot-plugged emulation function. In case of success, the host's hotplug device state changes, and you may safely proceed to emulation_device_detach.
The controller attached to the emulation function must be created and active when executing this command.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| vhca_id | No | Number | vHCA ID of PCIe function |
| vuid | No | String | PCIe device VUID |
| ctrl | No | String | Controller ID |
At least one identifier must be provided to describe the PCIe function to be detached.
emulation_device_detach
This is the second step, which completes the detach of the hot-plugged emulation function. If the detach preparation times out, you may perform a surprise unplug by using --force with the command.
The driver must be unprobed, otherwise errors may occur.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| vhca_id | No | Number | vHCA ID of PCIe function |
| vuid | No | String | PCIe device VUID |
| force | No | Boolean | Detach with failed preparation |
At least one identifier must be provided to describe the PCIe function to be detached.
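The following sketch shows the two-step detach, falling back to --force if the preparation step times out (the VUID is reused from the examples in this chapter):
snap_rpc.py emulation_device_detach_prepare --vuid MT2114X12200VBLKS1D0F0
# if the preparation timed out, perform a surprise unplug
snap_rpc.py emulation_device_detach --vuid MT2114X12200VBLKS1D0F0 --force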
Virtio-blk Hot Plug/Unplug Example
// Bringup
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_emulation_device_attach
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1
// Cleanup
snap_rpc.py emulation_device_detach_prepare --vuid MT2114X12200VBLKS1D0F0
snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
snap_rpc.py emulation_device_detach --vuid MT2114X12200VBLKS1D0F0
spdk_rpc.py bdev_nvme_detach_controller nvme0
The following RPCs are deprecated and are no longer supported:
spdk_bdev_create
spdk_bdev_destroy
bdev_list
These RPCs were optional. If not performed, SNAP would automatically generate SNAP block devices (bdevs).
Virtio-blk emulation is a storage protocol belonging to the virtio family of devices. These devices are found in virtual environments yet by design look like physical devices to the user within the virtual machine.
Each virtio-blk device (e.g., virtio-blk PCIe entry) exposed to the host, whether it is PF or VF, must be backed by a virtio-blk controller.
Virtio-blk limitations:
Probing a virtio-blk driver on the host without an already functioning virtio-blk controller may cause the host to hang until such controller is opened successfully (no timeout mechanism exists).
Upon creation of a virtio-blk controller, a backend device must already exist.
Virtio-blk Emulation Management Commands
| Command | Description |
|---|---|
| virtio_blk_controller_create | Create new virtio-blk SNAP controller |
| virtio_blk_controller_destroy | Destroy virtio-blk SNAP controller |
| virtio_blk_controller_suspend | Suspend virtio-blk SNAP controller |
| virtio_blk_controller_resume | Resume virtio-blk SNAP controller |
| virtio_blk_controller_bdev_attach | Attach bdev to virtio-blk SNAP controller |
| virtio_blk_controller_bdev_detach | Detach bdev from virtio-blk SNAP controller |
| virtio_blk_controller_list | Virtio-blk SNAP controller list |
| virtio_blk_controller_modify | Virtio-blk controller parameters modification |
| virtio_blk_controller_dbg_io_stats_get | Get virtio-blk SNAP controller IO stats |
| virtio_blk_controller_dbg_debug_stats_get | Get virtio-blk SNAP controller debug stats |
| virtio_blk_controller_state_save | Save state of the suspended virtio-blk SNAP controller |
| virtio_blk_controller_state_restore | Restore state of the suspended virtio-blk SNAP controller |
| virtio_blk_controller_vfs_msix_reclaim | Reclaim virtio-blk SNAP controller VFs MSIX for the free MSIX pool. Valid only for PFs. |
virtio_blk_controller_create
Create a new SNAP-based virtio-blk controller over a specific PCIe function on the host. The PCIe function to open the controller upon must be specified using one of the identifiers described in section "PCIe Function Management":
- vuid (recommended, as it is guaranteed to remain constant)
- vhca_id
- Function index (pf_id, vf_id)
The mapping for pci_index can be queried by running emulation_function_list.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| vuid | No | String | PCIe device VUID |
| vhca_id | No | Number | vHCA ID of PCIe function |
| pf_id | No | Number | PCIe PF index to start emulation on |
| vf_id | No | Number | PCIe VF index to start emulation on (if the controller is meant to be opened on a VF) |
| pci_bdf | No | String | PCIe device BDF |
| ctrl | No | String | Controller ID |
| num_queues | No | Number | Number of IO queues (default 1, range 1-64). Tip: it is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt). Based on effective … |
| | No | Number | Queue depth (default 256, range 1-256) |
| size_max | No | Number | Maximal SGE data transfer size (default 4096, range 1-…) |
| seg_max | No | Number | Maximal SGE list length (default 1, range 1-…) |
| bdev | No | String | SNAP SPDK block device to use as backend |
| | No | String | Serial number for the controller |
| | No | 0/1 | Enables live migration and NVIDIA vDPA |
| | No | 0/1 | Dynamic MSIX for SR-IOV VFs on this PF. Only valid for PFs. |
| num_msix | No | Number | Control the number of MSIX tables to associate with this controller. Valid only for VFs (whose parent PF controller is created with dynamic MSIX enabled). Note: this field is mandatory when the VF's MSIX is reclaimed using virtio_blk_controller_vfs_msix_reclaim. |
| | No | 0/1 | Support virtio-blk crash recovery. Setting this parameter to 1 may impact virtio-blk performance (default is 0). For more information, refer to section "Virtio-blk Crash Recovery". |
| indirect_desc | No | 0/1 | Enables indirect descriptors support for the controller's virtqueues. Note: when using the virtio-blk kernel driver, if indirect descriptors are enabled, they are always used by the driver. Using indirect descriptors for all IO traffic patterns may hurt performance in most cases. |
| | No | 0/1 | Creates read-only virtio-blk controller |
| | No | 0/1 | Creates controller in suspended state |
| | No | 0/1 | Creates controller with the ability to listen for live update notifications via IPC |
| | No | 0/1 | N/A – not supported |
| | No | 0/1 | N/A – not supported |
Example response:
{
"jsonrpc": "2.0",
"id": 1,
"result": "VblkCtrl1"
}
virtio_blk_controller_destroy
Destroy a previously created virtio-blk controller. The controller can be uniquely identified by the controller's name as acquired from virtio_blk_controller_create().
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| force | No | Boolean | Force destroying VF controller for SR-IOV |
virtio_blk_controller_suspend
While suspended, the controller stops receiving new requests from the host driver and only finishes handling of requests already in flight. All suspended requests (if any) are processed after resume.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
virtio_blk_controller_resume
After the controller has been suspended (i.e., it stops receiving new requests from the host driver and only finishes handling requests already in flight), the resume command resumes the handling of IOs by the controller.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
virtio_blk_controller_bdev_attach
Attach the specified bdev to a virtio-blk SNAP controller. It is possible to change the serial ID (using the vblk_id parameter) when a new bdev is attached; see the example following the table.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| bdev | Yes | String | Block device name |
| vblk_id | No | String | Serial number for controller |
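For example, to attach a different bdev and assign it a new serial ID (the --vblk_id long-option spelling is assumed from the parameter name above):
snap_rpc.py virtio_blk_controller_bdev_attach -c VblkCtrl1 --bdev nvme0n1 --vblk_id NEWSERIAL01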
virtio_blk_controller_bdev_detach
You may replace the bdev of a virtio-blk controller. First, detach the bdev from the controller. While the bdev is detached, the controller stops receiving new requests from the host driver (i.e., it is suspended) and only finishes handling requests already in flight.
At this point, you may attach a new bdev or destroy the controller.
When a new bdev is attached, the controller resumes handling all outstanding I/Os.
The block size cannot be changed if the driver is loaded. The bdev may be replaced with one of a different block size only if the driver is not loaded.
A controller with no bdev attached is considered to be in a temporary state, in which the controller is not fully operational and may not respond to some actions requested by the driver.
If there is no imminent intention to call virtio_blk_controller_bdev_attach, it is advised to attach a none bdev instead. For example:
snap_rpc.py virtio_blk_controller_bdev_attach -c VblkCtrl1 --bdev none --dbg_bdev_type null
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
virtio_blk_controller_list
List virtio-blk SNAP controllers.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | No | String | Controller name |
Example response:
{
"ctrl_id": "VblkCtrl2",
"vhca_id": 38,
"num_queues": 4,
"queue_size": 256,
"seg_max": 32,
"size_max": 65536,
"bdev": "Nvme1",
"plugged": true,
"indirect_desc": true,
"num_msix": 2,
"min configurable num_msix": 2,
"max configurable num_msix": 32
}
virtio_blk_controller_modify
This function allows the user to modify some of the controller's parameters in real time, after the controller has been created (see the example following the table).
Modifications can only be made while the emulated function is in an idle state, i.e., when no driver is communicating with it.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | No | String | Controller name |
| num_queues | No | int | Number of queues for the controller |
| num_msix | No | int | Number of MSIX to be used for a controller. Relevant only for VF controllers (when the dynamic MSIX feature is enabled). |
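A usage sketch, assuming the parameter names above map directly to long options:
snap_rpc.py virtio_blk_controller_modify -c VblkCtrl1 --num_queues 8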
The standard virtio-blk kernel driver currently does not support PCI FLR.
virtio_blk_controller_dbg_io_stats_get
Debug counters are per-controller I/O stats that help in understanding the I/O distribution between the controller's queues and the total I/O received by the controller.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
Example response:
{
"ctrl_id": "VblkCtrl2",
"queues": [
{
"queue_id": 0,
"core_id": 0,
"read_io_count": 19987068,
"write_io_count": 6319931,
"flush_io_count": 0
},
{
"queue_id": 1,
"core_id": 1,
"read_io_count": 9769556,
"write_io_count": 3180098,
"flush_io_count": 0
}
],
"read_io_count": 29756624,
"write_io_count": 9500029,
"flush_io_count": 0
}
virtio_blk_controller_dbg_debug_stats_get
Debug counters are per-controller debug statistics that help in assessing the health and status of the controller and its queues.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
Example response:
{
"ctrl_id": "VblkCtrl1",
"queues": [
{
"qid": 0,
"state": "RUNNING",
"hw_available_index": 6,
"sw_available_index": 6,
"hw_used_index": 6,
"sw_used_index": 6,
"hw_received_descs": 13,
"hw_completed_descs": 13
},
{
"qid": 1,
"state": "RUNNING",
"hw_available_index": 2,
"sw_available_index": 2,
"hw_used_index": 2,
"sw_used_index": 2,
"hw_received_descs": 6,
"hw_completed_descs": 6
},
{
"qid": 2,
"state": "RUNNING",
"hw_available_index": 0,
"sw_available_index": 0,
"hw_used_index": 0,
"sw_used_index": 0,
"hw_received_descs": 4,
"hw_completed_descs": 4
},
{
"qid": 3,
"state": "RUNNING",
"hw_available_index": 0,
"sw_available_index": 0,
"hw_used_index": 0,
"sw_used_index": 0,
"hw_received_descs": 3,
"hw_completed_descs": 3
}
]
}
virtio_blk_controller_state_save
Save the state of the suspended virtio-blk SNAP controller.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | Yes | String | Filename to save state to |
virtio_blk_controller_state_restore
Restore the state of the suspended virtio-blk SNAP controller.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | Yes | String | Filename to restore state from |
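A possible save/restore flow; the controller must be suspended around both operations. The form of the state-file argument is an assumption, as it is not documented here:
snap_rpc.py virtio_blk_controller_suspend -c VblkCtrl1
snap_rpc.py virtio_blk_controller_state_save -c VblkCtrl1 /tmp/vblkctrl1.state   # file argument form assumed
snap_rpc.py virtio_blk_controller_state_restore -c VblkCtrl1 /tmp/vblkctrl1.state
snap_rpc.py virtio_blk_controller_resume -c VblkCtrl1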
virtio_blk_controller_vfs_msix_reclaim
Reclaim virtio-blk SNAP controller VFs MSIX back to the free MSIX pool. Valid only for PFs.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
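For example, to return the VF MSIX of a PF controller to the free pool:
snap_rpc.py virtio_blk_controller_vfs_msix_reclaim -c VblkCtrl1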
Virtio-blk Configuration Examples
Virtio-blk Configuration for Single Controller
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1
Virtio-blk Cleanup for Single Controller
snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
spdk_rpc.py bdev_nvme_detach_controller nvme0
Virtio-blk Dynamic Configuration For 125 VFs
Update the firmware configuration as described in section "SR-IOV Firmware Configuration".
Reboot the host.
Run:
[dpu] spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
[dpu] snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0
[host] modprobe -v virtio-pci && modprobe -v virtio-blk
[host] echo 125 > /sys/bus/pci/devices/0000:86:00.3/sriov_numvfs
[dpu] for i in `seq 0 124`; do snap_rpc.py virtio_blk_controller_create --pf_id 0 --vf_id $i --bdev nvme0n1; done;
Note: When SR-IOV is enabled, it is recommended to destroy virtio-blk controllers on VFs using the following command rather than the virtio_blk_controller_destroy RPC command:
[host] echo 0 > /sys/bus/pci/devices/0000:86:00.3/sriov_numvfs
To destroy a single virtio-blk controller, run:
[dpu] ./snap_rpc.py -t 1000 virtio_blk_controller_destroy -c VblkCtrl5 -f
Virtio-blk Suspend, Resume Example
[host] // Run fio
[dpu] snap_rpc.py virtio_blk_controller_suspend -c VBLKCtrl1
[host] // IOs will get suspended
[dpu] snap_rpc.py virtio_blk_controller_resume -c VBLKCtrl1
[host] // fio will resume sending IOs
Virtio-blk Bdev Attach, Detach Example
[host] // Run fio
[dpu] snap_rpc.py virtio_blk_controller_bdev_detach -c VBLKCtrl1
[host] // Bdev will be detached and IOs will get suspended
[dpu] snap_rpc.py virtio_blk_controller_bdev_attach -c VBLKCtrl1 --bdev null2
[host] // The null2 bdev will be attached to the controller and fio will resume sending IOs
Notes
Virtio-blk protocol controller supports one backend device only
Virtio-blk protocol does not support administration commands to add backends. Thus, all backend attributes are communicated to the host virtio-blk driver over PCIe BAR and must be accessible during driver probing. Therefore, backends can only be changed once the PCIe function is not in use by any host storage driver.
NVMe Subsystem
The NVMe subsystem as described in the NVMe specification is a logical entity which encapsulates sets of NVMe backends (or namespaces) and connections (or controllers). NVMe subsystems are extremely useful when working with multiple NVMe controllers especially when using NVMe VFs. Each NVMe subsystem is defined by its serial number (SN), model number (MN), and qualified name (NQN) after creation.
The RPCs listed in this section control the creation and destruction of NVMe subsystems.
NVMe Namespace
NVMe namespaces are representations of a continuous range of LBAs in local or remote storage. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem (e.g., two namespaces cannot share the same NSID even if they are linked to different controllers).
After creation, NVMe namespaces can be attached to a controller.
SNAP does not currently support shared namespaces between different controllers. So, each namespace should be attached to a single controller.
The SNAP application uses an SPDK block device framework as a backend for its NVMe namespaces. Therefore, they should be configured in advance. For more information about SPDK block devices, see SPDK bdev documentation and Appendix SPDK Configuration.
NVMe Controller
Each NVMe device (e.g., NVMe PCIe entry) exposed to the host, whether it is a PF or VF, must be backed by an NVMe controller, which is responsible for all protocol communication with the host's driver.
Every new NVMe controller must also be linked to an NVMe subsystem. After creation, NVMe controllers can be addressed using either their name (e.g., "Nvmectrl1") or both their subsystem NQN and controller ID.
Attaching NVMe Namespace to NVMe Controller
After creating an NVMe controller and an NVMe namespace under the same subsystem, the following method is used to attach the namespace to the controller.
NVMe Emulation Management Commands
| Command | Description |
|---|---|
| nvme_subsystem_create | Create NVMe subsystem |
| nvme_subsystem_destroy | Destroy NVMe subsystem |
| nvme_subsystem_list | NVMe subsystem list |
| nvme_namespace_create | Create NVMe namespace |
| nvme_namespace_destroy | Destroy NVMe namespace |
| nvme_controller_suspend | Suspend NVMe controller |
| nvme_controller_resume | Resume NVMe controller |
| nvme_controller_snapshot_get | Take snapshot of NVMe controller to a file |
| nvme_namespace_list | NVMe namespace list |
| nvme_controller_create | Create new NVMe controller |
| nvme_controller_destroy | Destroy NVMe controller |
| nvme_controller_list | NVMe controller list |
| nvme_controller_modify | NVMe controller parameters modification |
| nvme_controller_attach_ns | Attach NVMe namespace to controller |
| nvme_controller_detach_ns | Detach NVMe namespace from controller |
| nvme_controller_vfs_msix_reclaim | Reclaim NVMe SNAP controller VFs MSIX back to free MSIX pool. Valid only for PFs. |
| nvme_controller_dbg_io_stats_get | Get NVMe controller IO debug stats |
nvme_subsystem_create
Create a new NVMe subsystem to be controlled by one or more NVMe SNAP controllers. An NVMe subsystem includes one or more controllers, zero or more namespaces, and one or more ports. An NVMe subsystem may include a non-volatile memory storage medium and an interface between the controller(s) in the NVMe subsystem and non-volatile memory storage medium.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | Yes | String | Subsystem qualified name |
| | No | String | Subsystem serial number |
| | No | String | Subsystem model number |
| | No | Number | Maximal namespace ID allowed in the subsystem (default 0xFFFFFFFE; range 1-0xFFFFFFFE) |
| | No | Number | Maximal number of namespaces allowed in the subsystem (default 1024; range 1-0xFFFFFFFE) |
Example request:
{
"jsonrpc": "2.0",
"id": 1,
"method": "nvme_subsystem_create",
"params": {
"nqn": "nqn.2022-10.io.nvda.nvme:0"
}
}
nvme_subsystem_destroy
Destroy a previously created NVMe SNAP subsystem.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | Yes | String | Subsystem qualified name |
| force | No | Bool | Force the deletion of all the controllers and namespaces under the subsystem |
nvme_subsystem_list
List NVMe subsystems.
nvme_namespace_create
Create new NVMe namespaces that represent a continuous range of LBAs in the previously configured bdev. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | Yes | String | Subsystem qualified name |
| bdev | Yes | String | SPDK block device to use as backend |
| nsid | Yes | Number | Namespace ID |
| uuid | No | Number | Namespace UUID. Note: to safely detach/attach namespaces, the UUID should be provided to force the UUID to remain persistent. |
| | No | 0/1 | N/A – not supported |
nvme_namespace_destroy
Destroy a previously created NVMe namespace.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | Yes | String | Subsystem qualified name |
| nsid | Yes | Number | Namespace ID |
nvme_namespace_list
List NVMe SNAP namespaces.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | No | String | Subsystem qualified name |
nvme_controller_create
Create a new SNAP-based NVMe controller over a specific PCIe function on the host. To specify the PCIe function to open the controller upon, pci_index must be provided. The mapping for pci_index can be queried by running emulation_function_list.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | Yes | String | Subsystem qualified name |
| vuid | No | String | VUID of PCIe function |
| pf_id | No | Number | PCIe PF index to start emulation on |
| vf_id | No | Number | PCIe VF index to start emulation on (if the controller is destined to be opened on a VF) |
| pci_bdf | No | String | PCIe BDF to start emulation on |
| vhca_id | No | Number | vHCA ID of PCIe function |
| | No | Number | Controller ID |
| num_queues | No | Number | Number of IO queues (default 1, range 1-31). Note: the actual number of queues is limited by the number of queues supported by the hardware. Tip: it is recommended that the number of MSIX be greater than the number of IO queues. |
| | No | Number | MDTS (default 7, range 1-7) |
| | No | Number | Maximum number of firmware slots (default 4) |
| | No | 0/1 | Enable the … |
| | No | 0/1 | Set the value of the … |
| | No | 0/1 | Set the value of the … Note: during crash recovery, all compare and write commands are expected to fail. |
| | No | 0/1 | Set the value of the … |
| suspended | No | 0/1 | Open the controller in suspended state (requires an additional call to nvme_controller_resume). Note: this is required if NVMe recovery is expected or when creating the controller while the driver is already loaded. Therefore, it is advisable to use it in all scenarios. To resume the controller after attaching namespaces, use nvme_controller_resume. |
| | No | String | Create a controller out of a snapshot file path. The snapshot is previously taken using nvme_controller_snapshot_get. |
| | No | 0/1 | Enable dynamic MSIX management for the controller (default 0). Applies only for PFs. |
| num_msix | No | Number | Control the number of MSIX tables to associate with this controller. Valid only for VFs (whose parent PF controller is created with dynamic MSIX management enabled). Note: this field is mandatory when the VF's MSIX is reclaimed using nvme_controller_vfs_msix_reclaim. |
| admin_only | No | 0/1 | Creates NVMe controller with admin queues only (i.e., without IO queues) |
| | No | Number | Bitmask to support buggy drivers which are non-compliant with the NVMe specification. For more details, see section "OS Issues". |
If not set, the SNAP NVMe controller supports an optional NVMe command only if all the namespaces attached to it when loading the driver support it. To bypass this behavior, you may explicitly set an NVMe optional command support bit using its corresponding flag. For example, a controller created with --compare 0 does not support the optional compare NVMe command, regardless of its attached namespaces.
Example request:
{
"jsonrpc": "2.0",
"id": 1,
"method": "nvme_controller_create",
"params": {
"nqn": "nqn.2022-10.io.nvda.nvme:0",
"pf_id": 0,
"num_queues": 8,
}
}
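The equivalent CLI invocation (long-option spellings assumed to match the JSON parameter names) would be:
snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --pf_id 0 --num_queues 8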
nvme_controller_destroy
Destroy a previously created NVMe controller. The controller can be uniquely identified by a controller name as acquired from nvme_controller_create.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | No | 1/0 | Release MSIX back to free pool. Applies only for VFs. |
nvme_controller_suspend
While suspended, the controller stops handling new requests from the host driver. All pending requests (if any) will be processed after resume.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | No | Number | Suspend timeout. Note: if IOs are pending in the bdev layer (or in the remote target), the operation fails and resumes after this timeout. If … |
| | No | 0/1 | Force suspend even when there are inflight I/Os |
| | No | 0/1 | Suspend only the admin queue |
| | No | 0/1 | Send a live update notification via IPC |
nvme_controller_resume
The resume command resumes the (previously suspended) controller's handling of new requests sent by the driver. If the controller was created in suspended mode, resume is also used to start the initial communication with the host driver.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | No | 0/1 | Live update resume |
nvme_controller_snapshot_get
Take a snapshot of the current state of the controller and dump it into a file. This file may be used to create a controller based on this snapshot. For the snapshot to be consistent, users should call this function only when the controller is suspended (see nvme_controller_suspend). A flow sketch follows the table.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| | Yes | String | File path |
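A possible snapshot flow; the controller is suspended first so the snapshot is consistent. The form of the file-path argument is an assumption, as it is not documented here:
snap_rpc.py nvme_controller_suspend -c NVMeCtrl1
snap_rpc.py nvme_controller_snapshot_get -c NVMeCtrl1 /tmp/nvmectrl1.snapshot   # file argument form assumed
snap_rpc.py nvme_controller_resume -c NVMeCtrl1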
nvme_controller_vfs_msix_reclaim
Reclaims all VFs MSIX back to the PF's free MSIX pool.
This function can only be applied to PFs and can only be run when SR-IOV is not set on the host side (i.e., sriov_numvfs = 0).
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
nvme_controller_list
Provide a list of all active (created) NVMe controllers with their characteristics.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| nqn | No | String | Subsystem qualified name |
| ctrl | No | String | Only search for a specific controller |
nvme_controller_modify
This function allows the user to modify some of the controller's parameters in real time, after the controller has been created.
Modifications can only be made while the emulated function is in an idle state, i.e., when no driver is communicating with it.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | No | String | Controller name |
| num_queues | No | int | Number of queues for the controller |
| num_msix | No | int | Number of MSIX to be used for a controller. Relevant only for VF controllers (when the dynamic MSIX feature is enabled). |
nvme_controller_attach_ns
Attach a previously created NVMe namespace to given NVMe controller under the same subsystem.
The result in the response object returns true for success and false for failure.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| nsid | Yes | Number | Namespace ID |
nvme_controller_detach_ns
Detach a previously attached namespace with a given NSID from the NVMe controller.
The result in the response object returns true for success and false for failure.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
| nsid | Yes | Number | Namespace ID |
nvme_controller_dbg_io_stats_get
Get per-controller IO debug stats. The result in the response object returns true for success and false for failure.
Command parameters:
| Parameter | Mandatory? | Type | Description |
|---|---|---|---|
| ctrl | Yes | String | Controller name |
"ctrl_id": "NVMeCtrl2",
"queues": [
{
"queue_id": 0,
"core_id": 0,
"read_io_count": 19987068,
"write_io_count": 6319931,
"flush_io_count": 0
},
{
"queue_id": 1,
"core_id": 1,
"read_io_count": 9769556,
"write_io_count": 3180098,
"flush_io_count": 0
}
],
"read_io_count": 29756624,
"write_io_count": 9500029,
"flush_io_count": 0
}
NVMe Configuration Examples
NVMe Configuration for Single Controller
On the DPU:
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 263826ad-19a3-4feb-bc25-4bc81ee7749e
snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --pf_id 0 --suspended
snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
snap_rpc.py nvme_controller_resume -c NVMeCtrl1
It is necessary to create a controller in a suspended state. Afterward, the namespaces can be attached, and only then should the controller be resumed using the nvme_controller_resume RPC.
To safely detach/attach namespaces, the UUID must be provided to force the UUID to remain persistent.
NVMe Cleanup for Single Controller
snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl2 -n 1
snap_rpc.py nvme_controller_destroy -c NVMeCtrl2
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0
spdk_rpc.py bdev_nvme_detach_controller nvme0
NVMe and Hotplug Cleanup for Single Controller
snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl1 -n 1
snap_rpc.py emulation_device_detach_prepare --vuid MT2114X12200VBLKS1D0F0
snap_rpc.py nvme_controller_destroy -c NVMeCtrl1
snap_rpc.py emulation_device_detach --vuid MT2114X12200VBLKS1D0F0
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0
spdk_rpc.py bdev_nvme_detach_controller nvme0
NVMe Configuration for 125 VFs SR-IOV
Update the firmware configuration as described in section "SR-IOV Firmware Configuration".
Reboot the host.
Create a dummy controller on the parent PF:
[dpu] # snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
[dpu] # snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl1 --pf_id 0 --admin_only
Create 125 Bdevs (Remote or Local), 125 NSs and 125 controllers:
[dpu] for i in `seq 0 124`; do \
# spdk_rpc.py bdev_null_create null$((i+1)) 64 512;
# snap_rpc.py nvme_namespace_create -b null$((i+1)) -n $((i+1)) --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 3d9c3b54-5c31-410a-b4f0-7cf2afd9e$((i+100));
# snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl$((i+2)) --pf_id 0 --vf_id $i --suspended;
# snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl$((i+2)) -n $((i+1));
# snap_rpc.py nvme_controller_resume -c NVMeCtrl$((i+2));
done
Load the driver and configure VFs:
[host] # modprobe -v nvme
[host] # echo 125 > /sys/bus/pci/devices/0000\:25\:00.2/sriov_numvfs
snap_global_param_list
snap_global_param_list lists all existing environment variables.
The following is an example response for the snap_global_param_list command:
[
"SNAP_ENABLE_POLL_SKIP : set : 0 ",
"SNAP_POLL_CYCLE_SIZE : not set : 16 ",
"SNAP_RPC_LOG_ENABLE : set : 1 ",
"SNAP_MEMPOOL_SIZE_MB : set : 1024",
"SNAP_MEMPOOL_4K_BUFFS_PER_CORE : not set : 1024",
"SNAP_RDMA_ZCOPY_ENABLE : set : 1 ",
"SNAP_TCP_XLIO_ENABLE : not set : 1 ",
"SNAP_TCP_XLIO_TX_ZCOPY : not set : 1 ",
"MLX5_SHUT_UP_BF : not set : 0 ",
"SNAP_SHARED_RX_CQ : not set : 1 ",
"SNAP_SHARED_TX_CQ : not set : 1 ",
...