DOCA Documentation v3.1.0

SNAP-4 Service RPC Commands

The remote procedure call (RPC) protocol is used to control the SNAP service. Like other standard SPDK applications, NVMe/virtio-blk SNAP supports JSON-based RPC commands for creating, deleting, querying, and modifying resources easily from the CLI.

SNAP supports all standard SPDK RPC commands in addition to an extended SNAP-specific command set. SPDK standard commands are executed by the spdk_rpc.py tool while the SNAP-specific command set extension is executed by the snap_rpc.py tool.

Full documentation of the spdk_rpc.py command set can be found on the official SPDK documentation site.

The snap_rpc.py extended command set is detailed later in this chapter.

The JSON-based RPC protocol can be used via the snap_rpc.py script, which is installed inside the SNAP container and can be accessed using the crictl tool.

Info

The SNAP container is CRI-compatible.

  • To query the active container ID:

    crictl ps -s running -q --name snap

  • To post RPCs to the container using crictl:

    crictl exec <container-id> snap_rpc.py <RPC-method>

    For example:

    crictl exec 0379ac2c4f34c snap_rpc.py emulation_function_list

    In addition, an alias can be used:

    alias snap_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} snap_rpc.py "
    alias spdk_rpc.py="crictl ps -s running -q --name snap | xargs -I{} crictl exec -i {} spdk_rpc.py "

  • To open a bash shell to the container that can be used to post RPCs:

    crictl exec -it <container-id> bash

snap_log_level_set

SNAP allows dynamically changing the log level of the logger backend using the snap_log_level_set command. Any log at or below the requested level is shown.

Command parameters:

  • level (Number, mandatory) – Log level:

      • 0 – Critical

      • 1 – Error

      • 2 – Warning

      • 3 – Info

      • 4 – Debug

      • 5 – Trace
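The following sketch sets the level so that warnings and more severe messages are shown. The --level flag spelling is an assumption based on the parameter name above; verify with snap_rpc.py snap_log_level_set --help:

snap_rpc.py snap_log_level_set --level 2   # assumed flag spelling; level 2 = Warning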

PCIe Function Management

Emulated PCIe functions are managed through IB devices called emulation managers. Emulation managers are ordinary IB devices with special privileges to control PCIe communication and device emulations towards the host OS.

SNAP queries an emulation manager that supports the requested set of capabilities.

The emulation manager holds a list of the emulated PCIe functions it controls. PCIe functions may later be addressed in one of three ways:

  • vuid – recommended as it is guaranteed to remain constant (see appendix "PCIe BDF to VUID Translation" for details)

  • vhca_id

  • Function index (i.e., pf_id or vf_id)

emulation_function_list

emulation_function_list lists all existing functions.

The following is an example response for the emulation_function_list command:

[ { "hotplugged": true, "hotplug state": "POWER_ON", "emulation_type": "VBLK", "pf_index": 0, "pci_bdf": "87:00.0", "vhca_id": 5, "vuid": "MT2306XZ009TVBLKS1D0F0", "ctrl_id": "VblkCtrl1", "num_vfs": 0, "vfs": [] } ]

Note

Use -a or --all to also show inactive VF functions.

SNAP supports 2 types of PCIe functions:

  • Static functions – PCIe functions configured at the firmware configuration stage (physical and virtual). Refer to appendix "DPU Firmware Configuration" for additional information.

  • Hot-pluggable functions – PCIe functions configured dynamically at runtime. Users can add detachable functions. Refer to section "Hot-pluggable PCIe Functions Management" for additional information.

emulation_resource_list

This RPC retrieves information about the global resources available for emulated functions.

It is useful for verifying whether sufficient MSI-X or queue resources exist before creating a new hotplugged device.

The following is an example response for the emulation_resource_list command:

[ { "num_available_msix_resources": 9636, "num_available_queue_resources": 32728 } ]


Hot-pluggable PCIe Functions Management

Hotplug PCIe functions are configured dynamically at runtime using RPCs. Once a new PCIe function is hot plugged, it appears in the host's PCIe device list and remains persistent until explicitly unplugged or the system undergoes a cold reboot. Importantly, this persistence continues even if the SNAP process terminates. Therefore, it is advised not to include hotplug/hotunplug actions in automatic initialization scripts (e.g., snap_rpc_init.conf).

Note

Hotplug PFs do not support SR-IOV.

Virtio-blk Two-step PCIe Hotplug

The following RPC commands are used to dynamically add or remove PCIe PFs (i.e., hot-plugged functions) in the DPU application.

Once a PCIe function is created (via virtio_blk_function_create), it is accessible and manageable within the DPU application but is not immediately visible to the host OS/kernel. This differs from the legacy API, where creation and host exposure occur simultaneously. Instead, exposing or hiding PCIe functions to the host OS is managed by separate RPC commands (virtio_blk_controller_hotplug and virtio_blk_controller_hotunplug). After hot unplugging, the function can be safely removed from the DPU (using virtio_blk_function_destroy).

A key advantage of this approach is the ability to pre-configure a controller on the function, enabling it to serve the host driver as soon as it is exposed. In fact, users must create a controller to use the virtio_blk_controller_hotplug API, which is required to make the function visible to the host OS.

virtio_blk_controller_hotplug and virtio_blk_controller_hotunplug also have an argument named wait_for_done. When this argument is set, the RPC response blocks until either the host acknowledges the action and adds/removes the PCIe function from its list, or the host turns out to be temporarily unavailable. If not set, it is the user's responsibility to validate the function's hotplug state (which can be queried using the emulation_function_list RPC).

Note

It is generally advised to use the wait_for_done flag whenever a single hotplug/unplug operation is performed. However, when performing multiple hotplug/unplug operations at once, a more time-efficient approach is to issue all actions first (without wait_for_done) and then validate their hotplug status altogether using a single RPC call at the end, as sketched below.
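A minimal sketch of that batched flow, assuming controllers VblkCtrl1 through VblkCtrl3 have already been created on hotpluggable functions:

# Issue all hotplug operations without blocking
for c in VblkCtrl1 VblkCtrl2 VblkCtrl3; do
    snap_rpc.py virtio_blk_controller_hotplug -c $c
done
# Validate all hotplug states together with a single query
snap_rpc.py emulation_function_list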

In some cases, the host might become temporarily unavailable to accept new PCIe hotplug/unplug operations (typically during host OS reboot, until PCIe device enumeration has taken place by the host kernel). Throughout this timeframe, any attempt to change the PCIe device list (by hotplugging/unplugging PCIe functions) is blocked by firmware. For clarity, using the wait_for_done flag when hotplugging/unplugging a PCIe function does not imply that the software retries the operation until successful while the host is unavailable; it is the RPC user's responsibility to actively retry (if desired) after such a failure.

Note

The host is considered temporarily unavailable until PCIe device enumeration has taken place by the host OS.

virtio_blk_function_create

Create a new virtio-blk emulation function.

Command parameters:

  • manager (String, optional) – Emulation manager to manage the hotplug function (unused)


virtio_blk_function_destroy

Delete an existing virtio-blk emulation function.

Command parameters:

  • vuid (String, mandatory) – Identifier of the hotplugged function to delete


virtio_blk_controller_hotplug

Exposes (hot plugs) the emulation function to the host OS.

Command parameters:

  • ctrl (String, mandatory) – Controller to expose to the host OS

  • wait_for_done (Bool, optional) – Block until the host discovers and acknowledges the new function


virtio_blk_controller_hotunplug

Removes (hot unplugs) the emulation function from the host OS.

Command parameters:

  • ctrl (String, mandatory) – Controller to remove from the host OS

  • wait_for_done (Bool, optional) – Block until the host identifies and removes the function

Note

When not using the wait_for_done approach, it is the user's responsibility to verify that the host has identified the new hotplugged function. This can be done by querying the pci_hotplug_state parameter in the emulation_function_list RPC output.
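For instance, a targeted check of a single function's state might look as follows. The jq filter and the "hotplug state" field name follow the emulation_function_list example earlier in this chapter; some releases may report the field as pci_hotplug_state:

# Check the hotplug state of the function identified by its VUID (VUID is illustrative)
snap_rpc.py emulation_function_list | jq '.[] | select(.vuid == "MT2306XZ009TVBLKS1D0F0") | .["hotplug state"]'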


Virtio-blk Two-step PCIe Hotplug/Unplug Example

# Bringup
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_function_create
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1
snap_rpc.py virtio_blk_controller_hotplug -c VblkCtrl1 --wait_for_done

# Cleanup
snap_rpc.py virtio_blk_controller_hotunplug -c VblkCtrl1 --wait_for_done
snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
snap_rpc.py virtio_blk_function_destroy --vuid MT2114X12200VBLKS1D0F0
spdk_rpc.py bdev_nvme_detach_controller nvme0

(Deprecated) Virtio-blk PCIe Hotplug Legacy API

Info

Support for the legacy API ends on January 31, 2026.

Hotplug Commands

The following commands hot plug a new Virtio-blk PCIe function to the system.

After a new PCIe function is plugged, it immediately appears in the host's PCIe device list and remains there until it is either explicitly unplugged or the system goes through a cold reboot. It is therefore the user's responsibility to open a controller instance to manage the new function immediately after the function's creation. Keeping a hotplugged function without a matching controller to manage it may cause anomalous behavior on the host OS driver.

virtio_blk_emulation_device_attach

Attach virtio-blk emulation function.

Command parameters:

  • id (Number, optional) – Device ID

  • vid (Number, optional) – Vendor ID

  • ssid (Number, optional) – Subsystem device ID

  • ssvid (Number, optional) – Subsystem vendor ID

  • revid (Number, optional) – Revision ID

  • class_code (Number, optional) – Class code

  • num_msix (Number, optional) – MSI-X table size

  • total_vf (Number, optional) – Maximal number of VFs allowed

  • bdev (String, optional) – Block device to use as backend

  • num_queues (Number, optional) – Number of IO queues (default 1, range 1-62)

    Note: The actual number of queues is limited by the number of queues supported by the hardware.

    Tip: It is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt).

  • queue_depth (Number, optional) – Queue depth (default 256, range 1-256)

    Note: It is only possible to modify the queue depth if the driver is not loaded.

  • transitional_device (Boolean, optional) – Transitional device support. See section "Virtio-blk Transitional Device Support" for more details.

  • dbg_bdev_type (Boolean, optional) – N/A, not supported

Hot Unplug Commands

The following commands hot-unplug a PCIe function from the system in two steps:

  1. emulation_device_detach_prepare – prepare the emulation function to be detached

  2. emulation_device_detach – detach the emulation function

emulation_device_detach_prepare

This is the first step for detaching an emulation device. It prepares the system to detach a hot plugged emulation function. In case of success, the host's hotplug device state changes and you may safely proceed to the emulation_device_detach command.

The controller attached to the emulation function must be created and active when executing this command.

Command parameters:

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • vuid (String, optional) – PCIe device VUID

  • ctrl (String, optional) – Controller ID

Note

At least one identifier must be provided to describe the PCIe function to be detached.


emulation_device_detach

This is the second step which completes detaching of the hotplugged emulation function. If the detach preparation times out, you may perform a surprise unplug using --force with the command.

Note

The driver must be unbound from the device; otherwise, errors may occur.

Command parameters:

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • vuid (String, optional) – PCIe device VUID

  • force (Boolean, optional) – Detach even if preparation failed

Note

At least one identifier must be provided to describe the PCIe function to be detached.

Virtio-blk Hot Plug/Unplug Example

# Bringup
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_emulation_device_attach
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1

# Cleanup
snap_rpc.py emulation_device_detach_prepare --vuid MT2114X12200VBLKS1D0F0
snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
snap_rpc.py emulation_device_detach --vuid MT2114X12200VBLKS1D0F0
spdk_rpc.py bdev_nvme_detach_controller nvme0

NVMe PCIe Hotplug API

Hotplug Legacy Commands

Note

The two-step API is not yet supported for the NVMe protocol.

The following commands perform hotplug operations for a new NVMe PCIe function.

Once a PCIe function is attached, it appears immediately in the host's PCIe device list and remains there until explicitly detached or until a cold reboot occurs. It is the user's responsibility to create and activate a controller instance to manage the new function immediately after attachment. Leaving a hotplugged function unmanaged may cause anomalous behavior in the host OS driver.

nvme_emulation_device_attach

Attaches an NVMe emulation function.

Command parameters:

  • id (Number, optional) – Device ID

  • vid (Number, optional) – Vendor ID

  • ssid (Number, optional) – Subsystem device ID

  • ssvid (Number, optional) – Subsystem vendor ID

  • revid (Number, optional) – Revision ID

  • class_code (Number, optional) – Class code

  • num_msix (Number, optional) – MSI-X table size

  • total_vf (Number, optional) – Maximal number of VFs allowed

  • num_queues (Number, optional) – Number of IO queues (default 31, range 1-31)

    Note: The actual number of queues is limited by the number of queues supported by the hardware.

    Tip: It is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt).

  • version (String, optional) – Specification version (currently only 1.4 is supported)

Hot Unplug Legacy Commands

The following commands hot-unplug a PCIe function from the system in two steps:

  1. emulation_device_detach_prepare – prepare the emulation function to be detached

  2. emulation_device_detach – detach the emulation function

emulation_device_detach_prepare

Prepares the system to detach a hotplugged emulation function.

This is the first step in the detachment sequence. Upon success, the device enters a safe state for removal, allowing you to proceed with the emulation_device_detach command.

Note

A controller must be active and attached to the emulation function before executing this command.

Command parameters:

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • vuid (String, optional) – PCIe device VUID

  • ctrl (String, optional) – Controller ID

Note

At least one identifier must be provided to describe the PCIe function to be detached.


emulation_device_detach

Completes the detachment of a hotplugged emulation function.

If the preparation phase (emulation_device_detach_prepare) times out, use the --force option to perform a surprise unplug.

Note

Ensure the driver has been properly unbound from the device before running this command, or errors may occur.

Command parameters:

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • vuid (String, optional) – PCIe device VUID

  • force (Boolean, optional) – Detach even if preparation failed

Note

At least one identifier must be provided to describe the PCIe function to be detached.

NVMe Hot Plug/Unplug Example

# Bringup
spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 263826ad-19a3-4feb-bc25-4bc81ee7749e
snap_rpc.py nvme_emulation_device_attach
snap_rpc.py nvme_controller_create --vuid MT240830045RNVMES1D0F0 --nqn nqn.2022-10.io.nvda.nvme:0 --suspended
snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
snap_rpc.py nvme_controller_resume -c NVMeCtrl1

# Cleanup
snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl1 -n 1
snap_rpc.py emulation_device_detach_prepare --vuid MT240830045RNVMES1D0F0
snap_rpc.py nvme_controller_destroy -c NVMeCtrl1
snap_rpc.py emulation_device_detach --vuid MT240830045RNVMES1D0F0
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0
spdk_rpc.py bdev_nvme_detach_controller nvme0

Info

End of support for SPDK bdev management is January 2026.

The following RPCs are deprecated and are no longer supported:

  • spdk_bdev_create

  • spdk_bdev_destroy

  • bdev_list

These RPCs were optional. If not performed, SNAP would automatically generate SNAP block devices (bdevs).

Virtio-blk emulation is a storage protocol belonging to the virtio family of devices. These devices are found in virtual environments yet by design look like physical devices to the user within the virtual machine.

Each virtio-blk device (e.g., virtio-blk PCIe entry) exposed to the host, whether it is PF or VF, must be backed by a virtio-blk controller.

Note

Virtio-blk limitations:

  • Probing a virtio-blk driver on the host without an already functioning virtio-blk controller may cause the host to hang until such controller is opened successfully (no timeout mechanism exists).

  • Upon creation of a virtio-blk controller, a backend device must already exist.

Virtio-blk Emulation Management Commands

virtio_blk_controller_create

Create a new SNAP-based virtio-blk controller over a specific PCIe function on the host. The PCIe function on which to open the controller must be specified using one of the identifiers described in section "PCIe Function Management":

  1. vuid (recommended, as it is guaranteed to remain constant).

  2. vhca_id.

  3. Function index – pf_id, vf_id.

The mapping for pci_index can be queried by running emulation_function_list.
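For example, opening a controller by its VUID, in the same form used in the configuration examples later in this section:

snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1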

Command parameters:

  • vuid (String, optional) – PCIe device VUID

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • pf_id (Number, optional) – PCIe PF index to start emulation on

  • vf_id (Number, optional) – PCIe VF index to start emulation on (if the controller is meant to be opened on a VF)

  • pci_bdf (String, optional) – PCIe device BDF

  • ctrl (String, optional) – Controller ID

  • num_queues (Number, optional) – Number of IO queues (default 1, range 1-256)

    Tip: It is recommended that the number of MSIX be greater than the number of IO queues (1 is used for the config interrupt). The effective num_msix value can be queried from the virtio_blk_controller_list RPC and later aligned using the virtio_blk_controller_modify RPC.

  • queue_size (Number, optional) – Queue size (default 256, range 1-256). Must be a power of 2.

  • size_max (Number, optional) – Maximal SGE data transfer size (default 65536 bytes, range 4K-2M)

    Note: SNAP's internal memory buffer pool supports a maximum allocation size of 2 MB. This should guide the user when setting the combined value of size_max * seg_max; for example, the defaults 65536 * 32 equal exactly the 2 MB pool limit. This limit is not strictly enforced. If the backend supports a zero-copy datapath, users may set size_max beyond 2 MB based on their performance or application needs.

  • seg_max (Number, optional) – Maximal SGE list length (default 32, range 1-queue_size)

    Note: seg_max is implicitly upper-bounded by the virtio queue size in the virtio-blk specification. As such, any provided seg_max value larger than queue_size is silently clamped down to equal queue_size.

  • bdev (String, optional) – SNAP SPDK block device to use as backend

  • vblk_id (String, optional) – Serial number for the controller

  • admin_q (0/1, optional) – Enables live migration and NVIDIA vDPA

    Note: This field is only applicable to controllers created on physical functions (PFs).

  • dynamic_msix (0/1, optional) – Dynamic MSIX for SR-IOV VFs on this PF. Only valid for PFs.

  • vf_num_msix (Number, optional) – Controls the number of MSIX tables to associate with this controller. Valid only for VFs (whose parent PF controller is created using the --dynamic_msix option) and only when the dynamic MSIX management feature is enabled. Must be an even number ≥ 2.

    Note: This field is mandatory when the VF's MSIX is reclaimed using virtio_blk_controller_vfs_msix_reclaim or released using --release_msix on virtio_blk_controller_destroy.

  • force_in_order (0/1, optional) – Support virtio-blk crash recovery. Setting this parameter to 1 may impact virtio-blk performance (default is 0). For more information, refer to section "Virtio-blk Crash Recovery".

  • indirect_desc (0/1, optional) – Enables indirect descriptor support for the controller's virtqueues.

    Note: When using the virtio-blk kernel driver, if indirect descriptors are enabled, the driver always uses them. Using indirect descriptors for all IO traffic patterns may hurt performance in most cases.

  • read_only (0/1, optional) – Creates a read-only virtio-blk controller

  • suspended (0/1, optional) – Creates the controller in suspended state

  • live_update_listener (0/1, optional) – Creates the controller with the ability to listen for live update notifications via IPC

  • dbg_bdev_type (0/1, optional) – N/A, not supported

  • dbg_local_optimized (0/1, optional) – N/A, not supported

  • latency_optimized (0/1, optional) – Creates the controller with (IO read) latency optimization

    Note: This latency optimization applies only to non-ZCOPY flows. Using this flag may reduce bandwidth performance.

Example response:

{ "jsonrpc": "2.0", "id": 1, "result": "VblkCtrl1" }


virtio_blk_controller_destroy

Destroy a previously created virtio-blk controller. The controller can be uniquely identified by the controller's name as acquired from virtio_blk_controller_create().

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • force (Boolean, optional) – Force destroying a VF controller for SR-IOV


virtio_blk_controller_suspend

While suspended, the controller stops receiving new requests from the host driver and only finishes handling of requests already in flight. All suspended requests (if any) are processed after resume.

Command parameters:

  • ctrl (String, mandatory) – Controller name


virtio_blk_controller_resume

Resume a previously suspended controller. While suspended, the controller stops receiving new requests from the host driver and only finishes handling requests already in flight; the resume command restores the controller's handling of IOs.

Command parameters:

  • ctrl (String, mandatory) – Controller name


virtio_blk_controller_bdev_attach

Attach the specified bdev to a virtio-blk SNAP controller. It is possible to change the serial ID (using the vblk_id parameter) when a new bdev is attached.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • bdev (String, mandatory) – Block device name

  • vblk_id (String, optional) – Serial number for the controller


virtio_blk_controller_bdev_detach

The bdev of a virtio-blk controller may be replaced. First, detach the bdev from the controller. While the bdev is detached, the controller stops receiving new requests from the host driver (i.e., it is suspended) and only finishes handling requests already in flight.

At this point, you may attach a new bdev or destroy the controller.

When a new bdev is attached, the controller resumes handling all outstanding I/Os.

Note

The block size cannot be changed if the driver is loaded.

bdev may be replaced with a different block size if the driver is not loaded.

Note

A controller with no bdev attached is considered to be in a temporary state, in which the controller is not fully operational and may not respond to some actions requested by the driver.

If there is no imminent intention to call virtio_blk_controller_bdev_attach, it is advised to attach a none bdev instead. For example:

snap_rpc.py virtio_blk_controller_bdev_attach -c VblkCtrl1 --bdev none --dbg_bdev_type null

Command parameters:

  • ctrl (String, mandatory) – Controller name


virtio_blk_controller_list

List virtio-blk SNAP controllers.

Command parameters:

  • ctrl (String, optional) – Controller name

Example response:

{ "ctrl_id": "VblkCtrl2", "vhca_id": 38, "num_queues": 4, "queue_size": 256, "seg_max": 32, "size_max": 65536, "bdev": "Nvme1", "plugged": true, "indirect_desc": true, "num_msix": 2, "min configurable num_msix": 2, "max configurable num_msix": 32 }


virtio_blk_controller_modify

This function allows the user to modify some of the controller's parameters in real time, after the controller has already been created.

Modifications can only be made when the emulated function is in an idle state, i.e., when no driver is communicating with it.

Command parameters:

  • ctrl (String, optional) – Controller name

  • num_queues (int, optional) – Number of queues for the controller

  • num_msix (int, optional) – Number of MSIX to be used for the controller. Relevant only for VF controllers (when the dynamic MSIX feature is enabled).

Note

The standard virtio-blk kernel driver currently does not support PCIe FLR. As such, the driver must be unbound from the function to bring it to an idle state before modifying it.

virtio_blk_controller_dbg_io_stats_get

Debug counters are per-controller I/O statistics that can help in understanding the I/O distribution between the controller's queues and the total I/O received by the controller.

Command parameters:

  • ctrl (String, mandatory) – Controller name

Example response:

"ctrl_id": "VblkCtrl2", "queues": [ { "queue_id": 0, "core_id": 0, "read_io_count": 19987068, "write_io_count": 6319931, "flush_io_count": 0 }, { "queue_id": 1, "core_id": 1, "read_io_count": 9769556, "write_io_count": 3180098, "flush_io_count": 0 } ], "read_io_count": 29756624, "write_io_count": 9500029, "flush_io_count": 0 }


virtio_blk_controller_dbg_debug_stats_get

Debug counters are per-controller debug statistics that can help in assessing the health and status of the controller and its queues.

Command parameters:

  • ctrl (String, mandatory) – Controller name

Example response:

{ "ctrl_id": "VblkCtrl1", "queues": [ { "qid": 0, "state": "RUNNING", "hw_available_index": 6, "sw_available_index": 6, "hw_used_index": 6, "sw_used_index": 6, "hw_received_descs": 13, "hw_completed_descs": 13 }, { "qid": 1, "state": "RUNNING", "hw_available_index": 2, "sw_available_index": 2, "hw_used_index": 2, "sw_used_index": 2, "hw_received_descs": 6, "hw_completed_descs": 6 }, { "qid": 2, "state": "RUNNING", "hw_available_index": 0, "sw_available_index": 0, "hw_used_index": 0, "sw_used_index": 0, "hw_received_descs": 4, "hw_completed_descs": 4 }, { "qid": 3, "state": "RUNNING", "hw_available_index": 0, "sw_available_index": 0, "hw_used_index": 0, "sw_used_index": 0, "hw_received_descs": 3, "hw_completed_descs": 3 } ] }


virtio_blk_controller_vfs_msix_reclaim

Reclaim virtio-blk SNAP controller VFs MSIX back to the free MSIX pool. Valid only for PFs.

Command parameters:

  • ctrl (String, mandatory) – Controller name

Virtio-blk Configuration Examples

Virtio-blk Configuration for Single Controller

spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0 --bdev nvme0n1


Virtio-blk Cleanup for Single Controller

snap_rpc.py virtio_blk_controller_destroy -c VblkCtrl1
spdk_rpc.py bdev_nvme_detach_controller nvme0


Virtio-blk Dynamic Configuration For 125 VFs

  1. Update the firmware configuration as described in section "SR-IOV Firmware Configuration".

  2. Reboot the host.

  3. Run:

    [dpu] spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
    [dpu] snap_rpc.py virtio_blk_controller_create --vuid MT2114X12200VBLKS1D0F0
    [dpu] for i in `seq 0 124`; do snap_rpc.py virtio_blk_controller_create --pf_id 0 --vf_id $i --bdev nvme0n1; done;

    [host] modprobe -v virtio-pci && modprobe -v virtio-blk
    [host] echo 125 > /sys/bus/pci/devices/0000:86:00.3/sriov_numvfs

    Note

    When SR-IOV is enabled, it is recommended to destroy virtio-blk controllers on VFs using the following command rather than the virtio_blk_controller_destroy RPC command:

    [host] echo 0 > /sys/bus/pci/devices/0000:86:00.3/sriov_numvfs

    To destroy a single virtio-blk controller, run:

    [dpu] ./snap_rpc.py -t 1000 virtio_blk_controller_destroy -c VblkCtrl5 -f

Virtio-blk Suspend, Resume Example

[host] # Run fio
[dpu] snap_rpc.py virtio_blk_controller_suspend -c VBLKCtrl1
[host] # IOs are suspended
[dpu] snap_rpc.py virtio_blk_controller_resume -c VBLKCtrl1
[host] # fio resumes sending IOs


Virtio-blk Bdev Attach, Detach Example

[host] # Run fio
[dpu] snap_rpc.py virtio_blk_controller_bdev_detach -c VBLKCtrl1
[host] # The bdev is detached and IOs are suspended
[dpu] snap_rpc.py virtio_blk_controller_bdev_attach -c VBLKCtrl1 --bdev null2
[host] # The null2 bdev is attached to the controller and fio resumes sending IOs


Notes

  • Virtio-blk protocol controller supports one backend device only

  • Virtio-blk protocol does not support administration commands to add backends. Thus, all backend attributes are communicated to the host virtio-blk driver over PCIe BAR and must be accessible during driver probing. Therefore, backends can only be changed while the PCIe function is not in use by any host storage driver.

NVMe Subsystem

The NVMe subsystem as described in the NVMe specification is a logical entity which encapsulates sets of NVMe backends (or namespaces) and connections (or controllers). NVMe subsystems are extremely useful when working with multiple NVMe controllers especially when using NVMe VFs. Each NVMe subsystem is defined by its serial number (SN), model number (MN), and qualified name (NQN) after creation.

The RPCs listed in this section control the creation and destruction of NVMe subsystems.

NVMe Namespace

NVMe namespaces are representations of a continuous range of LBAs in local/remote storage. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem (e.g., two namespaces cannot share the same NSID even if they are linked to different controllers).

After creation, NVMe namespaces can be attached to a controller.

Note

SNAP does not currently support shared namespaces between different controllers, so each namespace should be attached to a single controller.

The SNAP application uses an SPDK block device framework as a backend for its NVMe namespaces. Therefore, they should be configured in advance. For more information about SPDK block devices, see SPDK bdev documentation and Appendix SPDK Configuration.

NVMe Controller

Each NVMe device (e.g., NVMe PCIe entry) exposed to the host, whether it is a PF or VF, must be backed by an NVMe controller, which is responsible for all protocol communication with the host's driver.

Every new NVMe controller must also be linked to an NVMe subsystem. After creation, NVMe controllers can be addressed using either their name (e.g., Nvmectrl1) or both their subsystem NQN and controller ID.

Attaching NVMe Namespace to NVMe Controller

After creating an NVMe controller and an NVMe namespace under the same subsystem, the nvme_controller_attach_ns method is used to attach the namespace to the controller.
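For example, exactly as it appears in the configuration examples later in this chapter:

snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1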

NVMe Emulation Management Commands

nvme_subsystem_create

Create a new NVMe subsystem to be controlled by one or more NVMe SNAP controllers. An NVMe subsystem includes one or more controllers, zero or more namespaces, and one or more ports. An NVMe subsystem may include a non-volatile memory storage medium and an interface between the controller(s) in the NVMe subsystem and non-volatile memory storage medium.

Command parameters:

  • nqn (String, mandatory) – Subsystem qualified name

  • serial_number (String, optional) – Subsystem serial number (maximum string length is 20 characters)

  • model_number (String, optional) – Subsystem model number

  • nn (Number, optional) – Maximal namespace ID allowed in the subsystem (default 0xFFFFFFFE; range 1-0xFFFFFFFE)

  • mnan (Number, optional) – Maximal number of namespaces allowed in the subsystem (default 1024; range 1-0xFFFFFFFE)

  • single_ctrl (Boolean, optional) – Restrict the subsystem to a single controller

Example request:

{ "jsonrpc": "2.0", "id": 1, "method": "nvme_subsystem_create", "params": { "nqn": "nqn.2022-10.io.nvda.nvme:0" } }


nvme_subsystem_destroy

Destroy a previously created NVMe SNAP subsystem.

Command parameters:

  • nqn (String, mandatory) – Subsystem qualified name

  • force (Bool, optional) – Force the deletion of all the controllers and namespaces under the subsystem


nvme_subsystem_list

List NVMe subsystems.

nvme_namespace_create

Create new NVMe namespaces that represent a continuous range of LBAs in the previously configured bdev. Each namespace must be linked to a subsystem and have a unique identifier (NSID) across the entire NVMe subsystem.

Command parameters:

  • nqn (String, mandatory) – Subsystem qualified name

  • bdev_name (String, mandatory) – Block device to use as backend

  • nsid (Number, mandatory) – Namespace ID

  • uuid (Number, optional) – Namespace UUID

    Note: To safely detach/attach namespaces, the UUID should be provided to force the UUID to remain persistent.

  • dbg_bdev_type (String, optional) – Bdev plugin to be attached with the controller. Refer to section "Bdev" for more information.
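For example, the following invocation (taken from the configuration examples later in this chapter) creates namespace 1 on bdev nvme0n1 with a persistent UUID:

snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 263826ad-19a3-4feb-bc25-4bc81ee7749e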


nvme_namespace_destroy

Destroy a previously created NVMe namespace.

Command parameters:

  • nqn (String, mandatory) – Subsystem qualified name

  • nsid (Number, mandatory) – Namespace ID


nvme_namespace_list

List NVMe SNAP namespaces.

Command parameters:

  • nqn (String, optional) – Subsystem qualified name


nvme_controller_create

Creates a new SNAP-based NVMe block controller over a specific PCIe function on the host.

To specify the PCIe function to open the controller upon, pci_index must be provided. The mapping for pci_index can be queried by running emulation_function_list.

Command parameters:

  • nqn (String, mandatory) – Subsystem qualified name

  • vuid (Number, optional) – VUID of PCIe function

  • pf_id (Number, optional) – PCIe PF index to start emulation on

  • vf_id (Number, optional) – PCIe VF index to start emulation on (if the controller is destined to be opened on a VF)

  • pci_bdf (String, optional) – PCIe BDF to start emulation on

  • vhca_id (Number, optional) – vHCA ID of PCIe function

  • ctrl (Number, optional) – Controller ID

  • num_queues (Number, optional) – Number of IO queues (default 1, range 1-31)

    Note: Limited by hardware support.

    Tip: It is recommended for the number of MSIX to be greater than the number of IO queues.

  • mdts (Number, optional) – MDTS (default 7, range 1-7)

  • fw_slots (Number, optional) – Maximum number of firmware slots (default 4)

  • write_zeroes (0/1, optional) – Enable the optional write_zeroes NVMe command

  • compare (0/1, optional) – Set the value of the compare support bit in the controller

  • compare_write (0/1, optional) – Set the value of the compare_write support bit in the controller

    Note: During crash recovery, all compare and write commands are expected to fail.

  • deallocate_dsm (0/1, optional) – Set the value of the DSM (dataset management) support bit in the controller. Only the deallocate DSM request is currently supported.

  • suspended (0/1, optional) – Open the controller in suspended state (requires nvme_controller_resume to become active). Recommended for recovery scenarios or when creating the controller while the driver is loaded.

  • snapshot (String, optional) – Create a controller from a snapshot file path (the snapshot must be taken using nvme_controller_snapshot_get).

  • dynamic_msix (0/1, optional) – Enable dynamic MSIX management for the controller (default 0). Applies only for PFs.

  • vf_num_msix (Number, optional) – Number of MSIX tables to associate with this controller. Valid only for VFs whose parent PF controller was created with --dynamic_msix, and when dynamic MSIX management is enabled.

    Note: Mandatory if the VF's MSIX is reclaimed using nvme_controller_vfs_msix_reclaim or released using --release_msix on nvme_controller_destroy.

  • admin_only (0/1, optional) – Creates an NVMe controller with admin queues only (no IO queues)

  • quirks (Number, optional) – Bitmask to support buggy/non-compliant drivers:

      • Bit 0 – Send "Namespace Attribute Changed" async event, even if disabled by driver

      • Bit 1 – Keep sending these events even without a "Changed Namespace List" Get Log Page request

      • Bit 2 – Reserved

      • Bit 3 – Force-enable "Namespace Management capability" NVMe OACS even if unsupported

      • Bit 4 – Disable Scatter-Gather Lists support. See "OS Issues" section for details.

  • queue_core_alloc_policy (String, optional) – Core allocation policy: "pre-allocation" (predictable, but may be imbalanced if the active queues differ from the configured ones) or "on-demand" (auto-balances at runtime, unpredictable upfront). Default: "on-demand".

    Note: Core-to-queue mapping is not persistent across recovery/live update/migration.

  • admin_identify_cmd_fwd (String, optional) – Enable forwarding of Identify admin commands with selected CNS values (comma-separated hex with 0x prefix) from SNAP to the upper application. Example: --admin_identify_cmd_fwd 0x0,0x1,0x13

  • admin_set_features_cmd_fwd (String, optional) – Enable forwarding of Set Features admin commands with selected FID values (comma-separated hex with 0x prefix) from SNAP to the upper application. Example: --admin_set_features_cmd_fwd 0xc0,0xe4

Note

If not explicitly configured, the SNAP NVMe controller enables an optional NVMe command only when all namespaces attached at driver load time support it. To override this behavior, set the corresponding optional command support flag explicitly. For example, creating a controller with --compare 0 disables the Compare command, even if all attached namespaces support it.

Example request:

{ "jsonrpc": "2.0", "id": 1, "method": "nvme_controller_create", "params": { "nqn": "nqn.2022-10.io.nvda.nvme:0", "pf_id": 0, "num_queues": 8, } }


nvme_controller_destroy

Destroy a previously created NVMe controller. The controller can be uniquely identified by a controller name as acquired from nvme_controller_create.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • release_msix (1/0, optional) – Release MSIX back to the free pool. Applies only for VFs.

  • force (1/0, optional) – Forcefully destroy the controller (without this flag set, SNAP may block deletion if the driver is up)

Note

After a controller is destroyed with the --force flag set, SNAP is not guaranteed to perform a successful recovery afterward. It is the user's responsibility to make sure the driver is either properly cleaned up or the controller is re-created with the exact same configuration before resuming the SNAP controller.


nvme_controller_suspend

While suspended, the controller stops handling new requests from the host driver. All pending requests (if any) will be processed after resume.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • timeout_ms (Number, optional) – Suspend timeout

    Note: The maximum time in milliseconds to let the controller finish all inflight I/Os before discarding them. If timeout_ms is not provided, the operation waits until the I/Os complete, without a timeout on the SNAP layer.

  • admin_only (0/1, optional) – Suspend only the admin queue

  • live_update_notifier (0/1, optional) – Send a live update notification via IPC


nvme_controller_resume

The resume command continues the previously suspended controller's handling of new requests sent by the driver. If the controller was created in suspended mode, resume is also used to start the initial communication with the host driver.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • live_update (0/1, optional) – Live update resume


nvme_controller_snapshot_get

Take a snapshot of the current state of the controller and dump it into a file. This file may be used to create a controller based on this snapshot. For the snapshot to be consistent, users should call this function only when the controller is suspended (see nvme_controller_suspend RPC).

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • filename (String, mandatory) – File path
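A typical suspend-snapshot-resume sequence, sketched with the RPCs from this section; the snapshot path is illustrative, and the --filename flag spelling is assumed from the parameter name above:

snap_rpc.py nvme_controller_suspend -c NVMeCtrl1
snap_rpc.py nvme_controller_snapshot_get -c NVMeCtrl1 --filename /tmp/NVMeCtrl1.snapshot   # path is an example
snap_rpc.py nvme_controller_resume -c NVMeCtrl1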


nvme_controller_vfs_msix_reclaim

Reclaims all VFs MSIX back to the PF's free MSIX pool.

This function can only be applied on PFs and can only be run when SR-IOV is not set on host side (i.e., sriov_numvfs = 0).

Command parameters:

  • ctrl (String, mandatory) – Controller name
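Since the RPC requires sriov_numvfs to be 0, a sketch of the host-side precondition followed by the reclaim; the PCIe address is illustrative, and -c is assumed to map to the ctrl parameter as elsewhere in this chapter:

[host] echo 0 > /sys/bus/pci/devices/0000:25:00.2/sriov_numvfs   # example BDF
[dpu] snap_rpc.py nvme_controller_vfs_msix_reclaim -c NVMeCtrl1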


nvme_controller_list

Provide a list of all active (created) NVMe controllers with their characteristics.

Command parameters:

  • nqn (String, optional) – Subsystem qualified name

  • ctrl (String, optional) – Only search for a specific controller


nvme_controller_modify

This function allows the user to modify some of the controller's parameters in real time, after the controller has already been created.

Modifications can only be made when the emulated function is in an idle state, i.e., when no driver is communicating with it.

Command parameters:

  • ctrl (String, optional) – Controller name

  • num_queues (int, optional) – Number of queues for the controller

  • num_msix (int, optional) – Number of MSIX to be used for the controller. Relevant only for VF controllers (when the dynamic MSIX feature is enabled).


nvme_controller_attach_ns

Attach a previously created NVMe namespace to given NVMe controller under the same subsystem.

The result in the response object returns true for success and false for failure.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • nsid (Number, mandatory) – Namespace ID


nvme_controller_detach_ns

Detach a previously attached namespace with a given NSID from the NVMe controller.

The result in the response object returns true for success and false for failure.

Command parameters:

  • ctrl (String, mandatory) – Controller name

  • nsid (Number, mandatory) – Namespace ID


nvme_controller_dbg_debug_stats_get

Display debug counters for an NVMe controller, as follows:

BAR registers:

  • CC – Controller Configuration register

  • CSTS – Controller Status register

Submission queue (SQS) fields:

  • qid – Queue ID

  • depth – Queue depth

  • cqid – Associated completion queue ID

  • state – Current queue state

  • driver_sq_tail – Value of the SQ Tail Doorbell register as written by the driver

  • device_sq_head – Value of the SQ Head pointer within the device

  • Requests – Total number of requests received on this queue

Completion queue (CQS) fields:

  • qid – Queue ID

  • depth – Queue depth

  • state – Current queue state

  • driver_cq_head – Value of the CQ Head Doorbell register as written by the driver

  • device_cq_tail – Value of the CQ Tail pointer within the device

  • Completions – Total number of completions sent on this queue

Command parameters:

  • ctrl (String, mandatory) – Controller name

nvme_controller_dbg_io_stats_get

Display per-queue and total I/O counters for an NVMe controller. The result in the response object returns true for success and false for failure.

Command parameters:

  • ctrl (String, mandatory) – Controller name

Example response:

"ctrl_id": "NVMeCtrl2", "queues": [ { "queue_id": 0, "core_id": 0, "read_io_count": 19987068, "write_io_count": 6319931, "flush_io_count": 0 }, { "queue_id": 1, "core_id": 1, "read_io_count": 9769556, "write_io_count": 3180098, "flush_io_count": 0 } ], "read_io_count": 29756624, "write_io_count": 9500029, "flush_io_count": 0 }

NVMe Configuration Examples

NVMe Configuration for Single Controller

On the DPU:

spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2022-10.io.nvda.nvme:swx-storage
snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 263826ad-19a3-4feb-bc25-4bc81ee7749e
snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --pf_id 0 --suspended
snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
snap_rpc.py nvme_controller_resume -c NVMeCtrl1

Note

It is necessary to create a controller in a suspended state. Afterward, the namespaces can be attached, and only then should the controller be resumed using the nvme_controller_resume RPC.

Note

To safely detach/attach namespaces, the UUID must be provided to force the UUID to remain persistent.


NVMe Cleanup for Single Controller

snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl2 -n 1
snap_rpc.py nvme_controller_destroy -c NVMeCtrl2
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0
spdk_rpc.py bdev_nvme_detach_controller nvme0


NVMe and Hotplug Cleanup for Single Controller

snap_rpc.py nvme_controller_detach_ns -c NVMeCtrl1 -n 1
snap_rpc.py emulation_device_detach_prepare --vuid MT240830045RNVMES1D0F0
snap_rpc.py nvme_controller_destroy -c NVMeCtrl1
snap_rpc.py emulation_device_detach --vuid MT240830045RNVMES1D0F0
snap_rpc.py nvme_namespace_destroy -n 1 --nqn nqn.2022-10.io.nvda.nvme:0
snap_rpc.py nvme_subsystem_destroy --nqn nqn.2022-10.io.nvda.nvme:0
spdk_rpc.py bdev_nvme_detach_controller nvme0


NVMe Configuration for 125 VFs SR-IOV

  1. Update the firmware configuration as described in section "SR-IOV Firmware Configuration".

  2. Reboot the host.

  3. Create a dummy controller on the parent PF:

    [dpu] # snap_rpc.py nvme_subsystem_create --nqn nqn.2022-10.io.nvda.nvme:0
    [dpu] # snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl1 --pf_id 0 --admin_only

  4. Create 125 bdevs (remote or local), 125 namespaces, and 125 controllers:

    [dpu] # for i in `seq 0 124`; do \
        spdk_rpc.py bdev_null_create null$((i+1)) 64 512; \
        snap_rpc.py nvme_namespace_create -b null$((i+1)) -n $((i+1)) --nqn nqn.2022-10.io.nvda.nvme:0 --uuid 3d9c3b54-5c31-410a-b4f0-7cf2afd9e$((i+100)); \
        snap_rpc.py nvme_controller_create --nqn nqn.2022-10.io.nvda.nvme:0 --ctrl NVMeCtrl$((i+2)) --pf_id 0 --vf_id $i --suspended; \
        snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl$((i+2)) -n $((i+1)); \
        snap_rpc.py nvme_controller_resume -c NVMeCtrl$((i+2)); \
    done

  5. Load the driver and configure VFs:

    [host] # modprobe -v nvme
    [host] # echo 125 > /sys/bus/pci/devices/0000\:25\:00.2/sriov_numvfs

snap_global_param_list

snap_global_param_list lists all existing environment variables.

The following is an example response for the snap_global_param_list command:

[ "SNAP_ENABLE_POLL_SKIP : set : 0 ", "SNAP_POLL_CYCLE_SIZE : not set : 16 ", "SNAP_RPC_LOG_ENABLE : set : 1 ", "SNAP_MEMPOOL_SIZE_MB : set : 1024", "SNAP_MEMPOOL_4K_BUFFS_PER_CORE : not set : 1024", "SNAP_RDMA_ZCOPY_ENABLE : set : 1 ", "SNAP_TCP_XLIO_ENABLE : not set : 1 ", "SNAP_TCP_XLIO_TX_ZCOPY : not set : 1 ", "MLX5_SHUT_UP_BF : not set : 0 ", "SNAP_SHARED_RX_CQ : not set : 1 ", "SNAP_SHARED_TX_CQ : not set : 1 ", ...


© Copyright 2025, NVIDIA. Last updated on Aug 25, 2025.