SNAP-4 Service Advanced Features
RPC log history (enabled by default) records all RPC requests (from snap_rpc.py and spdk_rpc.py) sent to the SNAP application, along with the RPC response for each request, in a dedicated log file, /var/log/snap-log/rpc-log. This file is visible outside the container (i.e., the log file's path on the DPU is /var/log/snap-log/rpc-log as well).
The SNAP_RPC_LOG_ENABLE environment variable can be used to enable (1) or disable (0) this feature.
RPC log history is supported with SPDK version spdk23.01.2-12 and above.
When RPC log history is enabled, the SNAP application constantly writes (in append mode) RPC request and response messages to /var/log/snap-log/rpc-log. Pay attention to the size of this file. If it grows too large, delete it on the DPU before launching the SNAP pod.
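For example, the feature can be toggled from the container YAML using the same env list layout used elsewhere in this document (a minimal sketch):
env:
  - name: SNAP_RPC_LOG_ENABLE
    value: "1"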
SR-IOV configuration depends on the kernel version and must be handled carefully to ensure device visibility and system stability across both hypervisor and DPU orchestrators.
To ensure a safe and stable SR-IOV setup, follow these steps:
Preconfigure VF controllers on the DPU – Before configuring SR-IOV on the host, ensure that the DPU is properly configured with all required VF controllers already created and opened.
VF functions are always visible and configurable on the DPU side. Use the following command to verify:
snap_rpc.py emulation_function_list --all
Confirm that the configuration meets your requirements.
Check that the number of resources allocated for the PF, specifically MSI-X vectors and queues (queried via the snap_rpc.py virtio_blk_controller_list RPC; see the free_queues and free_msix fields in its output), is enough to satisfy the needs of all underlying VFs. Use dynamic MSI-X if needed and supported.
Once host-side configuration begins, further modifications may not be possible.
Disable autoprobing with sriov_drivers_autoprobe=0 – In deployments with many virtual devices, autoprobing must be disabled to ensure stable device discovery. Failing to disable autoprobing may result in:
Incomplete device visibility
Missing virtual disks
System hangs during initialization
Unreliable behavior in large-scale environments (more than 100 VFs)
Tip: Recommended configuration for large-scale deployments:
Disable autoprobe:
echo 0 > /sys/bus/pci/devices/<BDF>/sriov_drivers_autoprobe
Manually bind the VFs to drivers using tools such as driverctl, or by writing to bind/unbind in sysfs.
Configure SR-IOV on the host – For small-scale deployments (fewer than 100 VFs), use the sriov_totalvfs entry:
echo <number_of_vfs> > /sys/bus/pci/devices/<BDF>/sriov_totalvfs
For newer drivers, use:
echo <number_of_vfs> > /sys/bus/pci/devices/<BDF>/sriov_numvfs
Note: After SR-IOV configuration, no disks appear in the hypervisor by default. Disks are only visible inside VMs once the corresponding PCIe VF is assigned to the VM via a virtualization manager (e.g., libvirt, VMware). To use the device directly from the hypervisor, manually bind the VF to the appropriate driver.
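For example, to use a VF directly from the hypervisor after SR-IOV is enabled, the VF can be bound manually through sysfs. The following is a minimal sketch; the BDF values and driver name are placeholders for your setup:
# Enable 2 VFs on the PF (with autoprobe disabled, as recommended above)
echo 2 > /sys/bus/pci/devices/0000:84:00.0/sriov_numvfs
# Manually bind a VF to the appropriate driver (e.g., nvme or virtio-pci, depending on the emulated device)
echo "0000:84:00.2" > /sys/bus/pci/drivers/nvme/bind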
Additional notes:
Hot-plugged PFs do not support SR-IOV.
For deployments requiring more than 127 VFs, add the following kernel parameter to the host’s boot command line:
pci=assign-busses
Without this, the host may log errors such as:
pci 0000:84:00.0: [1af4:1041] type 7f class 0xffffff
pci 0000:84:00.0: unknown header type 7f, ignoring device
These errors prevent the virtio driver from probing the device.
Zero-copy is supported on SPDK 21.07 and higher.
SNAP-direct allows SNAP applications to transfer data directly from the host memory to remote storage without using any staging buffer inside the DPU.
SNAP enables the feature according to the SPDK BDEV configuration only when working against an SPDK NVMe-oF RDMA block device.
To enable zero copy, set the following environment variable (it is enabled by default):
SNAP_RDMA_ZCOPY_ENABLE=1
For more info refer to SNAP-4 Service Environment Variables.
NVMe/TCP Zero Copy is implemented as a custom NVDA_TCP transport in SPDK NVMe initiator, and it is based on a new XLIO socket layer implementation.
The implementation is different for Tx and Rx:
The NVMe/TCP Tx Zero Copy is similar between RDMA and TCP in that the data is sent from the host memory directly to the wire without an intermediate copy to Arm memory
The NVMe/TCP Rx Zero Copy allows achieving partial zero copy on the Rx flow by eliminating copy from socket buffers (XLIO) to application buffers (SNAP). But data still must be DMA'ed from Arm to host memory.
To enable NVMe/TCP Zero Copy, use SPDK v22.05.nvda (or higher) compiled with --with-xlio.
For more information about XLIO including limitations and bug fixes, refer to the NVIDIA Accelerated IO (XLIO) Documentation.
To enable SNAP TCP XLIO Zero Copy:
SNAP container: Set the environment variables and resources in the YAML file to request 6G of hugepages:
resources:
  requests:
    memory: "4Gi"
    cpu: "8"
  limits:
    hugepages-2Mi: "6Gi"
    memory: "6Gi"
    cpu: "16" ## Set according to the local setup
env:
  - name: APP_ARGS
    value: "--wait-for-rpc"
  - name: SPDK_XLIO_PATH
    value: "/usr/lib/libxlio.so"
SNAP sources: Set the environment variables and resources in the relevant scripts
In run_snap.sh, edit the APP_ARGS variable to use the SPDK command line argument --wait-for-rpc:
APP_ARGS="--wait-for-rpc"
In set_environment_variables.sh, uncomment the SPDK_XLIO_PATH environment variable:
export SPDK_XLIO_PATH="/usr/lib/libxlio.so"
NVMe/TCP XLIO requires a BlueField Arm OS hugepage size of 4Gi. For information on configuring the hugepages, refer to sections "Step 1: Allocate Hugepages" and "Adjusting YAML Configuration".
At high scale, it is required to use the global variable XLIO_RX_BUFS=4096 even though it leads to high memory consumption. Using XLIO_RX_BUFS=1024 lowers memory consumption but limits the ability to scale the workload.
For more info refer to SNAP-4 Service Environment Variables.
It is recommended to configure NVMe/TCP XLIO with the transport ack timeout option increased to 12.
[dpu] spdk_rpc.py bdev_nvme_set_options --transport-ack-timeout 12
Other bdev_nvme options may be adjusted according to requirements.
Expose an NVMe-oF subsystem with one namespace by using a TCP transport type on the remote SPDK target.
[dpu] spdk_rpc.py sock_set_default_impl -i xlio
[dpu] spdk_rpc.py framework_start_init
[dpu] spdk_rpc.py bdev_nvme_set_options --transport-ack-timeout 12
[dpu] spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t nvda_tcp -a 3.3.3.3 -f ipv4 -s 4420 -n nqn.2023-01.io.nvmet
[dpu] snap_rpc.py nvme_subsystem_create --nqn nqn.2023-01.com.nvda:nvme:0
[dpu] snap_rpc.py nvme_namespace_create -b nvme0n1 -n 1 --nqn nqn.2023-01.com.nvda:nvme:0 --uuid 16dab065-ddc9-8a7a-108e-9a489254a839
[dpu] snap_rpc.py nvme_controller_create --nqn nqn.2023-01.com.nvda:nvme:0 --ctrl NVMeCtrl1 --pf_id 0 --suspended --num_queues 16
[dpu] snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
[dpu] snap_rpc.py nvme_controller_resume -c NVMeCtrl1 -n 1
[host] modprobe -v nvme
[host] fio --filename /dev/nvme0n1 --rw randrw --name=test-randrw --ioengine=libaio --iodepth=64 --bs=4k --direct=1 --numjobs=1 --runtime=63 --time_based --group_reporting --verify=md5
For more information on XLIO, please refer to XLIO documentation.
The SPDK version included with SNAP supports hardware encryption/decryption offload. To enable AES/XTS and allow the mlx5_2 and mlx5_3 SFs to support encryption, they must be designated as trusted.
Edit the configuration file /etc/mellanox/mlnx-sf.conf. Append the following commands to configure the VHCA_TRUST_LEVEL and create the SFs:
/usr/bin/mlxreg -d 03:00.0 --reg_name VHCA_TRUST_LEVEL --yes --indexes "vhca_id=0x0,all_vhca=0x1" --set "trust_level=0x1"
/usr/bin/mlxreg -d 03:00.1 --reg_name VHCA_TRUST_LEVEL --yes --indexes "vhca_id=0x0,all_vhca=0x1" --set "trust_level=0x1"
/sbin/mlnx-sf --action create --device 0000:03:00.0 --sfnum 0 --hwaddr 02:11:3c:13:ad:82
/sbin/mlnx-sf --action create --device 0000:03:00.1 --sfnum 0 --hwaddr 02:76:78:b9:6f:52
Reboot the DPU to apply these changes.
Zero Copy (SNAP-direct) with Encryption
SNAP offers support for zero copy with encryption for bdev_nvme with an RDMA transport.
If another bdev_nvme transport or base bdev other than NVMe is used, then zero copy flow is not supported, and additional DMA operations from the host to the BlueField Arm are performed.
Refer to section "SPDK Crypto Example" to see how to configure zero copy flow with AES_XTS offload.
Command | Description |
mlx5_scan_accel_module | Accepts a list of devices to be used for the crypto operation |
accel_crypto_key_create | Creates a crypto key |
bdev_nvme_attach_controller | Constructs NVMe block device |
bdev_crypto_create | Creates a virtual block device which encrypts write I/O commands and decrypts read I/O commands |
mlx5_scan_accel_module
Accepts a list of devices to use for the crypto operation provided in the --allowed-devs parameter. If no devices are specified, then the first device which supports encryption is used.
For best performance, it is recommended to use the devices with the largest InfiniBand MTU (4096). The MTU size can be verified using the ibv_devinfo command (look for the max and active MTU fields). Normally, the mlx5_2 device is expected to have an MTU of 4096 and should be used as an allowed crypto device.
Command parameters:
Parameter | Mandatory? | Type | Description |
qp-size | No | Number | QP size |
num-requests | No | Number | Size of the shared requests pool |
allowed-devs | No | String | Comma-separated list of allowed device names (e.g., "mlx5_2"). Note: Make sure that the device used for RDMA traffic is selected to support zero copy. |
enable-driver | No | Boolean | Enables the accel_mlx5 platform driver. Allows AES_XTS RDMA zero copy. |
accel_crypto_key_create
Creates crypto key. One key can be shared by multiple bdevs.
Command parameters:
Parameter | Mandatory? | Type | Description |
cipher | Yes | String | Crypto protocol (AES_XTS) |
key | Yes | String | Key |
key2 | Yes | String | Key2 |
name | Yes | String | Key name |
bdev_nvme_attach_controller
Creates NVMe block device.
Command parameters:
Parameter | Mandatory? | Type | Description |
name | Yes | String | Name of the NVMe controller, prefix for each bdev name |
trtype | Yes | String | NVMe-oF target trtype (e.g., rdma, pcie) |
traddr | Yes | String | NVMe-oF target address (e.g., an IP address or BDF) |
trsvcid | No | String | NVMe-oF target trsvcid (e.g., a port number) |
adrfam | No | String | NVMe-oF target adrfam (e.g., ipv4, ipv6) |
subnqn | No | String | NVMe-oF target subnqn |
bdev_crypto_create
This RPC creates a virtual crypto block device which adds encryption to the base block device.
Command parameters:
Parameter | Mandatory? | Type | Description |
base_bdev_name | Yes | String | Name of the base bdev |
name | Yes | String | Crypto bdev name |
key_name | Yes | String | Name of the crypto key created with accel_crypto_key_create |
SPDK Crypto Example
The following is an example of a configuration with a crypto virtual block device created on top of bdev_nvme with RDMA transport and zero copy support:
[dpu] # spdk_rpc.py mlx5_scan_accel_module --allowed-devs "mlx5_2" --enable-driver
[dpu] # spdk_rpc.py framework_start_init
[dpu] # spdk_rpc.py accel_crypto_key_create -c AES_XTS -k 00112233445566778899001122334455 -e 11223344556677889900112233445500 -n test_dek
[dpu] # spdk_rpc.py bdev_nvme_attach_controller -b nvme0 -t rdma -a 1.1.1.1 -f ipv4 -s 4420 -n nqn.2016-06.io.spdk:cnode0
[dpu] # spdk_rpc.py bdev_crypto_create nvme0n1 crypto_0 -n test_dek
[dpu] # snap_rpc.py nvme_subsystem_create --nqn nqn.2023-05.io.nvda.nvme:0
[dpu] # snap_rpc.py nvme_controller_create --nqn nqn.2023-05.io.nvda.nvme:0 --pf_id 0 --ctrl NVMeCtrl0 --suspended
[dpu] # snap_rpc.py nvme_namespace_create --nqn nqn.2023-05.io.nvda.nvme:0 --bdev_name crypto_0 --nsid 1 --uuid 263826ad-19a3-4feb-bc25-4bc81ee7749e
[dpu] # snap_rpc.py nvme_controller_attach_ns --ctrl NVMeCtrl0 --nsid 1
[dpu] # snap_rpc.py nvme_controller_resume --ctrl NVMeCtrl0
Live migration is a standard process supported by QEMU which allows system administrators to pass devices between virtual machines in a live running system. For more information, refer to QEMU VFIO Device Migration documentation.
Live migration is supported for SNAP virtio-blk devices in legacy and standard VFIO modes. Legacy mode uses drivers like NVIDIA's proprietary vDPA-based Live Migration Solution, while standard mode leverages the latest kernel capabilities using the virtio-vfio-pci kernel driver. Legacy mode can be enabled/disabled using the environment variable `VIRTIO_CTRL_VDPA_ADMIN_Q` (enabled by default).
In the standard virtio live migration process, the device is expected to complete all inflight I/Os, with no configurable timeout. If the remote storage is unavailable (disconnected or non-responsive), the device migration will wait indefinitely. This means migration time cannot be guaranteed, representing a degradation compared to the functionality of legacy mode.
Software Requirements for Standard VFIO
Kernel – 6.16-rc3+ (using the virtio-vfio-pci driver)
QEMU – 9.2+
libvirt – 10.6+
SNAP Configuration
Set the environment variable VIRTIO_CTRL_VDPA_ADMIN_Q to 1 (default) for legacy mode or 0 for standard VFIO mode.
Create a PF controller with an admin queue (common to both modes):
snap_rpc.py virtio_blk_controller_create --admin_q …
Live upgrade enables updating the SNAP image used by a container without causing SNAP container downtime.
While newer SNAP releases may introduce additional content, potentially causing behavioral differences during the upgrade, the process is designed to ensure backward compatibility. Updates between releases within the same sub-version (e.g., 4.0.0-x to 4.0.0-y) should proceed without issues.
However, updates across different major or minor versions may require changes to system components (e.g., firmware, BFB), which may impact backward compatibility and necessitate a full reboot post update. In those cases, live updates are unnecessary.
Live Upgrade Prerequisites
To enable live upgrade, perform the following modifications:
Allocate double hugepages for the destination and source containers.
Make sure the requested amount of CPU cores is available.
The default YAML configuration sets the container to request a CPU core range of 8-16. This means that the container is not deployed if there are fewer than 8 available cores, and if there are 16 free cores, the container utilizes all 16.
For instance, if a container is currently using all 16 cores and an additional SNAP container is deployed during a live upgrade, each container uses 8 cores during the upgrade process. Once the source container is terminated, the destination container starts utilizing all 16 cores.
Note: For 8-core DPUs, the .yaml must be edited to a range of 4-8 CPU cores.
Change the name of the doca_snap.yaml file that describes the destination container (e.g., doca_snap_new.yaml) so as not to overwrite the running container's .yaml.
Change the name of the new .yaml pod and container on lines 16 and 20, respectively (e.g., snap-new).
Deploy the destination container by copying the new .yaml (e.g., doca_snap_new.yaml) to kubelet, as sketched below.
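For example, deploying the destination container might look as follows, assuming /etc/kubelet.d is the kubelet static-pod directory used for DOCA containers (adjust the path to your deployment):
# Deploy the destination container by copying the renamed YAML to kubelet
cp doca_snap_new.yaml /etc/kubelet.d/
# Verify that both source and destination SNAP containers are running
crictl ps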
After deploying the destination container, until the live update process is complete, avoid making any configuration changes via RPC. Specifically, do not create or destroy hotplug functions.
When restoring a controller in the destination container during a live update, it is recommended to use the same arguments originally used for controller creation in the source container.
Users may need to update the RPC alias, since the new container name has been changed.
Performing a live update causes the ML Optimizer Online service to be disabled in the source container if it is currently running. The service in the destination container remains unaffected and operates normally.
The live_update.py script is officially supported only for the following SPDK block devices: NVMe-oF/RDMA, Null, Malloc, and delay vbdev. While the script logic can be technically extended by users to support additional block devices, such configurations are not officially validated or supported by NVIDIA.
SNAP Container Live Upgrade Procedure
Follow the steps in section "Live Upgrade Prerequisites" and deploy the destination SNAP container using the modified .yaml file.
Query the source and destination containers:
crictl ps -r
Check for "SNAP started successfully" in the logs of the destination container, then copy the live update script from the container to your environment:
[dpu] crictl logs -f <dest-container-id>
[dpu] crictl exec <dest-container-id> cp /opt/nvidia/nvda_snap/bin/live_update.py /etc/nvda_snap/
Run the live_update.py script to move all active objects from the source container to the destination container:
[dpu] cd /etc/nvda_snap
[dpu] ./live_update.py -s <source-container-id> -d <dest-container-id>
Info: The live update tool also supports transitioning between the SNAP source package service and the SNAP container service (and vice versa). Use -s 0 to indicate that the source (original process) is running from the SNAP source package.
The live update tool does not support transitioning from one SNAP source package service to another SNAP source package service.
After the script completes, the live update process is done. Delete the source container by removing its YAML from the kubelet.
Note: To post RPCs, use the crictl tool:
crictl exec -it <container-id X> snap_rpc.py <RPC-method>
crictl exec -it <container-id Y> spdk_rpc.py <RPC-method>
Note: To automate the SNAP configuration (e.g., following failure or reboot) as explained in section "Automate SNAP Configuration (Optional)", spdk_rpc_init.conf and snap_rpc_init.conf must not include any configs as part of the live upgrade. Then, once the transition to the new container is done, spdk_rpc_init.conf and snap_rpc_init.conf can be modified with the desired configuration.
SNAP Container Live Upgrade Commands
The live upgrade process allows moving SNAP controllers and SPDK block devices between containers while minimizing host VM disruption.
The upgrade is done using a dedicated live update tool, which iterates over all active emulation functions and performs the following steps:
Suspend controller (admin only). On the source container, suspend the controller to admin-only mode. This ensures the controller no longer processes admin commands from the host driver, avoiding state changes during the handover. I/O traffic continues, so downtime has not started yet.
NVMe example:
snap_rpc.py nvme_controller_suspend --ctrl NVMeCtrl0VF0 --admin_only
Virtio-blk example:
snap_rpc.py virtio_blk_controller_suspend --ctrl [ctrl_name] --events_only
Preparation on destination container. On the destination container, create all required objects for the new controller, including attaching the backend device.
NVMe example:
spdk_rpc.py bdev_nvme_attach_controller ...
snap_rpc.py nvme_subsystem_create ...
snap_rpc.py nvme_namespace_create -n 1 ...
Virtio-blk example:
spdk_rpc.py bdev_nvme_attach_controller ...
Create suspended controller (as listener). On the destination container, create the controller in a suspended state and mark it as a listener for a live update notification from the source container. At this point, the controller in the source container is still handling I/O, so downtime has not started yet.
NVMe example:
snap_rpc.py nvme_controller_create --pf_id 0 --vf_id 0 --ctrl NVMeCtrl0VF0 --live_update_listener --suspended ...
snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl0VF0 -n 1
Virtio-blk example:
snap_rpc.py virtio_blk_controller_create --pf_id 0 --vf_id 0 --ctrl VBLKCtrl0VF0 ...
Suspend and notify. On the source container, suspend the controller using the --live_update_notifier flag. This triggers the start of downtime and sends a notification to the destination container. Once suspended, the controller on the destination container resumes and starts handling I/O. This marks the end of downtime.
NVMe example:
snap_rpc.py nvme_controller_suspend --ctrl NVMeCtrl0VF0 --live_update_notifier --timeout_ms
Virtio-blk example:
snap_rpc.py virtio_blk_controller_suspend --ctrl [ctrl_name] --live_update_notifier
Cleanup source container. After the migration is complete, clean up any remaining controller objects on the source container.
Note: The PF controller must remain present in the source container until all related virtual functions (VFs) have been removed.
NVMe example:
snap_rpc.py nvme_controller_detach_ns ...
spdk_rpc.py bdev_nvme_detach_controller ...
snap_rpc.py nvme_namespace_destroy ...
snap_rpc.py nvme_controller_destroy ...
Virtio-blk example:
snap_rpc.py virtio_blk_controller_destroy ...
spdk_rpc.py bdev_nvme_detach_controller ...
Shared Memory Pool Live Update
Shared Memory Live Update addresses resource constraints where insufficient hugepages are available to run two full SNAP processes simultaneously during a live update.
Standard live updates typically require double the memory allocation because the source and destination containers run concurrently during the transition.
Standard Requirement: 4GB Total (2GB per process: 1GB for SNAP + 1GB for SPDK).
This feature optimizes resource usage by allowing the two processes to "share" a portion of the memory pool, significantly reducing the peak hugepage requirement.
Optimized Requirement: 3GB Total (1GB Source SPDK + 1GB Destination SPDK + 1GB Shared SNAP memory).
While physical hugepage consumption is reduced to 3GB, the container orchestration layer (e.g., Kubernetes) may still require the limit to be set to the full 4GB to allow the transition. However, the additional 1GB does not need to be physically free on the host and will not be consumed during the live update process.
Memory Pool Architecture
To facilitate this sharing, SNAP divides its memory management into two distinct pools:
Base Mempool: The essential memory required for the process to operate.
Extended Mempool: Additional memory used for standard operation, which is managed dynamically during the update.
Live Update Configuration
To enable this feature, specific environment variables must be set in the doca_vfs.yaml configuration file.
Requirements:
SNAP_MEMPOOL_SIZE_MB must be greater than SNAP_MEMPOOL_BASE_SIZE_MB.
Configuration example:
env:
- name: SNAP_MEMPOOL_SIZE_MB
value: "1024"
- name: SNAP_MEMPOOL_BASE_SIZE_MB
value: "512"
Live Update Procedure
This feature introduces two specific RPCs to the live update workflow: memory_manager_deallocate_basepool and memory_manager_allocate_extpool.
Prepare Source: Issue the memory_manager_deallocate_basepool RPC on the source (old) container to release shared resources.
Deploy Destination: Start the destination (new) container. It will initialize using only the Base Mempool.
Execute Handover: Run the standard live update script and wait for completion.
Cleanup: Destroy the source container.
Finalize Destination: Issue the memory_manager_allocate_extpool RPC on the destination container to reclaim full memory capacity. A command-level sketch of the full sequence follows this list.
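The following is a minimal sketch of this sequence, assuming the two memory-manager RPCs are posted via snap_rpc.py through crictl (as shown earlier for other RPCs) and that /etc/kubelet.d is the kubelet static-pod directory; container IDs and paths are placeholders:
# 1. Release the shared (base) mempool on the source container
crictl exec -it <source-container-id> snap_rpc.py memory_manager_deallocate_basepool
# 2. Deploy the destination container (it starts with the Base Mempool only)
cp doca_snap_new.yaml /etc/kubelet.d/
# 3. Run the standard live update handover
./live_update.py -s <source-container-id> -d <dest-container-id>
# 4. Destroy the source container by removing its YAML from kubelet
rm /etc/kubelet.d/doca_snap.yaml
# 5. Reclaim full memory capacity on the destination container
crictl exec -it <dest-container-id> snap_rpc.py memory_manager_allocate_extpool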
If the source container’s SNAP instance is killed, restarted, or recovers for any reason during this process, the state is considered invalid. You must fully restart the service before attempting the live update steps again.
Message Signaled Interrupts eXtended (MSI-X) is an interrupt mechanism that allows devices to utilize multiple interrupt vectors, offering superior efficiency compared to traditional shared interrupts. In Linux environments, MSI-X reduces CPU utilization, improves device performance, and enhances scalability for high-performance hardware like network adapters and storage controllers.
Proper configuration of MSI-X interrupts is critical in multi-function environments such as SR-IOV. By default, BlueField distributes MSI-X vectors evenly among all Virtual Functions (VFs). The default distribution, however, is often suboptimal for heterogeneous environments where VFs are attached to different VMs with varying resource requirements. Dynamic MSI-X Management allows administrators to manually control the specific number of MSI-X vectors allocated to each VF independently.
The configuration steps and behaviors described in this section apply to all emulation types, specifically NVMe and virtio-blk.
Lifecycle and Persistence
Dynamic MSI-X management follows a strict lifecycle for resource allocation and reclamation.
Allocation Workflow
Reclaim: When no VF controllers are open (sriov_numvfs=0), PF-related MSI-X vectors are reclaimed from the VFs into the PF's global free pool.
Allocate: Users allocate MSI-X vectors from the free pool to a specific VF during controller creation.
Release: Users release vectors back to the pool when destroying a VF controller.
Persistence Rules
Once configured, the MSI-X allocation for a VF remains persistent.
State Change | Effect on MSI-X Configuration |
Application Restart/Crash | No Change |
Closing/Reopening PF | No Change (unless dynamic support is used) |
Explicit VF Release | Released (Returns to Pool) |
PF Explicit Reclaim | Reclaimed (Returns to Pool) |
Arm Cold Boot | Reset (Returns to Pool) |
Configuration Procedure (NVMe Example)
The following steps demonstrate Dynamic MSI-X configuration for an NVMe controller. The logic applies similarly to virtio-blk.
Step 1: Reclaim Resources
Ensure no VFs are active, then reclaim all MSI-X vectors to the PF's free pool.
snap_rpc.py nvme_controller_vfs_msix_reclaim <CtrlName>
Step 2: Query Resource Constraints
Query the controller to view the available resources in the PF's free pool.
snap_rpc.py nvme_controller_list -c <CtrlName>
Output Definitions:
free_msix: Total MSI-X vectors available in the PF pool (assigned via vf_num_msix).
free_queues: Total queues (doorbells) available in the PF pool (assigned via num_queues).
vf_min_msix/vf_max_msix: The valid configuration range for the vf_num_msix parameter.
vf_min_queues/vf_max_queues: The valid configuration range for the num_queues parameter.
Step 3: Create VF and Distribute Resources
Create the VF controller, specifying the exact resource allocation.
snap_rpc.py nvme_controller_create --vf_num_msix <n> --num_queues <m> ...
You must specify both vf_num_msix and num_queues. Omitting one can cause a mismatch between MSI-X allocation and queue configuration, potentially leading to driver malfunctions.
Allocations differ by protocol. Use the following logic to determine values:
NVMe: MSI-X vectors are allocated per Completion Queue (CQ).
Requirement: 1 MSI-X per IO Queue + 1 MSI-X for the Admin Queue.
Virtio: MSI-X vectors are allocated per Virtqueue.
Requirement: 1 MSI-X per Queue + 1 MSI-X for BAR configuration notifications.
Best practice formula:
num_queues = vf_num_msix - 1 (one MSI-X vector is reserved for the NVMe admin queue or for virtio BAR configuration notifications)
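For example, a VF that should expose 4 I/O queues would be created with 5 MSI-X vectors. The command below is an illustrative sketch following the creation syntax above; the IDs are placeholders:
snap_rpc.py nvme_controller_create --vf_num_msix 5 --num_queues 4 --pf_id 0 --vf_id 0 ...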
Step 4: VF Teardown
When destroying the VF, release resources back to the free pool.
snap_rpc.py nvme_controller_destroy --release_msix ...
Step 5: Enable SR-IOV
Enable the VFs on the host driver.
echo <N> > /sys/bus/pci/devices/<BDF>/sriov_numvfs
Safe Configuration Methods
To prevent system instability, strict ordering of operations is required.
Host Deadlock Risk: It is strongly recommended to open all VF controllers in SNAP before binding VFs to the host driver. If VFs are bound to the driver before configuration is complete, the driver may attempt to use resources that are not yet allocated. If resources are insufficient, this can lead to a host deadlock recoverable only by a cold boot.
To configure Dynamic MSI-X safely without risking deadlock, utilize one of the following methods:
Method A: Disable Autoprobe (Recommended)
Disable automatic driver binding, configure the VFs, and then manually bind them.
# 1. Disable autoprobe to prevent immediate binding
echo 0 > /sys/bus/pci/devices/<BDF>/sriov_drivers_autoprobe
# 2. Perform SNAP Configuration (Steps 1-5 above)
# 3. Manually bind VFs to the driver
echo "0000:01:00.0" > /sys/bus/pci/drivers/nvme/bind
Method B: Use VFIO Driver
Use the vfio-pci driver for SR-IOV configuration instead of the kernel driver.
# 1. Bind PF to VFIO driver
echo 0000:af:00.2 > /sys/bus/pci/drivers/vfio-pci/bind
# 2. Enable SR-IOV support
echo 1 > /sys/module/vfio_pci/parameters/enable_sriov
# 3. Create VFs
echo <N> > /sys/bus/pci/drivers/vfio-pci/0000:af:00.2/sriov_numvfs
The recovery feature enables the restoration of controller state after the SNAP application terminates—either gracefully or unexpectedly (e.g., due to kill -9).
Recovery is only possible if the SNAP application is restarted with the exact same configuration that was active prior to the shutdown or crash.
SNAP officially supports only the following SPDK block devices for recovery: NVMe-oF/RDMA, Null, Malloc, and delay vbdev. While the script logic can be technically extended by users to support additional block devices, such configurations are not officially validated or supported by NVIDIA.
NVMe Recovery
NVMe recovery enables the restoration of an NVMe controller after a SNAP application terminates, whether gracefully or due to a crash (e.g., kill -9).
To perform NVMe recovery:
Re-create the controller in a suspended state using the exact same configuration as before the crash (including the same bdevs, number of queues, namespaces, and namespace UUIDs).
Resume the controller only after all namespaces have been attached.
The recovery process uses shared memory files located under /dev/shm on the BlueField to restore the controller's internal state. These files are deleted when the BlueField is reset, meaning recovery is not supported after a BF reset.
Virtio-blk Crash Recovery
To use virtio-blk recovery, the controller must be re-created with the same configuration as before the crash (i.e. the same bdevs, num queues, etc).
The following options are available to enable virtio-blk crash recovery.
Virtio-blk Crash Recovery with --force_in_order
For virtio-blk crash recovery with --force_in_order, disable the VBLK_RECOVERY_SHM environment variable and create a controller with the --force_in_order argument.
In virtio-blk SNAP, the application is not guaranteed to recover correctly after a sudden crash (e.g., kill -9).
To enable the virtio-blk crash recovery, set the following:
snap_rpc.py virtio_blk_controller_create --force_in_order …
Setting --force_in_order may impact virtio-blk performance, as commands are served in order.
If --force_in_order is not used, any failure or unexpected teardown in SNAP or the driver may result in anomalous behavior because of limited support in the Linux kernel virtio-blk driver.
Virtio-blk Crash Recovery without --force_in_order
For virtio-blk crash recovery without --force_in_order, enable the VBLK_RECOVERY_SHM environment variable and create a controller without the --force_in_order argument.
Virtio-blk recovery allows the virtio-blk controller to be recovered after a SNAP application is closed whether gracefully or after a crash (e.g., kill -9).
To use virtio-blk recovery without the --force_in_order flag, VBLK_RECOVERY_SHM must be enabled and the controller must be re-created with the same configuration as before the crash (i.e., same bdevs, num queues, etc.).
When VBLK_RECOVERY_SHM is enabled, virtio-blk recovery uses files on the BlueField under /dev/shm to recover the internal state of the controller. Shared memory files are deleted when the BlueField is reset. For this reason, recovery is not supported after BlueField reset.
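A minimal recovery sketch, assuming VBLK_RECOVERY_SHM is already set in the SNAP environment and that the controller name, bdev, and queue count shown here are placeholders for the exact values used before the crash:
# Re-create the controller with exactly the same configuration that was active before the crash
snap_rpc.py virtio_blk_controller_create --pf_id 0 --bdev null0 --num_queues 4 --ctrl VblkCtrl0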
SNAP Configuration Recovery
SNAP can store its configuration as defined by user RPCs and, upon restart, reload it from a configuration JSON file. This mechanism is intended for recovering a previously configured SNAP state - it cannot be used for the initial configuration.
Usage:
Set the environment variable SNAP_RPC_INIT_CONF_JSON to the directory path where the configuration file will be stored.
The configuration file, snap_config.json, is created in this directory after all changes in your script have been successfully applied.
If a new configuration (different from the pre-shutdown configuration) is required after restarting SNAP, delete the existing snap_config.json file before applying the new settings.
When this method is used, there is no need to re-run snap RPCs or set RPCs in init files after the initial configuration — SNAP will automatically load the saved configuration from the SNAP_RPC_INIT_CONF_JSON path. This approach is recommended for fast recovery.
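For example, in a containerized deployment the variable could be set in the container YAML, reusing the env layout shown earlier (a sketch; the directory is an assumption and must be mounted into the container):
env:
  - name: SNAP_RPC_INIT_CONF_JSON
    value: "/etc/nvda_snap"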
SNAP Configuration Recovery does not support controller modifications. That is, using SNAP Configuration Recovery after a controller_modify RPC may cause unexpected behavior.
When modifying controller or function configurations, ensure the controller/function is not bound to any driver until the configuration process is complete. If the change is interrupted, recovery may fail.
Hotplugged emulation functions persist between SNAP runs (but not across BlueField resets) and should be set only once during initial configuration. Only controllers created on these functions are stored in the saved configuration state.
If crash recovery after a reboot is supported, store the file inside the container at /etc/nvda_snap. For unsupported use cases, store it in a temporary location such as /tmp/ or /dev/shm.
Improving SNAP Recovery Time
The following table outlines features designed to accelerate SNAP initialization and recovery processes following termination.
Feature | Description | How to? |
SPDK JSON-RPC configuration file | An initial configuration can be specified for the SPDK portion of SNAP. The configuration file is a JSON file containing all the SPDK JSON-RPC method invocations necessary for the desired configuration. Moving from posting RPCs to a JSON file improves bring-up time. Info: For more information, check the SPDK JSON-RPC documentation. | Generate a JSON-RPC file based on the current configuration and load it at SNAP startup (see the sketch after this table). Note: If SPDK encounters an error while processing the JSON configuration file, the initialization phase fails, causing SNAP to exit with an error code. |
Disable SPDK accel functionality | The SPDK accel functionality is necessary when using NVMe/TCP features. If NVMe/TCP is not used, accel should be manually disabled to reduce the SPDK startup time, which can otherwise take a few seconds. | Edit the SPDK JSON configuration file to disable all accel functionality. |
Provide the emulation manager name | If the emulation manager name is provided explicitly, SNAP does not need to discover it during startup, shortening initialization. | Use the relevant SNAP configuration option to provide the emulation manager name. |
SNAP configuration recovery | SNAP configuration recovery enables restoring the SNAP state without the need to re-post SNAP RPCs. By moving from posting individual RPCs to using a pre-saved JSON configuration file, the bring-up time is significantly improved. | Set SNAP_RPC_INIT_CONF_JSON as described in section "SNAP Configuration Recovery". |
Hugepages allocation | SNAP allocates a mempool from hugepages. Reducing its size can shorten SNAP's crash recovery time. | SNAP_MEMPOOL_SIZE_MB is set to 1024 MB by default. |
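A possible way to produce and consume such a JSON file is SPDK's standard save_config RPC together with SPDK's --json startup option; this is a sketch under those assumptions rather than the exact procedure referenced in the table:
# Dump the currently applied SPDK configuration to a JSON file
spdk_rpc.py save_config > /etc/nvda_snap/spdk_config.json
# Load it at the next SNAP startup through the SPDK application arguments
APP_ARGS="--json /etc/nvda_snap/spdk_config.json"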
The Watchdog and Heartbeat Monitoring feature is an experimental reliability mechanism designed to enhance the robustness of the system by automatically detecting and recovering from application hangs, crashes, or unresponsive components. This mechanism minimizes service disruption by triggering recovery procedures without requiring manual intervention.
The heartbeat system functions as a periodic signal emitted by the SNAP service to indicate its operational status. These signals serve as an indicator that the service is active and functioning as expected.
A dedicated watchdog component monitors the presence and frequency of heartbeat signals. If the heartbeat is not received within a predefined timeout interval, the watchdog determines that the monitored component is unresponsive and initiates a predefined recovery action.
The typical sequence of operations is as follows:
A SNAP component becomes unresponsive due to a crash, hang, or other failure condition.
The watchdog detects the absence of the expected heartbeat signal.
A recovery action is automatically triggered.
The SNAP service is restarted, and previously configured virtual disk states are restored.
Normal operation resumes.
The entire recovery process is designed to complete within a few seconds, thereby minimizing downtime.
Configuring the Watchdog
The behavior of the Watchdog and Heartbeat Monitoring system is configurable through environment variables. These variables allow the user to specify parameters such as heartbeat intervals, timeouts, and recovery policies without requiring changes to the application code.
Environment Variable | Impact on Recovery | Default Value |
| Interval (in milliseconds) between heartbeat signals from SNAP | |
| ID of the thread responsible for processing heartbeat signals | |
For more configuration options, check the snap_watchdog.py script.
Running the Watchdog
To initiate the watchdog service while the SNAP application is running, execute the following command:
./snap_watchdog.py --daemon
This command launches the watchdog in the background, where it continuously monitors the health of the SNAP service and initiates recovery procedures as necessary.
snap_watchdog.py requires Python 3.7 or above.
If SNAP is running in a BFB that only includes an older version of Python (e.g., Anolis 8), the user must also run the following command:
pip install dataclasses
The I/O Core Multiplexer (MP) is a configurable mechanism that determines how I/O requests from a single source are distributed across the available DPU cores. This feature is critical for optimizing performance based on application-specific needs, particularly in scenarios involving high I/O workloads.
The multiplexer offers two policy modes:
None (Default) – All I/O operations originating from a single source are processed by a single DPU core. I/O sources are distributed across DPU cores in a balanced manner.
Recommended for: Low-latency environments
Optimization focus: I/O latency
(Weighted) Round Robin – I/O requests from a single source are distributed across multiple DPU cores in a round-robin sequence. If the backend supports per-core weight configuration (e.g., SPDK NVMe-oF bdev), the distribution follows those weights. Otherwise, the I/Os are spread evenly.
Recommended for: Bandwidth-intensive environments or systems with low per-core backend throughput (e.g., TCP-based backends)
Optimization focus: I/O bandwidth
To configure the IO/Core Multiplexer policy, users need to set the IO_CORE_DISTRIBUTION_POLICY environment variable. The available options are:
none – Refers to the default policy, where all I/Os from a single source are handled by a single DPU core
weighted_rr – Refers to the (Weighted) Round Robin policy, distributing I/Os across multiple cores
Note: The weighted_rr policy is not supported for virtio-blk.
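For example, the policy could be selected via the container YAML using the same env list layout shown earlier in this document (a minimal sketch):
env:
  - name: IO_CORE_DISTRIBUTION_POLICY
    value: "weighted_rr"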
The DPA is an auxiliary processor designed to accelerate data-path operations. It comprises a cluster of 16 cores, each containing 16 execution units (EUs).
Total capacity available to SNAP: 171 EUs (index range: 0–170).
Supported protocols: SNAP utilizes DPA applications to accelerate NVMe and virtio-blk protocols.
Hardware constraint: There is a hardware limit of 128 queues (threads) per DPA EU.
By default, all EUs (0–170) are shared between NVMe, virtio-blk, and other system DPA applications (e.g., Virtio-net).
If other DPA applications (such as Virtio-net) are running concurrently with SNAP, you must configure a DPA resource YAML file to explicitly allocate EUs and avoid resource conflicts. For more details, see Single Point of Resource Distribution.
Method 1: YAML-Based Resource Management (Recommended)
The YAML-based tool is the primary method for controlling DPA EUs, offering a centralized and consistent way to allocate resources across applications.
Configuration Requirements
Partitioning: SNAP DPA applications must run on the ROOT partition. EUs configured in other partitions are unusable.
Valid Range: Only EUs 0–170 are available for SNAP.
Allocation:
At least 1 EU must be allocated per application instance.
EU allocations must not overlap across different applications.
EU groups are not supported.
Application Names: The YAML file must match SNAP's internal application names:
dpa_helper: Used for virtio-blk and NVMe (DPU mode). Note: For DPU mode, the number of instances should match the number of Arm cores used by SNAP and the number of EUs allocated.
dpa_virtq_split: Used for virtio-blk (DPA mode).
dpa_nvme: Used for NVMe (DPA mode).
Multi-Container Configuration
When running multiple SNAP containers, you must ensure unique application naming.
Set the environment variable SNAP_DPA_INSTANCE_ID_ENV to a unique ID inside each container.
Update the YAML to reflect the instance names using the format <APP_NAME>_<ID>.
Example (two virtio-blk DPU containers):
---
version: 25.04
---
DPA_APPS:
dpa_helper_1:
- partition: ROOT
affinity_EUs: [1-16]
dpa_helper_2:
- partition: ROOT
affinity_EUs: [17-32]
Deployment Workflow
Create Input YAML: Create your configuration file (e.g., ~/DPA_RESOURCE_INPUT.yaml).
Generate Config: Use the management tool to generate the final system configuration.
dpa-resource-mgmt config -d mlx5_0 -f ~/DPA_RESOURCE_INPUT.yaml
Set Environment Variable: Point SNAP to the generated configuration file.
export SNAP_DPA_YAML_PATH=~/ROOT.YAML
Note: If running SNAP in containers, ensure the generated YAML file path is mounted into the container (e.g., mapped to /etc/nvda_snap).
Warning: Do not manually edit the YAML file generated by the dpa-resource-mgmt tool.
Default Configuration Reference
The following represents the standard default configuration:
---
version: 25.04
---
DPA_APPS:
dpa_helper:
- partition: ROOT
affinity_EUs: [1-16]
dpa_virtq_split:
- partition: ROOT
affinity_EUs: [0-169]
dpa_nvme:
- partition: ROOT
affinity_EUs: [0-169]
Method 2: Core Mask (Specific Use Case)
The Core Mask method is an alternative configuration approach.
This method is the default and preconfigured method when running SNAP-4 concurrently with the SNAP virtio-fs service. For all other scenarios requiring DPA EU changes, use the YAML-based method described above.
Configuration
To assign a specific set of EUs via mask, set the corresponding environment variable using a hexadecimal mask.
Application | Environment Variable |
Virtio-blk / NVMe (DPU Mode) | dpa_helper_core_mask |
NVMe (DPA Mode) | dpa_nvme_core_mask |
Virtio-blk (DPA Mode) | dpa_virtq_split_core_mask |
Mask Logic
The core mask must contain valid hexadecimal digits and is parsed right to left.
Example: dpa_virtq_split_core_mask=0xff00
This sets 8 bits high (bits 8–15).
Result: 8 EUs are allocated (specifically EUs 8–15).
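For example, to allocate EUs 8-15 to the virtio-blk DPA application, the mask can be derived and exported as follows (the arithmetic shown is just one way of building the hexadecimal mask):
# ((1 << 8) - 1) << 8 sets bits 8-15, i.e., 0xff00
printf '0x%x\n' $(( ((1 << 8) - 1) << 8 ))
export dpa_virtq_split_core_mask=0xff00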
The SNAP ML Optimizer is a performance-tuning utility that dynamically adjusts polling parameters within the SNAP I/O subsystem. It is designed to improve controller throughput by identifying the optimal configuration based on current hardware, workload patterns, and system constraints.
How it works:
During runtime, the optimizer iteratively modifies internal configuration parameters (referred to as "actions").
After each configuration change, it measures the resulting system performance (referred to as the "reward").
Using predictive modeling, the optimizer determines the most promising configuration to evaluate next, allowing it to converge on an optimal setup efficiently.
This approach eliminates the need to exhaustively test all possible combinations, significantly reducing tuning time while ensuring performance gains.
Currently, the tool supports "IOPS" as the reward metric, which it aims to maximize.
SNAP ML Optimizer Online
The SNAP ML Online Optimizer continuously analyzes active workloads and applies optimization actions in the background. The system dynamically learns workload profiles and adapts quickly when known patterns reappear.
Once a workload has been characterized and a set of optimal parameters is applied, the optimizer enters an idle state until a significant change in the traffic pattern is detected.
Configuring SNAP ML Online Optimizer
The ML Optimizer can be enabled using one of two methods:
Environment Variable: Set SNAP_ML_OPTIMIZER_ENABLED=1 when launching SNAP.
Runtime RPC: Execute the snap_ml_optimizer_create RPC.
ML Optimizer RPCs
The following Remote Procedure Calls (RPCs) are used to manage the optimizer at runtime.
snap_ml_optimizer_create
Initializes or recreates the ML Optimizer, which will immediately begin analyzing and optimizing the system.
Enabling the ML Optimizer will override any system parameters previously applied via snap_actions_set, environment variables, or default settings.
snap_ml_optimizer_destroy
Stops the ML Optimizer service.
Once stopped, the system restores the default SNAP parameters.
All workload data collected by the optimizer is discarded. Any future instances of the optimizer will begin the learning process from scratch.
snap_ml_optimizer_is_running
Queries the current operational status of the ML Optimizer.
snap_ml_optimizer_current_parameters
Retrieves the currently active internal parameters. This command returns the active configuration regardless of whether it was applied by the ML Optimizer or manually configured by the user.
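Assuming these RPCs are invoked through snap_rpc.py like the other SNAP RPCs in this document, a typical runtime session might look as follows (a sketch, not captured output):
# Start the online optimizer; it immediately begins analyzing and optimizing
snap_rpc.py snap_ml_optimizer_create
# Query whether the optimizer is currently running
snap_rpc.py snap_ml_optimizer_is_running
# Show the currently active internal parameters
snap_rpc.py snap_ml_optimizer_current_parameters
# Stop the optimizer and restore default SNAP parameters
snap_rpc.py snap_ml_optimizer_destroy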
SNAP ML Optimizer Offline (DEPRECATED)
SNAP ML Optimizer Preparation Steps
Machine Requirements
The device should be able to SSH to the BlueField.
Python 3.10 or above
At least 6 GB of free storage
Setting Up SNAP ML Optimizer
To set up the SNAP ML optimizer:
Copy the snap_ml folder from the container to the shared nvda_snap folder and then to the requested machine:
crictl exec -it $(crictl ps -s running -q --name snap) cp -r /opt/nvidia/nvda_snap/bin/snap_ml /etc/nvda_snap/
Change directory to the snap_ml folder:
cd tools/snap_ml
Create a virtual environment for the SNAP ML optimizer.
python3 -m venv snap_ml
This ensures that the required dependencies are installed in an isolated environment.
Activate the virtual environment to start working within this isolated environment:
source snap_ml/bin/activate
Install the Python package requirements:
pip3 install --no-cache-dir -r requirements.txt
This may take some time depending on your system's performance.
Run the SNAP ML Optimizer.
python3 snap_ml.py --help
Use the --help flag to see the available options and usage information:
--version                     Show the version and exit.
-f, --framework <TEXT>        Name of framework (recommended: ax; supported: ax, pybo).
-t, --total-trials <INTEGER>  Number of optimization iterations. The recommended range is 25-60.
--filename <TEXT>             Where to save the results (default: last_opt.json).
--remote <TEXT>               Connect remotely to the BlueField card; format: <bf_name>:<username>:<password>
--snap-rpc-path <TEXT>        SNAP RPC prefix (default: container path).
--log-level <TEXT>            CRITICAL | ERROR | WARN | WARNING | INFO | DEBUG
--log-dir <TEXT>              Where to save the logs.
SNAP ML Optimizer Related RPCs
snap_actions_set
The snap_actions_set command is used to dynamically adjust SNAP parameters (known as "actions") that control polling behavior. This command is a core feature of SNAP-AI tools, enabling both automated optimization for specific environments and workloads, as well as manual adjustment of polling parameters.
Command parameters:
Parameter | Mandatory? | Type | Description |
poll_size | No | Number | Maximum number of I/Os SNAP passes in a single polling cycle (integer; 1-256) |
poll_ratio | No | Number | The rate at which SNAP poll cycles occur (float; 0 < poll_ratio <= 1) |
max_inflights | No | Number | Maximum number of in-flight I/Os per core (integer; 1-65535) |
max_iog_batch | No | Number | Maximum fairness batch size (integer; 1-4096) |
max_new_ios | No | Number | Maximum number of new I/Os to handle in a single poll cycle (integer; 1-4096) |
snap_actions_set cannot be used while the ML Optimizer Online is enabled.
snap_reward_get
The snap_reward_get command retrieves performance counters, specifically completion counters (or "reward"), which are used by the optimizer to monitor and enhance SNAP performance.
No parameters are required for this command.
Run the ML Optimizer
To optimize SNAP’s parameters for your environment, use the following command:
python3 snap_ml.py --framework ax --total-trials 40 --filename example.json --remote <bf_hostname>:<username>:<password> --log-dir <log_directory>
Results and Post-optimization Actions
Once the optimization process is complete, the tool automatically applies the optimized parameters. These parameters are also saved in an example.json file in the following format:
{
"poll_size": 30,
"poll_ratio": 0.6847347955107689,
"max_inflights": 32768,
"max_iog_batch": 512,
"max_new_ios": 32
}
Additionally, the tool documents all iterations, including the actions taken and the rewards received, in a timestamped file named example_<timestamp>.json.
Applying Optimized Parameters Manually
Users can apply the optimized parameters on fresh instances of SNAP service by explicitly calling the snap_actions_set RPC with the optimized parameters as follows:
snap_rpc.py snap_actions_set --poll_size 30 --poll_ratio 0.6847 --max_inflights 32768 --max_iog_batch 512 --max_new_ios 32
It is only recommended to use the optimized parameters if the system is expected to behave similarly to the system on which the SNAP ML optimizer is used.
Deactivating Python Environment
Once users are done using the SNAP ML Optimizer, they can deactivate the Python virtual environment by running:
deactivate
Plugins are modular components or add-ons that enhance the functionality of the SNAP application. They integrate seamlessly with the main software, allowing additional features without requiring changes to the core codebase. Plugins are designed for use only with the source package, as it allows customization during the build process, such as enabling or disabling plugins as needed.
In containerized environments, the SNAP application is shipped as a pre-built binary with a fixed configuration. Since the binary in the container is precompiled, adding or removing plugins is not possible. The containerized software only supports the plugins included during its build. For environments requiring plugin flexibility, such as adding custom plugins, the source package must be used.
To build a SNAP source package with a plugin, perform the following steps instead of the basic build steps:
Move to the sources folder. Run:
cd /opt/nvidia/nvda_snap/src/
Build the sources with plugin to be enabled. Run:
meson setup /tmp/build -Denable-bdev-null=true -Denable-bdev-malloc=true
Compile the sources. Run:
meson compile -C /tmp/build
Install the sources. Run:
meson install -C /tmp/build
Configure the SNAP environment variables and run SNAP service as explained in sections "Configure SNAP Environment Variables" and "Run SNAP Service".
Bdev
SNAP supports various types of block devices (bdev), offering flexibility and extensibility in interacting with storage backends. These bdev plugins provide different storage emulation options, allowing customization without requiring modifications to the core software.
SPDK
SPDK is the default plugin used by SNAP. If no specific plugin is explicitly specified, SNAP will default to using SPDK for block device operations.
For more information, refer to spdk_bdev.
Malloc
The Malloc plugin is intended for performance analysis and debugging purposes only; it is not suitable for production use
It creates a memory-backed block device by allocating a buffer in memory and exposing it as a block device
Since data is stored in memory, it is lost when the system shuts down
This plugin can be enabled using the enable-bdev-malloc build option.
Malloc configuration example:
Create Malloc bdev and use it with an NVMe controller:
# snap_rpc.py snap_bdev_malloc_create --bdev test 64 512
# snap_rpc.py nvme_subsystem_create -s nqn.2020-12.mlnx.snap
# snap_rpc.py nvme_namespace_create -s nqn.2020-12.mlnx.snap -t malloc -b test -n 1
# snap_rpc.py nvme_controller_create --pf_id=0 -s nqn.2020-12.mlnx.snap --mdts=7
# snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
Delete Malloc bdev:
# snap_rpc.py snap_bdev_malloc_destroy test
Resize Malloc bdev:
# snap_rpc.py snap_bdev_malloc_resize test 32
This removes the existing bdev and creates a new one with the specified size. Data on the existing bdev will be lost.
NULL
The NULL plugin is designed for performance analysis and debugging purposes and is not intended for production use.
It acts as a dummy block device, accepting I/O requests and emulating a block device without performing actual I/O operations.
It is useful for testing or benchmarking scenarios that do not involve real storage devices.
The plugin consumes minimal system resources.
It can be enabled using the enable-bdev-null build option.
NULL configuration example:
Create a NULL bdev and use it with an NVMe controller:
# snap_rpc.py snap_bdev_null_create_dbg test 1 512
# snap_rpc.py nvme_subsystem_create -s nqn.2020-12.mlnx.snap
# snap_rpc.py nvme_namespace_create -s nqn.2020-12.mlnx.snap -t malloc -b test -n 1
# snap_rpc.py nvme_controller_create --pf_id=0 -s nqn.2020-12.mlnx.snap --mdts=7
# snap_rpc.py nvme_controller_attach_ns -c NVMeCtrl1 -n 1
Delete the NULL bdev:
# snap_rpc.py snap_bdev_null_destroy_dbg test