The recovery feature enables the restoration of controller state after the SNAP application terminates, either gracefully or unexpectedly (e.g., due to kill -9).

Note Recovery is only possible if the SNAP application is restarted with the exact same configuration that was active prior to the shutdown or crash.

NVMe recovery enables the restoration of an NVMe controller after a SNAP application terminates, whether gracefully or due to a crash (e.g., kill -9).

To perform NVMe recovery:

Re-create the controller in a suspended state using the exact same configuration as before the crash (including the same bdevs, number of queues, namespaces, and namespace UUIDs). Resume the controller only after all namespaces have been attached.
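The order of operations above can be sketched as a short shell function. This is a minimal sketch rather than a definitive procedure: the RPC names (nvme_controller_create, nvme_controller_attach_ns, nvme_controller_resume) and the --nqn, --ctrl, --pf_id, --suspended, -c, and -n flags are assumptions that should be checked against snap_rpc.py --help on the installed release, and the function name nvme_recover is hypothetical. It also assumes the bdevs, subsystem, and namespaces have already been re-created with their original parameters and UUIDs. Setting SNAP_RPC=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: RPC names and flags below are assumptions; check `snap_rpc.py --help`.
# SNAP_RPC can be overridden (e.g. SNAP_RPC=echo) for a dry run.
SNAP_RPC="${SNAP_RPC:-snap_rpc.py}"

# Usage: nvme_recover <nqn> <ctrl_name> <pf_id> <nsid> [<nsid> ...]
nvme_recover() {
    nqn="$1"; ctrl="$2"; pf_id="$3"
    shift 3

    # 1. Re-create the controller suspended, with the pre-crash parameters.
    $SNAP_RPC nvme_controller_create --nqn "$nqn" --ctrl "$ctrl" \
        --pf_id "$pf_id" --suspended || return 1

    # 2. Attach every namespace that was attached before the crash.
    for nsid in "$@"; do
        $SNAP_RPC nvme_controller_attach_ns -c "$ctrl" -n "$nsid" || return 1
    done

    # 3. Resume only after all namespaces are attached.
    $SNAP_RPC nvme_controller_resume -c "$ctrl"
}
```

Stopping at the first failed step keeps the controller suspended, so it is never resumed with a partial set of namespaces.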

The recovery process uses shared memory files located under /dev/shm on the BlueField to restore the controller's internal state. These files are deleted when the BlueField is reset, so recovery is not supported after a BlueField reset.

To use virtio-blk recovery, the controller must be re-created with the same configuration as before the crash (i.e., the same bdevs, number of queues, etc.).

The following options are available to enable virtio-blk crash recovery.

For virtio-blk crash recovery with --force_in_order, disable the VBLK_RECOVERY_SHM environment variable and create a controller with the --force_in_order argument.

In virtio-blk SNAP, the application is not guaranteed to recover correctly after a sudden crash (e.g., kill -9).

To enable virtio-blk crash recovery, create the controller with the --force_in_order flag:

snap_rpc.py virtio_blk_controller_create --force_in_order …

Note Setting force_in_order to 1 may reduce virtio-blk performance because commands are served in order.

Note If --force_in_order is not used, any failure or unexpected teardown in SNAP or the driver may result in anomalous behavior because of limited support in the Linux kernel virtio-blk driver.





For virtio-blk crash recovery without --force_in_order, enable the VBLK_RECOVERY_SHM environment variable and create a controller without the --force_in_order argument.

Virtio-blk recovery allows the virtio-blk controller to be recovered after a SNAP application is closed, whether gracefully or after a crash (e.g., kill -9).

To use virtio-blk recovery without the --force_in_order flag, VBLK_RECOVERY_SHM must be enabled and the controller must be re-created with the same configuration as before the crash (i.e., the same bdevs, number of queues, etc.).

When VBLK_RECOVERY_SHM is enabled, virtio-blk recovery uses files on the BlueField under /dev/shm to recover the internal state of the controller. Shared memory files are deleted when the BlueField is reset. For this reason, recovery is not supported after BlueField reset.
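The two virtio-blk options are mutually exclusive, which the following sketch checks before a controller is re-created. It assumes that VBLK_RECOVERY_SHM uses 1 for enabled and 0 (or unset) for disabled, which the text does not specify; the function name vblk_recovery_mode and its argument are hypothetical. The variable is presumably read by the SNAP application itself, so in practice it would need to be set in that process's environment rather than only in the shell that issues RPCs.

```shell
#!/bin/sh
# Sketch only: assumes VBLK_RECOVERY_SHM=1 means enabled, 0 or unset means disabled.
# Usage: vblk_recovery_mode <force_in_order: 0|1>
# Prints the recovery option that the given combination selects.
vblk_recovery_mode() {
    force_in_order="$1"
    shm="${VBLK_RECOVERY_SHM:-0}"

    if [ "$force_in_order" = "1" ] && [ "$shm" = "1" ]; then
        echo "error: disable VBLK_RECOVERY_SHM when using --force_in_order" >&2
        return 1
    elif [ "$force_in_order" = "1" ]; then
        echo "force_in_order"
    elif [ "$shm" = "1" ]; then
        echo "shm"
    else
        echo "none"   # no crash-recovery option selected
    fi
}
```

A result of "none" corresponds to the case described in the note above, where an unexpected teardown may lead to anomalous behavior.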

SNAP can store its configuration as defined by user RPCs and, upon restart, reload it from a configuration JSON file. This mechanism is intended for recovering a previously configured SNAP state — it cannot be used for the initial configuration.

Usage:

Set the environment variable SNAP_RPC_INIT_CONF_JSON to the directory path where the configuration file will be stored. The configuration file, snap_config.json , is created in this directory after all changes in your script have been successfully applied. If a new configuration (different from the pre-shutdown configuration) is required after restarting SNAP, delete the existing snap_config.json file before applying the new settings.
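Clearing the saved file before applying a different configuration might look like the following sketch. The function name reset_snap_config is hypothetical; only the SNAP_RPC_INIT_CONF_JSON variable and the snap_config.json file name come from the text above.

```shell
#!/bin/sh
# Sketch only: removes the saved configuration so SNAP does not reload it on restart.
# Relies on SNAP_RPC_INIT_CONF_JSON pointing at the directory that holds snap_config.json.
reset_snap_config() {
    if [ -z "${SNAP_RPC_INIT_CONF_JSON:-}" ]; then
        echo "error: SNAP_RPC_INIT_CONF_JSON is not set" >&2
        return 1
    fi
    # Strip one trailing slash so the path is well-formed either way.
    conf_file="${SNAP_RPC_INIT_CONF_JSON%/}/snap_config.json"

    if [ -f "$conf_file" ]; then
        rm -f -- "$conf_file"
        echo "removed $conf_file"
    else
        echo "no saved configuration at $conf_file"
    fi
}
```

Refusing to run when the variable is unset avoids deleting a snap_config.json from an unintended directory.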

When this method is used, there is no need to re-run snap RPCs or set RPCs in init files after the initial configuration — SNAP will automatically load the saved configuration from the SNAP_RPC_INIT_CONF_JSON path. This approach is recommended for fast recovery.

Warning When modifying controller or function configurations, ensure the driver remains unloaded until the configuration process is complete. If the change is interrupted, recovery may fail.

Warning Hotplugged emulation functions persist between SNAP runs (but not across BlueField resets) and should be set only once during initial configuration. Only controllers created on these functions are stored in the saved configuration state.

Warning If crash recovery after a reboot is required, store the file inside the container at /etc/nvda_snap. Otherwise, store it in a temporary location such as /tmp/ or /dev/shm.





The following table outlines features designed to accelerate SNAP initialization and recovery processes following termination.