NVIDIA BlueField-2 SNAP for NVMe and Virtio-blk v3.8.0-5

Appendix – Live Upgrade

Warning

The following procedure is designed for live deployment of small software bug fixes or modifications made in the SNAP application. Using this procedure for other purposes (e.g., bumping SNAP service to a new version on top of an older BFB image) may cause SNAP to malfunction.

To live upgrade SNAP, two SNAP processes must run in parallel.

Note

All system resources (e.g., hugepages, memory) must be sufficient to temporarily support 2 SNAP application instances operating in parallel during the upgrade procedure.
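As a rough pre-flight check for the note above, the sketch below reads the free hugepage count from a meminfo-format file and compares it against a caller-supplied threshold. The function names are hypothetical, and the threshold is an assumption: this document does not state how many hugepages a SNAP instance needs, so take that value from your own deployment.

```shell
# Hypothetical helpers, not part of SNAP.

# Print the number of free hugepages listed in a meminfo-format file.
# $1: file to read (defaults to /proc/meminfo).
hugepages_free() {
    awk '/^HugePages_Free:/ {print $2}' "${1:-/proc/meminfo}"
}

# Succeed only if at least $1 hugepages are free.
# $1: required count (deployment-specific assumption).
# $2: file to read (defaults to /proc/meminfo).
has_free_hugepages() {
    required=$1
    free=$(hugepages_free "${2:-/proc/meminfo}")
    [ "${free:-0}" -ge "$required" ]
}
```

For example, `has_free_hugepages 1024 || echo "not enough free hugepages for a second SNAP instance"` could be run on the Arm before starting the second process.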

  1. Open 2 SNAP processes simultaneously on the Arm.

    Note

    This requires giving each process its own SPDK RPC server path (the -r option in the commands below).

    Info

    For lower downtime, it is highly recommended to run each process on a different CPU mask.

    For SNAP Process 1, run:

    ./mlnx_snap_emu -m 0xf0 -r /var/tmp/spdk.sock1

    For SNAP Process 2, run:

    ./mlnx_snap_emu -m 0x0f -r /var/tmp/spdk.sock2

  2. Connect both processes to the same bdev (e.g., a Malloc device).

    For SNAP Process 1, run:

    spdk_rpc.py -s /var/tmp/spdk.sock1 bdev_malloc_create -b Malloc1 1024 512

    For SNAP Process 2, run:

    spdk_rpc.py -s /var/tmp/spdk.sock2 bdev_malloc_create -b Malloc1 1024 512

  3. Open a virtio-blk controller on SNAP Process 1:

    snap_rpc.py -s /var/tmp/spdk.sock1 controller_virtio_blk_create mlx5_0 --pf_id 0 --bdev_type spdk --bdev Malloc1 --num_queues 16

  4. Load the virtio-blk driver on the host side and start using it.

  5. Delete the virtio-blk controller instance from SNAP Process 1 and immediately open a virtio-blk controller on SNAP Process 2:

    snap_rpc.py -s /var/tmp/spdk.sock1 controller_virtio_blk_delete VblkEmu0pf0 --force && snap_rpc.py -s /var/tmp/spdk.sock2 controller_virtio_blk_create mlx5_0 --pf_id 0 --bdev_type spdk --bdev Malloc1 --num_queues 16
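The steps above can be sketched as two small shell helpers: one checks that the two CPU masks do not share cores, and one wraps the delete-then-create handover from step 5. The function names and the `SNAP_RPC` variable are hypothetical conveniences; the RPC commands and options are the ones shown in the steps, including the `mlx5_0` emulation manager used there.

```shell
# Hypothetical helpers, not part of SNAP.
# SNAP_RPC can be overridden to point at a different snap_rpc.py location.
SNAP_RPC=${SNAP_RPC:-snap_rpc.py}

# Succeed when two CPU masks (e.g., 0xf0 and 0x0f) share no cores.
masks_disjoint() {
    [ $(( $1 & $2 )) -eq 0 ]
}

# Pass a virtio-blk controller from one SNAP process to another.
# Usage: pass_vblk_ctrl OLD_SOCK NEW_SOCK CTRL_NAME PF_ID BDEV NUM_QUEUES
# The create call runs only if the delete call succeeds.
pass_vblk_ctrl() {
    $SNAP_RPC -s "$1" controller_virtio_blk_delete "$3" --force &&
    $SNAP_RPC -s "$2" controller_virtio_blk_create mlx5_0 \
        --pf_id "$4" --bdev_type spdk --bdev "$5" --num_queues "$6"
}
```

With the values used in this section, the handover would be `pass_vblk_ctrl /var/tmp/spdk.sock1 /var/tmp/spdk.sock2 VblkEmu0pf0 0 Malloc1 16`. Chaining the two calls with `&&` keeps the gap between delete and create short, and skips the create if the delete fails.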

Assuming a fully configured SNAP service is already running on the system, perform the live upgrade as follows:

  1. Create a local copy of the SNAP binary file (e.g., under the /tmp folder):

    cp /usr/bin/mlnx_snap_emu /tmp/

  2. For all active virtio-blk controllers, follow the management-passing procedure described in section "Passing virtio-blk Controller's Management Between SNAP Processes" to move control from the running SNAP service to the local copy.

  3. Stop the original SNAP service:

    systemctl stop mlnx_snap

  4. Upgrade the SNAP service:

    • If installed from binary, use the official Linux installation framework (apt/yum).

    • If installed from sources, follow the same installation process used originally.

  5. Repeat the management-passing procedure, this time to move control back from the local copy to the official (updated) version of the SNAP service.
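The overall flow above could be scripted roughly as follows. This is only a sketch under several assumptions: `run`, `live_upgrade`, `pass_mgmt_to_copy`, and `pass_mgmt_to_service` are hypothetical names, the two `pass_mgmt_*` hooks must be supplied by the operator (including launching the local copy and bringing up the updated service, which this appendix does not spell out), and the upgrade command for step 4 is passed in by the caller rather than guessed here.

```shell
# Hypothetical wrapper, not part of SNAP.

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: live_upgrade UPGRADE_COMMAND [ARGS...]
# Stops at the first failing step.
live_upgrade() {
    run cp /usr/bin/mlnx_snap_emu /tmp/ || return 1  # step 1: local copy of the binary
    run pass_mgmt_to_copy || return 1                # step 2: operator-supplied hook
    run systemctl stop mlnx_snap || return 1         # step 3: stop the original service
    run "$@" || return 1                             # step 4: caller-supplied upgrade command
    run pass_mgmt_to_service                         # step 5: operator-supplied hook
}
```

Running it first with `DRY_RUN=1` prints the planned steps without touching the system, which makes it easier to review the sequence before a real upgrade.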

© Copyright 2024, NVIDIA. Last updated on Aug 14, 2024.