Recovery

NVIDIA BlueField Virtio-net v1.9.0

Recovery restores device state (both control plane and data plane) after events such as controller restart, live update, or live migration.

Recovery depends on the JSON files stored in /opt/mellanox/mlnx_virtnet/recovery, with one file for each device (either PF or VF). Each file is named after the unique VUID of its device.
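As a rough illustration of this layout, the following Python sketch reads every recovery file in the directory and keys the parsed content by filename. It is a read-only inspection aid, not part of the controller. The helper name `load_recovery_files` is hypothetical, and the sketch assumes the filename is the bare VUID with no extension.

```python
import json
from pathlib import Path

# Directory named in the text above.
RECOVERY_DIR = Path("/opt/mellanox/mlnx_virtnet/recovery")


def load_recovery_files(directory=RECOVERY_DIR):
    """Return {vuid: parsed JSON} for each recovery file (read-only).

    Assumption: the filename is the bare VUID of the device.
    """
    directory = Path(directory)
    devices = {}
    if not directory.is_dir():
        return devices
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        try:
            devices[path.name] = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            # Skip files that cannot be read or are not valid JSON.
            continue
    return devices


if __name__ == "__main__":
    for vuid, entry in load_recovery_files().items():
        print(vuid, entry.get("function_type"), entry.get("mac"))
```

The sketch only reads the files, in keeping with the note that recovery files should not be modified.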

The following entries are saved to the recovery file and restored when necessary:

| Entry | Type | Description |
|---|---|---|
| port_ib_dev | String | RDMA device name the virtio-net device is created on |
| pf_id | Number | ID of PF |
| vf_id | Number | ID of VF, valid for VF only |
| function_type | String | PF or VF |
| bdf_raw | Number | Virtio-net device bus:device:function in uint16 type |
| device_type | String | Static or hotplug (only for PF) |
| mac | String | MAC address of device |
| pf_num | Number | PCIe function number |
| sf_num | Number | SF number which was used for this virtio-net device |
| mq | Number | Number of multi-queue created for this virtio-net device |
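The entry types listed above can be written down as a small lookup for sanity-checking a parsed recovery file. This is a minimal sketch, not the controller's own validation. The names `EXPECTED_TYPES` and `check_entry_types` are hypothetical, and the sketch assumes every "Number" entry is an integer.

```python
# Expected Python types per entry, transcribed from the table above.
# Assumption: every "Number" entry is an integer.
EXPECTED_TYPES = {
    "port_ib_dev": str,
    "pf_id": int,
    "vf_id": int,
    "function_type": str,
    "bdf_raw": int,
    "device_type": str,
    "mac": str,
    "pf_num": int,
    "sf_num": int,
    "mq": int,
}


def check_entry_types(entry):
    """Return a list of type problems; an empty list means none were found."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in entry:
            # Some entries are conditional (for example, vf_id is VF-only),
            # so a missing key is not treated as an error here.
            continue
        value = entry[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

Missing keys are deliberately tolerated because the table marks some entries as valid only for a PF or only for a VF.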

An example of a recovery file for a hotplug PF device:


{
  "port_ib_dev": "mlx5_0",
  "pf_id": 0,
  "function_type": "pf",
  "bdf_raw": 57611,
  "device_type": "hotplug",
  "mac": "0c:c4:7a:ff:22:93",
  "pf_num": 0,
  "sf_num": 2000,
  "mq": 3
}
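To make the bdf_raw value easier to read, the sketch below parses the example above and splits the uint16 into bus, device, and function. It assumes the conventional PCI packing (8-bit bus, 5-bit device, 3-bit function), which the source does not state explicitly. The helper name `decode_bdf` is hypothetical.

```python
import json

# The hotplug PF example from the text above.
SAMPLE = """
{ "port_ib_dev": "mlx5_0", "pf_id": 0, "function_type": "pf",
  "bdf_raw": 57611, "device_type": "hotplug", "mac": "0c:c4:7a:ff:22:93",
  "pf_num": 0, "sf_num": 2000, "mq": 3 }
"""


def decode_bdf(bdf_raw):
    """Format a uint16 BDF as 'bb:dd.f'.

    Assumption: conventional PCI packing, with the bus in bits 15..8,
    the device in bits 7..3, and the function in bits 2..0.
    """
    if not 0 <= bdf_raw <= 0xFFFF:
        raise ValueError("bdf_raw must fit in 16 bits")
    bus = (bdf_raw >> 8) & 0xFF
    device = (bdf_raw >> 3) & 0x1F
    function = bdf_raw & 0x07
    return f"{bus:02x}:{device:02x}.{function}"


entry = json.loads(SAMPLE)
print(decode_bdf(entry["bdf_raw"]))  # prints e1:01.3
```

Under that assumption, 57611 (0xE10B) corresponds to bus 0xe1, device 1, function 3.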

Depending on the actions of the BlueField or host, recovery may or may not be performed. The following table lists the outcome for each scenario:

Each column is a DPU action or a host action.

| Device | Restart Controller | Live Update | Hot Unplug | Destroy VFs | Unload Driver | Power Cycle Host & DPU | Warm Reboot | Live Migration |
|---|---|---|---|---|---|---|---|---|
| Static PF | Recover | Recover | N/A | N/A | Recover | No recover | Recover | Recover |
| Hotplug PF | Recover | Recover | No recover | N/A | Recover | No recover | Recover | Recover |
| VF | Recover | Recover | N/A | Recovery file deleted | No recover | No recover | No recover | Recover |

Note

These recovery files are internal to the controller and should not be modified.

Note

Controller recovery is enabled by default and requires no user configuration or intervention. When the mlxconfig settings used by the controller take effect, the newly started controller service automatically deletes all recovery files.

© Copyright 2024, NVIDIA. Last updated on Jun 18, 2024.