High Availability
High availability (HA) is essential in network infrastructure to ensure continuous performance with minimal downtime, even during failures.
To support HA, the virtio-net-controller process creates the auxiliary processes virtio-net-emu and virtio-net-ha. The virtio-net-emu process handles primary controller functions, while virtio-net-ha manages HA. virtio-net-ha saves and oversees critical resources from virtio-net-emu and restores it to a working state if a failure occurs. The two processes communicate through IPC messages.
High availability is supported only on BlueField-3 and later.
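The division of labor described above (one process does the work, a second keeps a copy of its critical state and brings it back after a crash) can be illustrated with a minimal, generic sketch. This is not the virtio-net-controller implementation: the worker script, the JSON-over-pipe message format, and the names `WORKER_SRC` and `supervise` are all hypothetical, chosen only to show the save-and-restore pattern.

```python
import json
import subprocess
import sys

# Hypothetical worker standing in for the emulation process. It resumes from
# the state it is given, creates devices until three exist, and reports its
# state after every change. When asked to, it dies abruptly after the second
# device to simulate a crash.
WORKER_SRC = """
import json, os, sys
state = json.loads(sys.argv[1])
crash = sys.argv[2] == "1"
while len(state["devices"]) < 3:
    state["devices"].append("dev%d" % len(state["devices"]))
    print(json.dumps(state), flush=True)  # IPC message: latest state
    if crash and len(state["devices"]) == 2:
        os._exit(1)  # abrupt exit, no cleanup
"""


def supervise(max_restarts=3):
    """Hypothetical HA role: save the worker's state, restart it on failure."""
    saved = {"devices": []}  # last state reported by the worker
    restarts = 0
    inject_crash = True      # only the first run is told to crash
    while True:
        cmd = [sys.executable, "-c", WORKER_SRC,
               json.dumps(saved), "1" if inject_crash else "0"]
        with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
            for line in proc.stdout:
                saved = json.loads(line)  # keep the newest snapshot
        if proc.returncode == 0:
            return saved, restarts
        if restarts >= max_restarts:
            raise RuntimeError("worker keeps failing; giving up")
        restarts += 1
        inject_crash = False  # restart the worker from the saved snapshot
```

Calling `supervise()` runs the worker, observes the simulated crash, and restarts it from the last snapshot, so the devices created before the crash are not recreated. A pipe is used here only because it is the simplest portable IPC channel in Python; the document does not specify which mechanism the controller uses.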
The following table lists the expected behavior for each failure scenario:
| Scenario | Behavior | Downtime per Device (sec) | Fallback Action |
|---|---|---|---|
| virtio-net-emu process crashes (e.g., segfault) | The | < 1 | The |
| Device/VQ/SF create/destroy failures | HA ensures that existing devices are not affected | N/A | Retry or restart the service |
| DPA command timeout | No action from HA; DPA is likely stuck | N/A | The |