Live Migration

NVIDIA BlueField Virtio-net v1.9.0

Virtio VF PCIe devices can be attached to the guest VM using the vhost acceleration software stack. This enables performing live migration of guest VMs.

(Figure: Virtio VF PCIe devices for vhost acceleration)

This section provides the steps to enable VM live migration using virtio VF PCIe devices along with vhost acceleration software.

(Figure: vDPA over virtio full emulation design)

Prerequisites

  • Minimum hypervisor kernel version – Linux kernel 5.7 (for VFIO SR-IOV support); a quick version check is shown after this list

  • To use high availability (the additional vfe-vhostd-ha service, which can keep the datapath alive if vfe-vhostd crashes), this kernel patch must be applied.
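
To confirm the hypervisor meets the kernel requirement, a quick check using standard Linux tooling (output varies per system; 5.7 or newer is required):

  [host]# uname -r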

Install vHost Acceleration Software Stack

The vhost acceleration software stack is built on open-source, BSD-licensed DPDK.

  • To install vhost acceleration software:

    1. Clone the software source code:

      [host]# git clone https://github.com/Mellanox/dpdk-vhost-vfe

      Info

      The latest release tag is vfe-1.2 (see the checkout sketch after this list).

    2. Build software:

      # On Debian/Ubuntu hosts:
      [host]# apt-get install libev-dev
      # On RHEL/CentOS hosts:
      [host]# yum install -y numactl-devel libev-devel

      [host]# meson build --debug -Denable_drivers=vdpa/virtio,common/virtio,common/virtio_mi,common/virtio_ha
      [host]# ninja -C build install

  • To install QEMU:

    Info

    Upstream QEMU newer than 8.1 can be used, or the following NVIDIA QEMU build.

    1. Clone NVIDIA QEMU sources.

      [host]# git clone https://github.com/Mellanox/qemu -b stable-8.1-presetup

      Info

      The latest release tag is vfe-0.6 (see the checkout sketch after this list).

    2. Build NVIDIA QEMU.

      [host]# mkdir bin
      [host]# cd bin
      [host]# ../configure --target-list=x86_64-softmmu --enable-kvm
      [host]# make -j24
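
If you want to build from the release tags mentioned in the Info notes above rather than the default branches, check the tags out before running the build steps. A sketch, assuming both repositories were cloned into the same parent directory as in the clone commands above:

  [host]# cd dpdk-vhost-vfe && git checkout vfe-1.2
  [host]# cd ../qemu && git checkout vfe-0.6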

Configure vHost and DPU System

  1. Configure BlueField for virtio-net. Refer to Virtio-net Deployment for the relevant mlxconfig settings.

  2. Set up the hypervisor system:

    1. Configure hugepages and the libvirt VM XML. See OVS-Kernel Hardware Offloads for details.

    2. Enable qemu:commandline in the VM XML by adding the xmlns:qemu option:

      <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

    3. Enable 1GB hugepages in the VM XML by adding the memoryBacking and numa sections (host-side allocation of these hugepages is sketched after this list):

      <currentMemory unit='KiB'>8388608</currentMemory>
      <memoryBacking>
        <hugepages>
          <page size='1048576' unit='KiB'/>
        </hugepages>
      </memoryBacking>
      <vcpu placement='static'>16</vcpu>
      <!-- ...... -->
      <cpu mode='host-passthrough' check='none' migratable='on'>
        <numa>
          <cell id='0' cpus='0-15' memory='8388608' unit='KiB' memAccess='shared'/>
        </numa>
      </cpu>

    4. Add a virtio-net interface in the VM XML using qemu:commandline:

      <qemu:commandline>
        <qemu:arg value='-chardev'/>
        <qemu:arg value='socket,id=char0,path=/tmp/vhost-net0,server=on'/>
        <qemu:arg value='-netdev'/>
        <qemu:arg value='type=vhost-user,id=vdpa,chardev=char0,queues=4'/>
        <qemu:arg value='-device'/>
        <qemu:arg value='virtio-net-pci,netdev=vdpa,mac=00:00:00:00:33:00,vectors=10,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024,mq=on,bus=pci.0,addr=0x9'/>
      </qemu:commandline>
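
The 1GB hugepages referenced in the XML above must also be reserved on the hypervisor. A minimal sketch using the standard sysfs interface (the 8 pages match the 8388608 KiB of VM memory configured above; 1GB pages are often reserved at boot instead, via kernel parameters such as default_hugepagesz=1G hugepagesz=1G hugepages=8):

  # Reserve eight 1GB hugepages at runtime and verify the pool
  [host]# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  [host]# grep -i huge /proc/meminfo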

Run vHost Acceleration Service

  1. Bind the virtio PF devices to the vfio-pci driver:

    [host]# modprobe vfio vfio_pci
    [host]# echo 1 > /sys/module/vfio_pci/parameters/enable_sriov

    [host]# echo 0x1af4 0x1041 > /sys/bus/pci/drivers/vfio-pci/new_id
    [host]# echo 0x1af4 0x1042 > /sys/bus/pci/drivers/vfio-pci/new_id

    [host]# echo <pf_bdf> > /sys/bus/pci/drivers/vfio-pci/bind
    [host]# echo <vf_bdf> > /sys/bus/pci/drivers/vfio-pci/bind

    [host]# lspci -vvv -s <pf_bdf> | grep "Kernel driver"
    Kernel driver in use: vfio-pci
    [host]# lspci -vvv -s <vf_bdf> | grep "Kernel driver"
    Kernel driver in use: vfio-pci

    Info

    Example of <pf_bdf> or <vf_bdf> format: 0000:af:00.3

  2. Enable SR-IOV and create the VF(s):

    [host]# echo 1 > /sys/bus/pci/devices/<pf_bdf>/sriov_numvfs
    [host]# echo 1 > /sys/bus/pci/devices/<vf_bdf>/sriov_numvfs

    # Example output; the newly created VF appears with its own BDF (<vf_bdf>)
    [host]# lspci | grep Virtio
    0000:af:00.1 Ethernet controller: Red Hat, Inc. Virtio network device
    0000:af:00.3 Ethernet controller: Red Hat, Inc. Virtio network device

  3. Add a VF representor to the OVS bridge on the BlueField:

    [DPU]# virtnet query -p 0 -v 0 | grep sf_rep_net_device
    "sf_rep_net_device": "en3f0pf0sf3000",
    [DPU]# ovs-vsctl add-port ovsbr1 en3f0pf0sf3000

  4. Run the vhost acceleration software service:

    [host]# cd dpdk-vhost-vfe
    [host]# sudo ./build/app/dpdk-vfe-vdpa -a 0000:00:00.0 --log-level=.,8 --vfio-vf-token=cdc786f0-59d4-41d9-b554-fed36ff5e89f -- --client

    Or start the vfe-vhostd service:

    [host]# systemctl start vfe-vhostd

    Info

    A log of the service can be viewed by running the following:

    [host]# journalctl -u vfe-vhostd

  5. Provision the virtio-net PF and VF:

    [host]# cd dpdk-vhost-vfe

    [host]# python ./app/vfe-vdpa/vhostmgmt mgmtpf -a <pf_bdf>
    # Wait for virtio-net-controller to finish handling the PF FLR

    # On the DPU, change the VF MAC address or other device options
    [DPU]# virtnet modify -p 0 -v 0 device -m 00:00:00:00:33:00

    # Add the VF to vfe-dpdk
    [host]# python ./app/vfe-vdpa/vhostmgmt vf -a 0000:af:04.5 -v /tmp/vhost-net0

    Note

    If SR-IOV is disabled and re-enabled, the VFs must be re-provisioned. 00:00:00:00:33:00 is the virtual MAC address used in the VM XML.

Start the VM

[host]# virsh start <vm_name>
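
Once the VM is up, a quick sanity check that it is running and that QEMU has created the vhost-user socket configured in the VM XML (QEMU acts as the socket server because of server=on):

  [host]# virsh list --all
  [host]# ls -l /tmp/vhost-net0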


HA Service

Running the vfe-vhostd-ha service allows the datapath to persist should vfe-vhostd crash:

[host]# systemctl start vfe-vhostd-ha
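
To confirm the HA service is active and to inspect its log (assuming it logs to the systemd journal like vfe-vhostd), standard systemd tooling can be used:

  [host]# systemctl status vfe-vhostd-ha
  [host]# journalctl -u vfe-vhostd-ha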


Simple Live Migration

  1. Prepare two identical hosts and provision the virtio devices to DPDK on both.

  2. Boot the VM on one server, then live-migrate it to the destination host:

    [host]# virsh migrate --verbose --live --persistent <vm_name> qemu+ssh://<dest_node_ip_addr>/system --unsafe
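
After the migration completes, a quick check from the source host that the VM is now running on the destination (reusing the <dest_node_ip_addr> placeholder from the command above):

  [host]# virsh --connect qemu+ssh://<dest_node_ip_addr>/system list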

Remove Device

When finished with the virtio devices, use the following commands to remove them from DPDK:

[host]# python ./app/vfe-vdpa/vhostmgmt vf -r <vf_bdf>
[host]# python ./app/vfe-vdpa/vhostmgmt mgmtpf -r <pf_bdf>
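
If the host's SR-IOV configuration should also be torn down afterwards, a minimal sketch using the standard sysfs interface (ordinary Linux SR-IOV handling, not part of the vhostmgmt tooling):

# Remove the VFs created earlier on this PF
[host]# echo 0 > /sys/bus/pci/devices/<pf_bdf>/sriov_numvfs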


© Copyright 2024, NVIDIA. Last updated on Jun 18, 2024.