NVIDIA Messaging Accelerator (VMA) Documentation Rev 9.8.60

Setting Up XLIO Within a Docker Container

  • This guide offers instructions for running NVIDIA Accelerated IO (XLIO) in Docker containers using the NVIDIA DOCA Software Framework.

  • NVIDIA NICs utilize Single Root IO Virtualization (SR-IOV), a technology that allows a physical NIC (Physical Function, or PF) to present multiple virtual instances (Virtual Functions, or VFs).

    • Each VF is a lightweight instance of the PF that appears as a distinct network interface with no additional overhead.

    • PFs are typically used in the host network stack, while VFs are typically used in virtualized environments like Virtual Machines and Docker containers.

  • Check XLIO System Requirements for details on supported NICs and more.

  • Ensure all necessary kernel drivers for NVIDIA hardware, including Mellanox NICs, are installed on the host before deploying containers.

  • If SR-IOV is required, ensure it is enabled on the SR-IOV-capable NIC on the host. Refer to Docker Using SR-IOV for more instructions.

Note

Make sure you're using the latest DOCA-Host version.

  1. Pull the desired container image:

    #(host) docker pull <container-image> # e.g. docker pull ubuntu:22.04

  2. Run the container image:

    #(host) docker run -it <container-image> /bin/bash

  3. Install the required packages inside the container:

    • For DEB-based distributions (e.g. Ubuntu):

      #(container) <your-package-manager> update
      #(container) <your-package-manager> -y install curl gpg iproute2 iputils-ping

    • For RPM-based distributions (e.g. RHEL and Oracle Linux):

      #(container) <your-package-manager> update
      #(container) <your-package-manager> -y install iproute iputils
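
    For example, on Ubuntu 22.04 the package manager would be apt-get:

      #(container) apt-get update
      #(container) apt-get -y install curl gpg iproute2 iputils-ping

    and on an RPM-based distribution it would typically be dnf:

      #(container) dnf -y update
      #(container) dnf -y install iproute iputils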

  4. Install libxlio library and its dependencies inside the container. For more information, see Installing XLIO as libxlio Profile.

  5. Install sockperf utility inside the container:

      #(container) <your-package-manager> -y install sockperf

  6. Exit and save the container image:

    • Exit the container:

      #(container) exit

    • Identify the container ID and commit the container to a new image:

      #(host) docker ps -a
      #(host) docker commit -m "Added XLIO" <CONTAINER-ID> xlio-image
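
      Optionally, confirm that the new image was created:

      #(host) docker images xlio-image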

This section provides instructions on how to run and configure the newly created xlio-image Docker image, detailing both the required and some of the optional configurations for running the container.

Required Configurations

When running the container with docker run, the following configurations must be specified to ensure proper functionality:

  • --ulimit memlock=-1: Sets an unlimited locked-memory limit for the container. For more details, refer to the ULIMIT Considerations section below.

  • --device=/dev/infiniband: This option grants the container access to available InfiniBand devices.

    Note: Instead of granting access to all available IB devices, you can grant access to specific InfiniBand devices by running:

    --device=/dev/infiniband/rdma_cm --device=/dev/infiniband/uverbs0

    Note: Make sure to use the correct uverbsX if you have multiple InfiniBand devices; you can list the available device files as shown below.

  • --cap-add=NET_RAW --cap-add=NET_ADMIN: These capabilities allow the container to configure and manage network interfaces and to use raw sockets.
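
To list the InfiniBand device files available on the host (useful for identifying the correct uverbsX), run:

#(host) ls /dev/infiniband/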

Optional Configurations

XLIO Huge Pages Configuration

XLIO can take advantage of Huge Pages to optimize memory allocation and improve performance.

Huge Pages are allocated by the host. Once configured, the container can access and use them from the host's memory pool.

  • To check the current Huge Page configuration settings:

    #(host) cat /proc/meminfo | grep HugePages

  • To allocate a sufficient amount of Huge Pages (preferably a total of 2GB memory):

    #(host) echo <number_of_hugepages> | sudo tee /proc/sys/vm/nr_hugepages

    • To allocate a total of approximately 2GB of huge pages, determine the size of your system's hugepages (usually 2MB) and calculate the required number.

      For example, if your system uses 2MB huge pages, you would need to allocate 1024 huge pages to reach a total of 2GB.
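
      A concrete example (assuming a 2MB Huge Page size, as reported by Hugepagesize in /proc/meminfo):

      #(host) echo 1024 | sudo tee /proc/sys/vm/nr_hugepages
      #(host) grep HugePages /proc/meminfo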

Network Configurations

Network Configuration Option 1: Host Network

Use the host's network stack, which directly connects the container to the host's networking environment:

Run The Container (xlio-image):

#(host) docker run -it --net=host --cap-add=NET_RAW --cap-add=NET_ADMIN --ulimit memlock=-1 --device=/dev/infiniband xlio-image /bin/bash
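
As a quick sanity check, you can run a sockperf ping-pong test with XLIO preloaded inside the container. This is a sketch: it assumes sockperf is also installed on the host, that libxlio.so is in the dynamic linker's search path inside the container, and it uses port 11111 and <host-interface-IP> as example values:

#(container) LD_PRELOAD=libxlio.so sockperf server -i <host-interface-IP> -p 11111
#(host) sockperf ping-pong -i <host-interface-IP> -p 11111 -t 10

When XLIO is successfully preloaded, it typically prints an informational banner at startup.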


Network Configuration Option 2: Custom SR-IOV Docker Network

To run the container in a separate network namespace from the host, you can use a custom Docker network.

NVIDIA provides an SR-IOV Docker plugin that facilitates the creation and management of such networks by automatically allocating and assigning a Virtual Function (VF) to the container.

  • Ensure that the SR-IOV Prerequisites listed above have been met.

Limitations:

    1. The NVIDIA Docker SR-IOV plugin is supported ONLY on Linux, on x86_64 and ppc64le platforms.

    2. Using a separate network namespace limits access to some /proc/sys net.core parameters fetched by XLIO, causing it to fall back to hardcoded default values and emit a warning.

    3. The SR-IOV plugin works ONLY with ConnectX and BlueField series devices in NIC mode.

1. QuickStart instructions for creating a custom Docker network with the SR-IOV plugin:

a. Ensure you are using Docker 1.9 or later.

b. Pull the SR-IOV plugin (Mellanox/docker-sriov-plugin):

#(host) docker pull rdma/sriov-plugin

c. Run the plugin:

#(host) docker run -v /run/docker/plugins:/run/docker/plugins -v /etc/docker:/etc/docker -v /var/run:/var/run --net=host --privileged rdma/sriov-plugin

d. Create a new Docker network using the SR-IOV plugin as the driver. For example, using the ens2f0 PF-based net device (must be a PF-based interface):

#(host) docker network create --driver sriov --subnet=<subnet> --gateway=<default-gateway-ip> -o netdevice=ens2f0 -o privileged=1 mynet

Notes:

      • If the custom network subnet has a default gateway, use it as <default-gateway-ip>; it will provide external connectivity. Otherwise, omit the --gateway option.

      • It is important to create the SR-IOV Docker network after the SR-IOV plugin is already running.
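
      For example, assuming the ens2f0 PF sits on the 192.168.1.0/24 subnet with gateway 192.168.1.1 (example values only):

      #(host) docker network create --driver sriov --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o netdevice=ens2f0 -o privileged=1 mynet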

  2. Run The Container (xlio-image):

    #(host) docker run -it --net=mynet --ip=<picked-VF-interface-IP> --cap-add=NET_RAW --cap-add=NET_ADMIN --ulimit memlock=-1 --device=/dev/infiniband xlio-image /bin/bash

    Note: Ensure that the IP address you assign with --ip is a free address in the subnet to avoid conflicts.

3. Verify the successfully assigned VF network interface within the container:

#(container) ip addr show 
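
If the custom network was created with a gateway, you can also confirm basic connectivity from the container:

#(container) ping -c 3 <default-gateway-ip>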


[Optional] Connect to an additional Docker Network for External Access

If the custom SR-IOV network does not have a default gateway and you need access to external networks, you can connect the container to an additional Docker network (e.g. second SR-IOV network or default bridge network) to provide external access.

  1. Identify the container ID and connect xlio-image container to the bridge network:

    #(host) docker ps -a
    #(host) docker network connect bridge <container-id>

  2. Verify that the container gained a new network interface to the bridge network:

    #(container) ip addr show

  3. Verify that the default route uses the bridge network:

    #(container) ip route

    • If the default gateway is incorrect, update the default route within the container:

      • Fetch Bridge Network Gateway IP:

        #(host) docker network inspect bridge | grep Gateway

      • Update the default route:

        #(container) ip route del default
        #(container) ip route add default via <bridge-gateway-ip> dev <bridge-interface>
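
        To confirm external connectivity after updating the route (8.8.8.8 is only an example external address):

        #(container) ping -c 3 8.8.8.8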

ULIMIT Considerations

XLIO requires a much higher maximum locked-memory ulimit (ulimit -l) than the default. A container does not inherit the ulimits of the host (unless it runs in privileged mode), and changing the ulimit value within the container is not allowed. Therefore, it is preferable to set it to unlimited by running the container with:

--ulimit memlock=-1

where -1 means unlimited memory lock.

An additional option is to set a default ulimit value for the Docker daemon, which containers then inherit (running a container with --ulimit overrides the daemon's --default-ulimit).
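
For example, a minimal /etc/docker/daemon.json snippet that sets the default memlock ulimit for all containers (a sketch; restart the Docker daemon after editing it):

{
  "default-ulimits": {
    "memlock": {
      "Name": "memlock",
      "Hard": -1,
      "Soft": -1
    }
  }
}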

© Copyright 2024, NVIDIA. Last updated on Feb 6, 2025.