Setting Up XLIO Within a Docker Container
This guide offers instructions for running NVIDIA Accelerated IO (XLIO) in Docker containers using the NVIDIA DOCA Software Framework.
NVIDIA NICs support Single Root IO Virtualization (SR-IOV), a technology that allows a physical NIC, exposed as a Physical Function (PF), to present multiple virtual instances called Virtual Functions (VFs).
Each VF is a lightweight instance of the PF that appears as a distinct network interface with minimal additional overhead.
PFs are typically used in the host network stack, while VFs are typically used in virtualized environments like Virtual Machines and Docker containers.
See XLIO System Requirements for details on supported NICs and other prerequisites.
Ensure all necessary kernel drivers for NVIDIA hardware, including Mellanox NICs, are installed on the host before deploying containers.
If SR-IOV is required, ensure it is enabled on the SR-IOV-capable NIC on the host; refer to Docker Using SR-IOV for detailed instructions. A brief sketch of VF creation follows these prerequisites.
Make sure you're using the latest DOCA-Host version.
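For illustration, a minimal sketch of creating VFs through sysfs, assuming ens2f0 is an SR-IOV-capable PF interface (exact steps vary by NIC and firmware; the Docker Using SR-IOV guide is authoritative):
#(host) echo 4 | sudo tee /sys/class/net/ens2f0/device/sriov_numvfs # create 4 VFs on the PF
#(host) lspci | grep -i "virtual function" # list the newly created VFs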
Pull the desired container image:
#(host) docker pull <container-image> # e.g. docker pull ubuntu:22.04
Run the container image:
#(host) docker run -it <container-image> /bin/bash
Install the required packages inside the container:
For DEB-based distributions (e.g. Ubuntu):
#(container) <your-package-manager> update
#(container) <your-package-manager> -y install curl gpg iproute2 iputils-ping
For RPM-based distributions (e.g. RHEL and Oracle Linux):
#(container) <your-package-manager> update
#(container) <your-package-manager> -y install iproute iputils
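For example, assuming an Ubuntu 22.04 base image that uses apt-get:
#(container) apt-get update
#(container) apt-get -y install curl gpg iproute2 iputils-ping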
Install the libxlio library and its dependencies inside the container. For more information, see Installing XLIO as libxlio Profile.
Install the sockperf utility inside the container:
#(container) <your-package-manager> -y install sockperf
Exit and save the container image:
Exit the container:
#(container) exit
Identify the container ID and commit the container to a new image:
#(host) docker ps -a
#(host) docker commit -m "Added XLIO" <CONTAINER-ID> xlio-image
This section provides instructions on how to run and configure the newly created xlio-image Docker image, detailing both the required and some of the optional configurations for running the container.
Required Configurations
When running the container with docker run, the following configurations must be specified to ensure proper functionality (a combined example follows the list):
--ulimit memlock=-1: This option sets an unlimited locked-memory limit for the container. For more details, refer to the ULIMIT Considerations section below.
--device=/dev/infiniband: This option grants the container access to all available InfiniBand devices.
Note: Instead of granting access to all available IB devices, you can grant access to specific InfiniBand devices by running:
--device=/dev/infiniband/rdma_cm --device=/dev/infiniband/uverbs0
Note: Make sure to use the correct uverbsX in case you have multiple InfiniBand devices.
--cap-add=NET_RAW --cap-add=NET_ADMIN: These capabilities allow the container to configure and manage network interfaces and to use raw sockets.
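For example, a minimal invocation combining all of the required options (network configuration is covered in the next section):
#(host) docker run -it --ulimit memlock=-1 --device=/dev/infiniband --cap-add=NET_RAW --cap-add=NET_ADMIN xlio-image /bin/bash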
Optional Configurations
XLIO Huge Pages Configuration
XLIO can use Huge Pages to optimize memory allocation and take full advantage of the performance benefits they provide.
Huge Pages are allocated by the host. Once configured, the container can access and use them from the host's memory pool.
To check the current Huge Page configuration settings:
#(host) cat /proc/meminfo | grep HugePages
To allocate a sufficient amount of Huge Pages (preferably a total of 2GB memory):
#(host) echo <number_of_hugepages> | sudo tee /proc/sys/vm/nr_hugepages
To allocate a total of approximately 2GB of huge pages, determine the size of your system's hugepages (usually 2MB) and calculate the required number.
For example, if your system uses 2MB huge pages, you would need to allocate 1024 huge pages to reach a total of 2GB.
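For example, on a system with 2MB huge pages:
#(host) grep Hugepagesize /proc/meminfo # typically reports 2048 kB
#(host) echo 1024 | sudo tee /proc/sys/vm/nr_hugepages # 1024 x 2MB = 2GB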
Network Configurations
Network Configuration Option 1: Host Network
Use the host's network stack, which directly connects the container to the host's networking environment:
Run the Container (xlio-image):
#(host) docker run -it --net=host --cap-add=NET_RAW --cap-add=NET_ADMIN --ulimit memlock=-1 --device=/dev/infiniband xlio-image /bin/bash
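Once inside the container, you can exercise XLIO using the sockperf utility installed earlier. A minimal sketch, assuming libxlio.so is in the container's library search path and <server-ip> is a reachable address (run the server on one endpoint and the client on another):
#(container) LD_PRELOAD=libxlio.so sockperf server -i <server-ip> # server side
#(container) LD_PRELOAD=libxlio.so sockperf ping-pong -i <server-ip> # client side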
Network Configuration Option 2: Custom SR-IOV Docker Network
To run the container in a separate network namespace from the host, you can use a custom Docker network.
NVIDIA provides an SR-IOV Docker plugin that facilitates the creation and management of such networks by automatically allocating and assigning a Virtual Function (VF) to the container.
Ensure that the SR-IOV Prerequisites listed above have been met.
Limitations:
The NVIDIA Docker SR-IOV plugin is supported only on Linux, on x86_64 and ppc64le platforms.
Using a separate network namespace limits access to some net.core parameters under /proc/sys that XLIO reads, causing it to fall back to hardcoded default values and print a warning.
The SR-IOV plugin works only with ConnectX and BlueField series NICs in NIC mode.
QuickStart instructions for creating a custom Docker network with SR-IOV plugin:
a. Ensure you are using Docker 1.9 or later.
b. Pull the SR-IOV plugin (Mellanox/docker-sriov-plugin):
#(host) docker pull rdma/sriov-plugin
c. Run the plugin:
#(host) docker run -v /run/docker/plugins:/run/docker/plugins -v /etc/docker:/etc/docker -v /var/run:/var/run --net=host --privileged rdma/sriov-plugin
d. Create a new Docker network using the SR-IOV plugin as the driver. For example, using the ens2f0 PF-based net device (must be a PF-based interface):
#(host) docker network create --driver sriov --subnet=<subnet> --gateway=<default-gateway-ip> -o netdevice=ens2f0 -o privileged=1 mynet
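For instance, with a hypothetical 192.168.10.0/24 subnet whose default gateway is 192.168.10.1:
#(host) docker network create --driver sriov --subnet=192.168.10.0/24 --gateway=192.168.10.1 -o netdevice=ens2f0 -o privileged=1 mynet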
Notes:
If the custom network subnet has a default gateway, use it as <default-gateway-ip>; it provides external connectivity. Otherwise, omit the --gateway option.
It is important to create the SR-IOV Docker network only after the SR-IOV plugin is already running.
Run the Container (xlio-image):
#(host) docker run -it --net=mynet --ip=<picked-VF-interface-IP> --cap-add=NET_RAW --cap-add=NET_ADMIN --ulimit memlock=-1 --device=/dev/infiniband xlio-image /bin/bash
Note: Ensure that the IP address you assign with --ip is a free IP address in the subnet to avoid conflicts.
Verify that a VF network interface was successfully assigned within the container:
#(container) ip addr show
[Optional] Connect to an additional Docker Network for External Access
If the custom SR-IOV network does not have a default gateway and you need access to external networks, you can connect the container to an additional Docker network (e.g. a second SR-IOV network or the default bridge network).
Identify the container ID and connect the xlio-image container to the bridge network:
#(host) docker ps -a
#(host) docker network connect bridge <container-id>
Verify that the container gained a new network interface to the bridge network:
#(container) ip addr show
Verify that the default route uses the bridge network:
#(container) ip route
If the default gateway is incorrect, update the default route within the container:
Fetch Bridge Network Gateway IP:
#(host) docker network inspect bridge | grep Gateway
Update the default route:
#(container) ip route del default
#(container) ip route add default via <bridge-gateway-ip> dev <bridge-interface>
ULIMIT Considerations
XLIO requires a much higher maximum locked memory ulimit (ulimit -l) than the default. A container does not inherit ulimits from the host (unless it runs in privileged mode), and changing the ulimit value within a running container is not allowed. Therefore, it is preferable to set it to unlimited by running the container with:
--ulimit memlock=-1
where -1 means unlimited locked memory.
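To confirm the setting from inside a running container:
#(container) ulimit -l # expected output: unlimited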
An additional option is to set a default ulimit value for the Docker daemon, which containers then inherit (running a container with --ulimit overrides the daemon's --default-ulimit setting).
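As a sketch, the daemon-wide default can be set in /etc/docker/daemon.json using Docker's standard default-ulimits key, followed by a daemon restart:
{
  "default-ulimits": {
    "memlock": { "Name": "memlock", "Hard": -1, "Soft": -1 }
  }
}
#(host) sudo systemctl restart docker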