NVIDIA DOCA-OFED Driver Container
The following are special environment variables supported by the NVIDIA DOCA-OFED Driver container to configure its behavior:
| Name | Default | Description |
|---|---|---|
| CREATE_IFNAMES_UDEV | “true” for Ubuntu 20.04, RHEL v8.x and OCP <= v4.13 | Create a udev rule to preserve “old-style” path-based netdev names, e.g. enp3s0f0 |
| UNLOAD_STORAGE_MODULES | “false” | Unload host storage modules prior to loading NVIDIA DOCA-OFED Driver modules |
| ENABLE_NFSRDMA | “false” | Enable loading of NFS-related storage modules from an NVIDIA DOCA-OFED Driver container |
| RESTORE_DRIVER_ON_POD_TERMINATION | “false” | Restore host drivers when the driver container's pod is terminated |
In addition, it is possible to specify any environment variables to be exposed to the NVIDIA DOCA-OFED Driver container, such as the standard “HTTP_PROXY”, “HTTPS_PROXY”, “NO_PROXY”.
CREATE_IFNAMES_UDEV Environment Variable
CREATE_IFNAMES_UDEV is set automatically by the Network Operator, depending on the operating system of the worker nodes in the cluster (the cluster is assumed to be homogeneous).
ENABLE_NFSRDMA Environment Variable
In the context of the GPUDirect Storage (GDS) feature, only GDS with NFS over RDMA is supported. In this case, NVMe over RDMA cannot be used: the inbox modules cannot be loaded because they depend on an ib_core that does not match, which results in a symbol error. Only NVMe with a local drive is supported.
Example of NicClusterPolicy
These variables can be set in the NicClusterPolicy. For example:
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  ofedDriver:
    env:
    - name: RESTORE_DRIVER_ON_POD_TERMINATION
      value: "false"
    - name: UNLOAD_STORAGE_MODULES
      value: "true"
    - name: CREATE_IFNAMES_UDEV
      value: "true"
Precompiled Container Build Instructions for NVIDIA DOCA-OFED Driver Container
Prerequisites
Before you begin, ensure that you have the following prerequisites:
Docker (Ubuntu) / Podman (RH) installed on your build system.
Web access to the NVIDIA NIC driver sources. The latest NIC drivers are published at NVIDIA DOCA Downloads, for example: https://linux.mellanox.com/public/repo/doca/2.10.0/SOURCES/MLNX_OFED/MLNX_OFED_SRC-debian-25.01-0.6.0.0.tgz
NOTE: NVIDIA NIC driver sources are bundled as part of the NVIDIA DOCA package. Both the DOCA package version and its corresponding NIC driver (DOCA-OFED Driver) version must be specified to fetch the correct driver sources when building the driver container. For example, given a DOCA package version (e.g., 2.10.0), you can find the corresponding MLNX_OFED version at https://linux.mellanox.com/public/repo/doca/2.10.0/SOURCES/MLNX_OFED/ (in this case, 25.01-0.6.0.0).
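The relationship between the two version numbers and the download location can be sketched as a small shell helper. This is only an illustration: the function name doca_src_url is hypothetical, and the URL layout is assumed from the single example URL above, so other releases or distributions may use different file names.

```shell
#!/bin/sh
# Sketch: compose the driver source URL from the DOCA and DOCA-OFED versions.
# Assumption: the repository layout matches the one example URL in this
# document (the "debian" source bundle); verify it for your release.
doca_src_url() {
  doca_version="$1"   # DOCA package version, e.g. 2.10.0
  ofed_version="$2"   # DOCA-OFED driver version, e.g. 25.01-0.6.0.0
  printf 'https://linux.mellanox.com/public/repo/doca/%s/SOURCES/MLNX_OFED/MLNX_OFED_SRC-debian-%s.tgz\n' \
    "$doca_version" "$ofed_version"
}

# Prints the example URL given in the prerequisites.
doca_src_url 2.10.0 25.01-0.6.0.0
```

Before starting a build, the resulting URL can be probed with a HEAD request, for example curl -fsI "$(doca_src_url 2.10.0 25.01-0.6.0.0)", to confirm that the sources exist for the chosen version pair.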
Download Docker files and scripts:
wget https://raw.githubusercontent.com/Mellanox/doca-driver-build/ebde0772891747a60bdf471a92228de8ba54b62c/RHEL_Dockerfile
wget https://raw.githubusercontent.com/Mellanox/doca-driver-build/ebde0772891747a60bdf471a92228de8ba54b62c/Ubuntu_Dockerfile
wget https://raw.githubusercontent.com/Mellanox/doca-driver-build/ebde0772891747a60bdf471a92228de8ba54b62c/SLES_Dockerfile
wget https://raw.githubusercontent.com/Mellanox/doca-driver-build/ebde0772891747a60bdf471a92228de8ba54b62c/entrypoint.sh
wget https://raw.githubusercontent.com/Mellanox/doca-driver-build/ebde0772891747a60bdf471a92228de8ba54b62c/dtk_nic_driver_build.sh
chmod +x entrypoint.sh dtk_nic_driver_build.sh
Dockerfile Overview
To build the precompiled container, the Dockerfile is constructed in a multistage fashion. This approach is used to optimize the resulting container image size and reduce the number of dependencies included in the final image.
The Dockerfile consists of the following stages:
Base Image Update: The base image is updated and common requirements are installed. This stage sets up the basic environment for the subsequent stages.
Download Driver Sources: This stage downloads the NVIDIA DOCA-OFED Driver sources to the specified path. It prepares the necessary files for the driver build process.
Build Driver: The driver is built using the downloaded sources and installed on the container. This stage ensures that the driver is compiled and configured correctly for the target system.
Install precompiled driver: Finally, the precompiled driver is installed on a clean container. This stage sets up the environment to run the NVIDIA NIC drivers on the target system.
Common mandatory build parameters
Before building the container, you need to provide the following parameters to the container build, each passed with --build-arg:
D_OS: The Linux distribution (e.g., ubuntu22.04 / rhel9.2)
D_ARCH: Target architecture the driver is compiled for (e.g., x86_64)
D_BASE_IMAGE: Base container image (e.g., ubuntu:22.04)
D_KERNEL_VER: The target kernel version (e.g., 5.15.0-25-generic / 5.14.0-284.32.1.el9_2.x86_64)
D_DOCA_VERSION: NVIDIA DOCA version (e.g., 2.9.1)
D_OFED_VERSION: NVIDIA NIC drivers version (e.g., 24.10-1.1.4.0)
NOTE: Check that sources for the desired NVIDIA NIC driver version are available for the designated container OS. Only versions available on the download page can be used.
NOTE: For the Network Operator to function properly, the container tag name must follow this pattern: doca<doca_version>-<driver_ver>-<container_ver>-<kernel_ver-os-arch>. For example: doca2.9.1-24.10-1.1.4.0-0-5.15.0-25-generic-ubuntu22.04-amd64
NOTE: The Dockerfiles contain default build parameters, which may cause the build process to fail on your system if they are not overridden.
Any modification of D_OFED_SRC_DOWNLOAD_PATH must be tightly coupled with a corresponding update to the entrypoint.sh script.
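Because the tag has to agree with the build arguments, it can help to derive it from the same values instead of typing it by hand. The following is a minimal sketch; the function names tag_arch and driver_image_tag are hypothetical. The x86_64 to amd64 mapping follows the examples in this document, while the aarch64 to arm64 mapping is an assumption.

```shell
#!/bin/sh
# Map the D_ARCH build argument to the architecture name used in the tag.
# x86_64 -> amd64 follows the examples in this document;
# aarch64 -> arm64 is an assumption. Unknown values pass through unchanged.
tag_arch() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    *)       echo "$1" ;;
  esac
}

# Compose doca<doca_version>-<driver_ver>-<container_ver>-<kernel_ver>-<os>-<arch>.
driver_image_tag() {
  doca_version="$1"    # D_DOCA_VERSION, e.g. 2.9.1
  driver_ver="$2"      # D_OFED_VERSION, e.g. 24.10-1.1.4.0
  container_ver="$3"   # container revision, e.g. 0
  kernel_ver="$4"      # D_KERNEL_VER, e.g. 5.15.0-25-generic
  os="$5"              # D_OS, e.g. ubuntu22.04
  arch="$6"            # D_ARCH, e.g. x86_64
  echo "doca${doca_version}-${driver_ver}-${container_ver}-${kernel_ver}-${os}-$(tag_arch "$arch")"
}

# Prints the example tag from the note above.
driver_image_tag 2.9.1 24.10-1.1.4.0 0 5.15.0-25-generic ubuntu22.04 x86_64
```

The output can be passed directly to the build, for example --tag "$(driver_image_tag ...)", so that the tag and the build arguments cannot drift apart.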
RHCOS
Prerequisites
Install oc CLI tool.
Download OpenShift pull secret.
Specific build parameters
D_BASE_IMAGE: DriverToolKit container image
NOTE: The DTK (DriverToolKit) is tightly coupled with the specific kernel version of an OpenShift release. To get the DTK container image for a specific OpenShift release, run:
oc adm release info <OCP_VERSION> --image-for=driver-toolkit
For example, for OpenShift 4.16.0:
oc adm release info 4.16.0 --image-for=driver-toolkit
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dde3cd6a75d865a476aa7e1cab6fa8d97742401e87e0d514f3042c3a881e301f
Then pull the DTK image locally using your pull-secret:
podman pull --authfile=/path/to/pull-secret.txt docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dde3cd6a75d865a476aa7e1cab6fa8d97742401e87e0d514f3042c3a881e301f
D_FINAL_BASE_IMAGE: Final container image, in which the compiled driver is installed
D_ARCH: Target architecture: x86_64 or aarch64.
D_KERNEL_VER: CoreOS kernel versions for OpenShift are listed here.
Kernel version can also be found with the DTK image using the following command:
podman run --rm -ti quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:47ba8a10d5938c41f907ee7f70d74aecb9c2dfd7afae4cea2942fc8f47256612 cat /etc/driver-toolkit-release.json | jq -r '.KERNEL_VERSION'
5.14.0-427.22.1.el9_4.x86_64
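The kernel version strings shown in this section end with the architecture (for example .x86_64), so a quick consistency check between D_KERNEL_VER and D_ARCH can catch a mismatch before a long build starts. This is a sketch under that assumption only; the helper name kernel_matches_arch is hypothetical, and the check does not apply to kernel strings without an architecture suffix, such as the Ubuntu and SLES ones.

```shell
#!/bin/sh
# Hypothetical pre-build check: succeed only if a RHEL-style kernel version
# string ends with ".<arch>", as in 5.14.0-427.22.1.el9_4.x86_64.
kernel_matches_arch() {
  kernel_ver="$1"   # value intended for D_KERNEL_VER
  arch="$2"         # value intended for D_ARCH
  case "$kernel_ver" in
    *".$arch") return 0 ;;
    *)         return 1 ;;
  esac
}

if kernel_matches_arch 5.14.0-427.22.1.el9_4.x86_64 x86_64; then
  echo "kernel version and architecture agree"
else
  echo "kernel version does not end with the expected architecture" >&2
fi
```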
RHCOS example
NOTE: Since OCP 4.19, RHCOS is based on RHEL 9.6; for those releases, the tag should therefore include the RHEL version instead of the OCP version.
podman build \
--build-arg D_OS=rhcos4.16 \
--build-arg D_ARCH=x86_64 \
--build-arg D_KERNEL_VER=5.14.0-427.22.1.el9_4.x86_64 \
--build-arg D_DOCA_VERSION=2.9.1 \
--build-arg D_OFED_VERSION=24.10-1.1.4.0 \
--build-arg D_BASE_IMAGE="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dde3cd6a75d865a476aa7e1cab6fa8d97742401e87e0d514f3042c3a881e301f" \
--build-arg D_FINAL_BASE_IMAGE=registry.access.redhat.com/ubi9/ubi:9.4 \
--tag doca2.9.1-24.10-1.1.4.0-0-5.14.0-427.22.1.el9_4.x86_64-rhcos4.16-amd64 \
-f RHEL_Dockerfile \
--target precompiled .
Ubuntu
Ubuntu example
docker build \
--build-arg D_OS=ubuntu22.04 \
--build-arg D_ARCH=x86_64 \
--build-arg D_BASE_IMAGE=ubuntu:22.04 \
--build-arg D_KERNEL_VER=5.15.0-25-generic \
--build-arg D_DOCA_VERSION=2.9.1 \
--build-arg D_OFED_VERSION=24.10-1.1.4.0 \
--tag doca2.9.1-24.10-1.1.4.0-0-5.15.0-25-generic-ubuntu22.04-amd64 \
-f Ubuntu_Dockerfile \
--target precompiled .
SLES
Prerequisites
Active subscription. After registering, make sure to run zypper refresh && zypper update -y.
SLES example
docker build \
--build-arg D_OS=sles15.5 \
--build-arg D_ARCH=x86_64 \
--build-arg D_BASE_IMAGE=registry.suse.com/suse/sle15:15.5 \
--build-arg D_KERNEL_VER=5.14.21-150500.55.83-default \
--build-arg D_DOCA_VERSION=2.9.1 \
--build-arg D_OFED_VERSION=24.10-1.1.4.0 \
--tag doca2.9.1-24.10-1.1.4.0-0-5.14.21-150500.55.83-default-sles15.5-amd64 \
-f SLES_Dockerfile \
--target precompiled .