NVIDIA Network Operator v26.4.0

DRA SR-IOV Driver

Dynamic Resource Allocation (DRA) is a Kubernetes concept for flexibly requesting, configuring, and sharing specialized devices like SR-IOV network interfaces. DRA puts device configuration and scheduling into the hands of device vendors through drivers such as the DRA Driver for SR-IOV. This page outlines how to install the NVIDIA DRA Driver for SR-IOV with the NVIDIA Network Operator.

Note

The DRA Driver for SR-IOV is a Tech Preview feature: it has limited testing and is not recommended for production deployments. See Platform Support for support-tier definitions.

Before using the DRA Driver for SR-IOV, you should be familiar with Kubernetes Dynamic Resource Allocation and with SR-IOV networking.

With the DRA Driver for SR-IOV, Kubernetes workloads can allocate and consume SR-IOV Virtual Functions (VFs) from supported NVIDIA network adapters using the native Kubernetes DRA framework.

You can use the DRA Driver for SR-IOV with the SR-IOV Network Operator to deploy and manage your SR-IOV network resources.

Limitations

Warning

This feature is supported only on vanilla Kubernetes deployments that use the SR-IOV Network Operator.

Warning

On GB300, Vera Rubin, and Fractal systems, the PCIe root used to match a NIC to a GPU is not the root of the NIC itself. Instead, it is the PCIe root of the NIC’s Data Direct sub-interface. This applies to ConnectX-8 and later adapters. The DRA SR-IOV driver does not currently support this topology.

Warning

Running the DRA driver and the SR-IOV device plugin on the same cluster at the same time is not supported. When DRA is enabled, the SR-IOV device plugin will not run. It is recommended to delete any existing SriovNetworkNodePolicy resources before enabling DRA.

First, install the Network Operator with NFD, the SR-IOV Network Operator, and DRA enabled, using the following values.yaml:

nfd:
  enabled: true
sriovNetworkOperator:
  enabled: true
  sriovOperatorConfig:
    featureGates:
      dynamicResourceAllocation: true
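The install command itself is not shown on this page. A minimal sketch of a Helm-based install with the values.yaml above might look as follows. The repository alias `nvidia`, the release name `network-operator`, and the chart version argument are assumptions to adapt to your environment, and the helper name `install_network_operator` is hypothetical.

```shell
# Hypothetical helper: install the Network Operator chart using values.yaml.
# $1 is the chart version to install (an assumption; use the version that
# matches your Network Operator release).
install_network_operator() {
  chart_version="$1"
  # The repository alias "nvidia" is an assumption; any alias works.
  helm repo add nvidia https://helm.ngc.nvidia.com/nvidia &&
  helm repo update &&
  helm install network-operator nvidia/network-operator \
    -n nvidia-network-operator --create-namespace \
    --version "$chart_version" \
    -f values.yaml --wait
}
```

Run it as `install_network_operator <chart-version>` from the directory that contains values.yaml.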

Disable the SR-IOV Resources Injector to avoid conflicts with the DRA Driver for SR-IOV:

kubectl patch sriovoperatorconfig default -n nvidia-network-operator --type merge -p '{"spec":{"enableInjector":false}}'
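To check that the patch was applied, the field can be read back from the same object. This is a minimal sketch; the helper name `injector_enabled` is hypothetical, and an empty result may simply mean the API server omits the field when it is false.

```shell
# Hypothetical helper: print the stored value of spec.enableInjector for the
# "default" SriovOperatorConfig. Expect "false" (or empty output if the field
# is omitted when false) after the patch above.
injector_enabled() {
  kubectl get sriovoperatorconfig default \
    -n nvidia-network-operator \
    -o jsonpath='{.spec.enableInjector}'
}
```

Calling `injector_enabled` after the patch should not print `true`.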

Step 1: Create NicClusterPolicy

apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  nvIpam:
    image: nvidia-k8s-ipam
    repository: nvcr.io/nvidia/mellanox
    version: network-operator-v26.4.0
    enableWebhook: false
  secondaryNetwork:
    cniPlugins:
      image: plugins
      repository: nvcr.io/nvidia/mellanox
      version: network-operator-v26.4.0
    multus:
      image: multus-cni
      repository: nvcr.io/nvidia/mellanox
      version: network-operator-v26.4.0

kubectl apply -f nicclusterpolicy.yaml

Step 2: Create IPPool for nv-ipam

apiVersion: nv-ipam.nvidia.com/v1alpha1
kind: IPPool
metadata:
  name: sriov-pool
  namespace: nvidia-network-operator
spec:
  subnet: 192.168.2.0/24
  perNodeBlockSize: 50
  gateway: 192.168.2.1

kubectl apply -f ippool.yaml
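The `subnet` and `perNodeBlockSize` values together bound how many nodes a single pool can serve, since each node receives one block of addresses. The sketch below computes a rough upper bound for an IPv4 pool. The helper name `max_node_blocks` is hypothetical, and the real capacity may be lower because nv-ipam may reserve some addresses (for example the network, broadcast, or gateway address).

```shell
# Hypothetical helper: upper bound on the number of per-node blocks that fit
# in an IPv4 subnet (total addresses divided by the block size, rounded down).
max_node_blocks() {
  prefix_len="$1"   # e.g. 24 for 192.168.2.0/24
  block_size="$2"   # the pool's perNodeBlockSize
  total=$(( 1 << (32 - prefix_len) ))
  echo $(( total / block_size ))
}
```

For the pool above, `max_node_blocks 24 50` prints `5`, so this pool can serve at most five nodes with a block size of 50.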

Step 3: Configure SR-IOV

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: ethernet-sriov
  namespace: nvidia-network-operator
spec:
  deviceType: netdevice
  mtu: 1500
  nodeSelector:
    feature.node.kubernetes.io/pci-15b3.present: "true"
  nicSelector:
    vendor: "15b3"
  isRdma: true
  numVfs: 8
  priority: 90
  resourceName: sriov_resource

kubectl apply -f sriovnetworknodepolicy.yaml

Wait for the SriovNetworkNodeState CRs to finish syncing (the sync status is expected to report Succeeded):

kubectl get sriovnetworknodestates -n nvidia-network-operator
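Instead of polling by hand, `kubectl wait` can block until every node state has finished syncing. This is a sketch under assumptions: it presumes the sync result is exposed as `.status.syncStatus` with the value `Succeeded`, the 15 minute default timeout is arbitrary, and the helper name `wait_for_sriov_sync` is hypothetical.

```shell
# Hypothetical helper: block until all SriovNetworkNodeState objects report a
# successful sync, or until the timeout expires.
# $1 is an optional timeout (default 15m, an arbitrary choice).
wait_for_sriov_sync() {
  timeout="${1:-15m}"
  kubectl wait sriovnetworknodestates --all \
    -n nvidia-network-operator \
    --for=jsonpath='{.status.syncStatus}'=Succeeded \
    --timeout="$timeout"
}
```

Calling `wait_for_sriov_sync 30m` waits up to 30 minutes, which can help on nodes that need a reboot to apply the VF configuration.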

Verify that ResourceSlices are created:

kubectl get resourceslices
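On clusters that run several DRA drivers, it can help to see which driver and node each slice belongs to. A small sketch using `custom-columns` follows; the helper name `list_resource_slices` is hypothetical, and the expectation that SR-IOV slices show the driver `sriovnetwork.k8snetworkplumbingwg.io` is an assumption based on the slice and device class names used on this page.

```shell
# Hypothetical helper: list every ResourceSlice with its driver and node.
list_resource_slices() {
  kubectl get resourceslices \
    -o custom-columns=NAME:.metadata.name,DRIVER:.spec.driver,NODE:.spec.nodeName
}
```

Each node with configured VFs should contribute at least one row for the SR-IOV driver.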

The following is an example of a ResourceSlice created by the DRA SR-IOV driver, showing a single Virtual Function with its attributes:

apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  generateName: c-237-177-60-062-sriovnetwork.k8snetworkplumbingwg.io-
  name: c-237-177-60-062-sriovnetwork.k8snetworkplumbingwg.io-t4mc5
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: c-237-177-60-062
spec:
  devices:
  - attributes:
      dra.net/numaNode:
        int: 0
      resource.kubernetes.io/pciBusID:
        string: "0000:08:00.4"
      resource.kubernetes.io/pcieRoot:
        string: pci0000:00
      sriovnetwork.k8snetworkplumbingwg.io/EswitchMode:
        string: legacy
      sriovnetwork.k8snetworkplumbingwg.io/PFName:
        string: eth2
      sriovnetwork.k8snetworkplumbingwg.io/deviceID:
        string: 101e
      sriovnetwork.k8snetworkplumbingwg.io/linkType:
        string: ethernet
      sriovnetwork.k8snetworkplumbingwg.io/parentPciAddress:
        string: "0000:00:00.0"
      sriovnetwork.k8snetworkplumbingwg.io/pciAddress:
        string: "0000:08:00.4"
      sriovnetwork.k8snetworkplumbingwg.io/pfDeviceID:
        string: 101d
      sriovnetwork.k8snetworkplumbingwg.io/vendor:
        string: 15b3
      sriovnetwork.k8snetworkplumbingwg.io/vfID:
        int: 2
      k8s.cni.cncf.io/resourceName:
        string: nvidia.com/sriov_resource
      k8s.cni.cncf.io/deviceId:
        string: "0000:08:00.4"
    name: 0000-08-00-4
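In this example the device name `0000-08-00-4` looks like the VF's PCI address `0000:08:00.4` with the `:` and `.` separators replaced by `-`. The sketch below reproduces that mapping, which can be handy when correlating `lspci` output with ResourceSlice entries. The rule is inferred from this single example rather than from documented driver behavior, and the helper name `pci_to_device_name` is hypothetical.

```shell
# Hypothetical helper: convert a PCI address to the device name format seen in
# the example ResourceSlice by replacing ':' and '.' with '-'.
pci_to_device_name() {
  printf '%s\n' "$1" | sed 's/[:.]/-/g'
}
```

For instance, `pci_to_device_name 0000:08:00.4` prints `0000-08-00-4`.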

Step 4: Create SR-IOV Network

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-rdma-network
  namespace: nvidia-network-operator
spec:
  ipam: |
    {
      "type": "nv-ipam",
      "poolName": "sriov-pool"
    }
  networkNamespace: default
  resourceName: sriov_resource

kubectl apply -f sriovnetwork.yaml

Step 5: Create ResourceClaimTemplate

apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: sriov-vf
spec:
  spec:
    devices:
      requests:
      - name: vf
        exactly:
          deviceClassName: sriovnetwork.k8snetworkplumbingwg.io
          count: 1
          selectors:
          - cel:
              expression: >
                device.attributes["k8s.cni.cncf.io"].resourceName == "nvidia.com/sriov_resource"

kubectl apply -f resourceclaimtemplate.yaml
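The CEL selector can match on any attribute published in the ResourceSlice, not only the resource name. As an illustration, the sketch below generates a variant of the template that selects VFs by the interface name of their parent PF, using the `sriovnetwork.k8snetworkplumbingwg.io/PFName` attribute from the ResourceSlice example. The helper name `vf_template_for_pf` and the generated template name are hypothetical, and the PF name must be a valid part of a Kubernetes object name.

```shell
# Hypothetical helper: print a ResourceClaimTemplate that only matches VFs
# whose parent PF has the given interface name ($1, e.g. eth2).
vf_template_for_pf() {
  pf_name="$1"
  cat <<EOF
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: sriov-vf-${pf_name}
spec:
  spec:
    devices:
      requests:
      - name: vf
        exactly:
          deviceClassName: sriovnetwork.k8snetworkplumbingwg.io
          count: 1
          selectors:
          - cel:
              expression: >
                device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].PFName == "${pf_name}"
EOF
}
```

The output can be piped straight to the cluster, for example `vf_template_for_pf eth2 | kubectl apply -f -`.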

Step 6: Deploy test workload

---
apiVersion: v1
kind: Pod
metadata:
  name: sriov-rdma-server
  namespace: default
  labels:
    app: sriov-rdma
    role: server
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-network
spec:
  tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
    effect: "NoSchedule"
  restartPolicy: Never
  containers:
  - name: rdma-test
    image: nvcr.io/nvidia/doca/doca:3.1.0-full-rt-host
    command: ["/bin/bash", "-c", "sleep infinity"]
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
      privileged: true
    resources:
      claims:
      - name: vf
  resourceClaims:
  - name: vf
    resourceClaimTemplateName: sriov-vf
---
apiVersion: v1
kind: Pod
metadata:
  name: sriov-rdma-client
  namespace: default
  labels:
    app: sriov-rdma
    role: client
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-network
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: role
            operator: In
            values:
            - server
        topologyKey: kubernetes.io/hostname
  restartPolicy: Never
  containers:
  - name: rdma-test
    image: nvcr.io/nvidia/doca/doca:3.1.0-full-rt-host
    command: ["/bin/bash", "-c", "sleep infinity"]
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
      privileged: true
    resources:
      claims:
      - name: vf
  resourceClaims:
  - name: vf
    resourceClaimTemplateName: sriov-vf

kubectl apply -f pod.yaml
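Once the pods are running, a quick check is to confirm that a ResourceClaim was generated for each pod and that the VF appears inside the container. The sketch below assumes the secondary interface is named `net1` (the Multus default for the first additional network) and that the `ip` tool is present in the container image; the helper name `check_vf_workload` is hypothetical.

```shell
# Hypothetical helper: show the ResourceClaims in the default namespace and
# the secondary (VF-backed) interface inside the given pod ($1).
check_vf_workload() {
  pod="$1"
  kubectl get resourceclaims -n default &&
  kubectl exec "$pod" -n default -- ip -brief addr show net1
}
```

Running `check_vf_workload sriov-rdma-server` and `check_vf_workload sriov-rdma-client` should show one allocated claim per pod and an address from the `sriov-pool` subnet on `net1`.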

DRA enables end users to select resources from different DRA drivers with matching attributes to achieve maximum performance. By using constraints with matchAttribute, the Kubernetes scheduler ensures that allocated devices share a common topology, such as the same PCIe root complex.

The following example shows a ResourceClaimTemplate that requests both an SR-IOV VF and a GPU from the NVIDIA DRA Driver for GPUs, constrained to share the same PCIe root:

apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: resource-alignment
spec:
  spec:
    devices:
      requests:
      - name: vf
        exactly:
          deviceClassName: sriovnetwork.k8snetworkplumbingwg.io
          selectors:
          - cel:
              expression: >
                device.attributes["k8s.cni.cncf.io"].resourceName == "nvidia.com/sriov_resource"
      - name: gpu
        exactly:
          deviceClassName: gpu.nvidia.com
          count: 1
      constraints:
      - matchAttribute: "resource.kubernetes.io/pcieRoot"
        requests: [vf, gpu]
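A `matchAttribute` constraint can only be satisfied when at least one VF and one GPU on the same node publish the same `resource.kubernetes.io/pcieRoot` value. Before deploying an aligned workload, it may help to list that attribute for every advertised device. This is a sketch: the helper name `list_pcie_roots` is hypothetical, it relies on nested `range` support in kubectl's JSONPath output, and devices that do not publish the attribute will show an empty second column.

```shell
# Hypothetical helper: print "<device name> <pcieRoot>" for every device in
# every ResourceSlice, across all DRA drivers in the cluster.
list_pcie_roots() {
  kubectl get resourceslices \
    -o jsonpath='{range .items[*]}{range .spec.devices[*]}{.name}{"\t"}{.attributes.resource\.kubernetes\.io/pcieRoot.string}{"\n"}{end}{end}'
}
```

If no GPU shares a PCIe root with any VF on a node, pods that use the `resource-alignment` template would be expected to stay pending on that node.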

© Copyright 2025-2026, NVIDIA. Last updated on Jun 14, 2026