Getting Started with Red Hat OpenShift
Currently, NVIDIA Network Operator does not support Single Node OpenShift (SNO) deployments.
It is recommended to have dedicated control plane nodes for OpenShift deployments with NVIDIA Network Operator.
Node Feature Discovery
To enable Node Feature Discovery, please follow the official Guide.
An example of Node Feature Discovery configuration:
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  operand:
    namespace: openshift-nfd
    image: registry.redhat.io/openshift4/ose-node-feature-discovery:v4.10
    imagePullPolicy: Always
  workerConfig:
    configData: |
      sources:
        pci:
          deviceClassWhitelist:
            - "02"
            - "03"
            - "0200"
            - "0207"
          deviceLabelFields:
            - vendor
  customConfig:
    configData: ""
Verify that the following label is present on the nodes containing NVIDIA networking hardware:
feature.node.kubernetes.io/pci-15b3.present=true
oc describe node | grep -E 'Roles|pci' | grep -v "control-plane"
Roles: worker
cpu-feature.node.kubevirt.io/invpcid=true
cpu-feature.node.kubevirt.io/pcid=true
feature.node.kubernetes.io/pci-102b.present=true
feature.node.kubernetes.io/pci-10de.present=true
feature.node.kubernetes.io/pci-10de.sriov.capable=true
feature.node.kubernetes.io/pci-14e4.present=true
feature.node.kubernetes.io/pci-15b3.present=true
feature.node.kubernetes.io/pci-15b3.sriov.capable=true
Roles: worker
cpu-feature.node.kubevirt.io/invpcid=true
cpu-feature.node.kubevirt.io/pcid=true
feature.node.kubernetes.io/pci-102b.present=true
feature.node.kubernetes.io/pci-10de.present=true
feature.node.kubernetes.io/pci-10de.sriov.capable=true
feature.node.kubernetes.io/pci-14e4.present=true
feature.node.kubernetes.io/pci-15b3.present=true
feature.node.kubernetes.io/pci-15b3.sriov.capable=true
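As an alternative to filtering the `oc describe` output, the label can be used directly as a node selector. The following is a minimal sketch, assuming the `oc` client is logged in to the cluster; the function names `list_nic_nodes` and `require_nic_nodes` are hypothetical helpers and not part of any NVIDIA or Red Hat tooling.

```shell
# Label NFD applies to nodes that have a PCI device from vendor 15b3.
NIC_LABEL='feature.node.kubernetes.io/pci-15b3.present=true'

# Print the nodes that carry the label, one "node/<name>" per line.
# Requires the oc client and access to the cluster.
list_nic_nodes() {
  oc get nodes -l "$NIC_LABEL" -o name
}

# Succeed only if at least one node carries the label.
require_nic_nodes() {
  nodes=$(list_nic_nodes) || return 1
  [ -n "$nodes" ]
}
```

Running `require_nic_nodes || echo "no labeled nodes found"` before installing the operator gives an early signal that NFD has not labeled any node yet.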
SR-IOV Network Operator
If you plan to use SR-IOV, follow this Guide to install the SR-IOV Network Operator on OpenShift Container Platform.
The SR-IOV resources created will have the openshift.io prefix.
For the default SriovOperatorConfig CR to work with the MLNX_OFED container, please run this command to update the following values:
oc patch sriovoperatorconfig default \
  --type=merge -n openshift-sriov-network-operator \
  --patch '{ "spec": { "configDaemonNodeSelector": { "network.nvidia.com/operator.mofed.wait": "false", "node-role.kubernetes.io/worker": "", "feature.node.kubernetes.io/pci-15b3.sriov.capable": "true" } } }'
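To confirm that the patch was applied, the selector can be read back from the CR. This is a small sketch, assuming the `oc` client is logged in to the cluster; the function name `show_config_daemon_selector` is hypothetical.

```shell
# Print the node selector currently set on the default SriovOperatorConfig.
# Requires the oc client and access to the cluster.
show_config_daemon_selector() {
  oc get sriovoperatorconfig default \
    -n openshift-sriov-network-operator \
    -o jsonpath='{.spec.configDaemonNodeSelector}'
}
```

The printed selector should contain the three keys set by the patch command.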
SR-IOV Network Operator configuration documentation can be found on the Official Website.
GPU Operator
If you plan to use GPUDirect, follow this Guide to install the GPU Operator on OpenShift Container Platform.
Make sure to enable RDMA and disable useHostMofed in the driver section in the spec of the ClusterPolicy CR.
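One possible way to apply these two settings from the command line is a merge patch on the ClusterPolicy. The sketch below assumes that the fields are `spec.driver.rdma.enabled` and `spec.driver.rdma.useHostMofed`, and that the ClusterPolicy is named `gpu-cluster-policy`; both are assumptions that should be checked against your GPU Operator version. The function name `enable_gpu_rdma` is hypothetical.

```shell
# Merge-patch the driver section of a ClusterPolicy so that RDMA is enabled
# and the host MOFED installation is not used.
# $1: ClusterPolicy name (assumption: defaults to gpu-cluster-policy).
enable_gpu_rdma() {
  oc patch clusterpolicies.nvidia.com "${1:-gpu-cluster-policy}" --type=merge \
    --patch '{"spec":{"driver":{"rdma":{"enabled":true,"useHostMofed":false}}}}'
}
```

The same values can also be set by editing the ClusterPolicy in the web console.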
Network Operator Installation
Network Operator Installation Using OpenShift Catalog
In the OpenShift Container Platform web console side menu, select Operators > OperatorHub, and search for the NVIDIA Network Operator.
Select NVIDIA Network Operator, and click Install in the first screen and in the subsequent one.
For additional information, see the Red Hat OpenShift Container Platform Documentation.
Network Operator Installation using OpenShift OC CLI
Create a namespace for the Network Operator.
Create the following Namespace custom resource (CR) that defines the nvidia-network-operator namespace, and then save the YAML in the network-operator-namespace.yaml file:
apiVersion: v1
kind: Namespace
metadata:
  name: nvidia-network-operator
Create the namespace by running the following command:
oc create -f network-operator-namespace.yaml
Install the Network Operator in the namespace created in the previous step by creating the objects below. First, run the following command to get the channel value required for the Subscription:
oc get packagemanifest nvidia-network-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
Example Output
stable
Create the following Subscription CR, and save the YAML in the network-operator-sub.yaml file:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nvidia-network-operator
  namespace: nvidia-network-operator
spec:
  channel: "stable"
  installPlanApproval: Manual
  name: nvidia-network-operator
  source: certified-operators
  sourceNamespace: openshift-marketplace
Create the subscription object by running the following command:
oc create -f network-operator-sub.yaml
Change to the network-operator project:
oc project nvidia-network-operator
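Because the Subscription above sets installPlanApproval to Manual, the Operator Lifecycle Manager is expected to hold the generated InstallPlan until it is approved, so the operator pods may not appear before that. A minimal sketch of approving it from the CLI follows, assuming the `oc` client is logged in to the cluster; the function name `approve_install_plans` is hypothetical, and it approves every InstallPlan found in the namespace.

```shell
NS=nvidia-network-operator

# Approve every InstallPlan in the operator namespace.
# Requires the oc client and access to the cluster.
approve_install_plans() {
  for plan in $(oc get installplan -n "$NS" -o name); do
    oc patch "$plan" -n "$NS" --type=merge \
      --patch '{"spec":{"approved":true}}'
  done
}
```

If the namespace may contain InstallPlans that should not be approved, review the list from `oc get installplan -n nvidia-network-operator` first and patch only the intended one.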
Verification
To verify that the operator deployment is successful, run:
oc get pods -n nvidia-network-operator
Example Output:
NAME                                                         READY   STATUS    RESTARTS   AGE
nvidia-network-operator-controller-manager-8f8ccf45c-zgfsq   2/2     Running   0          1m
A successful deployment shows a Running status.
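The status check can be scripted. The sketch below reads the pod list on standard input and succeeds only when at least one pod is listed and every pod is in the Running state; the function name `all_running` is hypothetical, and the input is assumed to have the default `oc get pods` column order without the header row.

```shell
# Reads "oc get pods --no-headers" output on stdin.
# Succeeds only if there is at least one pod and the STATUS column
# (third field) is "Running" for every pod.
all_running() {
  awk 'NF { n++; if ($3 != "Running") bad = 1 } END { exit (n == 0 || bad) }'
}

# Usage against a live cluster (requires oc and cluster access):
#   oc get pods -n nvidia-network-operator --no-headers | all_running
```

Note that this treats any other status, including Completed, as a failure, which may be stricter than needed for namespaces that also run jobs.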
Using Network Operator to Create NicClusterPolicy in an OpenShift Container Platform
See Deployment Examples for OCP:
Deployment Examples For OpenShift Container Platform
In OCP, some components, such as Multus and Whereabouts, are deployed by default, whereas others, such as NFD and the SR-IOV Network Operator, must be deployed manually, as described in the Installation section.
In addition, because the Helm chart is not used, configuration is done through the NicClusterPolicy custom resource.
Following are examples of NicClusterPolicy configuration for OCP.
Network Operator Deployment with a Host Device Network - OCP
Network Operator deployment with the SR-IOV device plugin and a single SR-IOV resource pool.
There is no need to deploy a secondary network configuration, as it is installed by default in OCP.
apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  ofedDriver:
    image: doca-driver
    repository: nvcr.io/nvidia/mellanox
    version: 24.01-0.3.3.1-10
    startupProbe:
      initialDelaySeconds: 10
      periodSeconds: 20
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 30
  sriovDevicePlugin:
    image: sriov-network-device-plugin
    repository: ghcr.io/k8snetworkplumbingwg
    version: v3.5.1
    config: |
      {
        "resourceList": [
          {
            "resourcePrefix": "nvidia.com",
            "resourceName": "hostdev",
            "selectors": {
              "vendors": ["15b3"],
              "isRdma": true
            }
          }
        ]
      }
Following the deployment, the Network Operator should be configured, and K8s networking deployed to use it in pod configuration. The host-device-net.yaml configuration file for such a deployment:
apiVersion: mellanox.com/v1alpha1
kind: HostDeviceNetwork
metadata:
  name: hostdev-net
spec:
  networkNamespace: "default"
  resourceName: "nvidia.com/hostdev"
  ipam: |
    {
      "type": "whereabouts",
      "datastore": "kubernetes",
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
      },
      "range": "192.168.3.225/28",
      "exclude": [
        "192.168.3.229/30",
        "192.168.3.236/32"
      ],
      "log_file": "/var/log/whereabouts.log",
      "log_level": "info"
    }
The pod.yaml configuration file for such a deployment:
apiVersion: v1
kind: Pod
metadata:
  name: hostdev-test-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: hostdev-net
spec:
  restartPolicy: OnFailure
  containers:
  - image: <rdma image>
    name: mofed-test-ctr
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
    resources:
      requests:
        nvidia.com/hostdev: 1
      limits:
        nvidia.com/hostdev: 1
    command:
    - sh
    - -c
    - sleep inf
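Once the pod is running, one way to check that the secondary network was attached is to read the network status annotation that Multus writes on the pod. This is a sketch, assuming the annotation key is `k8s.v1.cni.cncf.io/network-status` and that the `oc` client is logged in to the cluster; the function name `pod_network_status` is hypothetical.

```shell
# Print the Multus network status annotation of a pod.
# $1: pod name, $2: namespace the pod was created in.
# Dots in the annotation key are escaped so jsonpath treats it as one key.
pod_network_status() {
  oc get pod "$1" -n "$2" \
    -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'
}

# Usage (requires oc and cluster access):
#   pod_network_status hostdev-test-pod default
```

The annotation is expected to list the default cluster network together with an entry for hostdev-net.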
Network Operator Deployment with SR-IOV Legacy Mode - OCP
This deployment mode supports SR-IOV in legacy mode.
Note that the SR-IOV Network Operator is required as described in the Deployment for OCP section.
apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  ofedDriver:
    image: doca-driver
    repository: nvcr.io/nvidia/mellanox
    version: 24.01-0.3.3.1-10
    startupProbe:
      initialDelaySeconds: 10
      periodSeconds: 20
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 30
An SriovNetworkNodePolicy and K8s networking should then be deployed.
The sriovnetwork-node-policy.yaml configuration file for such a deployment:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-1
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  mtu: 1500
  nicSelector:
    vendor: "15b3"
    pfNames: ["ens2f0"]
  nodeSelector:
    feature.node.kubernetes.io/pci-15b3.present: "true"
  numVfs: 8
  priority: 90
  isRdma: true
  resourceName: sriovlegacy
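After the policy is applied, the SR-IOV Network Operator reconfigures the selected nodes, which can take some time. The sketch below polls nothing by itself; it only shows one way to read the per-node sync status and decide whether all nodes are done. It assumes the status is exposed as `status.syncStatus` on SriovNetworkNodeState objects with the value `Succeeded` on completion; the function names `node_sync_statuses` and `all_synced` are hypothetical.

```shell
# Print the sync status of every SriovNetworkNodeState, one per line.
# Requires the oc client and access to the cluster.
node_sync_statuses() {
  oc get sriovnetworknodestates -n openshift-sriov-network-operator \
    -o jsonpath='{range .items[*]}{.status.syncStatus}{"\n"}{end}'
}

# Reads one sync status per line on stdin and succeeds only if there is
# at least one line and every line is "Succeeded".
all_synced() {
  awk 'NF { n++; if ($1 != "Succeeded") bad = 1 } END { exit (n == 0 || bad) }'
}

# Usage: node_sync_statuses | all_synced && echo "SR-IOV configuration applied"
```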
The sriovnetwork.yaml configuration file for such a deployment:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: "sriov-network"
  namespace: openshift-sriov-network-operator
spec:
  vlan: 0
  networkNamespace: "default"
  resourceName: "sriovlegacy"
  ipam: |-
    {
      "datastore": "kubernetes",
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
      },
      "log_file": "/tmp/whereabouts.log",
      "log_level": "debug",
      "type": "whereabouts",
      "range": "192.168.101.0/24"
    }
Note that the resource prefix in this case will be openshift.io.
The pod.yaml configuration file for such a deployment:
apiVersion: v1
kind: Pod
metadata:
  name: testpod1
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-network
spec:
  containers:
  - name: appcntr1
    image: <image>
    imagePullPolicy: IfNotPresent
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
    command:
    - sh
    - -c
    - sleep inf
    resources:
      requests:
        openshift.io/sriovlegacy: '1'
      limits:
        openshift.io/sriovlegacy: '1'
Network Operator Deployment with the RDMA Shared Device Plugin - OCP
The following is an example of RDMA Shared with MacVlanNetwork:
apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  ofedDriver:
    image: doca-driver
    repository: nvcr.io/nvidia/mellanox
    version: 24.01-0.3.3.1-10
    startupProbe:
      initialDelaySeconds: 10
      periodSeconds: 20
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
    readinessProbe:
      initialDelaySeconds: 10
      periodSeconds: 30
  rdmaSharedDevicePlugin:
    config: |
      {
        "configList": [
          {
            "resourceName": "rdmashared",
            "rdmaHcaMax": 1000,
            "selectors": {
              "ifNames": ["enp4s0f0np0"]
            }
          }
        ]
      }
    image: k8s-rdma-shared-dev-plugin
    repository: nvcr.io/nvidia/cloud-native
    version: v1.3.2
The macvlan-net-ocp.yaml configuration file for such a deployment in an OpenShift Platform:
apiVersion: mellanox.com/v1alpha1
kind: MacvlanNetwork
metadata:
  name: rdmashared-net
spec:
  networkNamespace: default
  master: enp4s0f0np0
  mode: bridge
  mtu: 1500
  ipam: '{"type": "whereabouts", "range": "16.0.2.0/24", "gateway": "16.0.2.1"}'
The pod.yaml configuration file for such a deployment:
apiVersion: v1
kind: Pod
metadata:
  name: test-rdma-shared-1
  annotations:
    k8s.v1.cni.cncf.io/networks: rdmashared-net
spec:
  containers:
  - image: myimage
    name: rdma-shared-1
    securityContext:
      capabilities:
        add:
        - IPC_LOCK
    resources:
      limits:
        rdma/rdmashared: 1
      requests:
        rdma/rdmashared: 1
  restartPolicy: OnFailure
Network Operator Deployment for DPDK Workloads - OCP
To configure huge pages in OpenShift, refer to this Guide.
For Network Operator configuration instructions, visit the Official Website.