Install NVIDIA Network Operator (Optional)#
Added in version 2.0.
Next, we will install the NVIDIA Network Operator. This step applies only if your worker nodes have NVIDIA Networking. The Network Operator installs the host networking components required to enable RDMA and GPUDirect in a Kubernetes cluster. It does so by configuring a high-speed data path for I/O-intensive workloads on a secondary network in each cluster node.
Select Operators > OperatorHub, and search for the NVIDIA Network Operator.
Select the NVIDIA Network Operator, click Install on the first screen, and then click Install again on the next screen.
Note
For additional information, see the Red Hat OpenShift Container Platform Documentation.
Install NVIDIA Network Operator via CLI#
The NVIDIA Network Operator can also be installed using the CLI. The following steps are provided for informational purposes.
Create a namespace for the Network Operator.
Create the following Namespace custom resource (CR) that defines the network-operator namespace, and then save the YAML in the network-operator-namespace.yaml file:
apiVersion: v1
kind: Namespace
metadata:
  name: network-operator
Create the namespace by running the following command:
$ oc create -f network-operator-namespace.yaml
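The two sub-steps above can also be scripted. The following is a minimal sketch, assuming a POSIX shell and an `oc` session that is already logged in to the cluster. The `render_namespace` helper is a hypothetical name introduced here for illustration; it is not part of any NVIDIA or Red Hat tooling.

```shell
#!/bin/sh
# Hypothetical helper: print a Namespace manifest for the name given as $1.
render_namespace() {
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: $1
EOF
}

# Usage (requires a logged-in oc session):
#   render_namespace network-operator > network-operator-namespace.yaml
#   oc create -f network-operator-namespace.yaml
```

Generating the file from a function keeps the namespace name in one place, which may help if the same name is reused in later steps.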
Install the Network Operator in the namespace you created in the previous step by creating the following objects.
Run the following command to get the channel value required for the next step:
$ oc get packagemanifest network-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
Example Output:
stable
Create the following Subscription CR, and save the YAML in the network-operator-sub.yaml file:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: network-operator
  namespace: network-operator
spec:
  channel: "stable"
  installPlanApproval: Manual
  name: network-operator
  sourceNamespace: openshift-marketplace
Create the subscription object by running the following command:
$ oc create -f network-operator-sub.yaml
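The channel lookup and the Subscription file can be tied together so that the channel value is not copied by hand. The following is a minimal sketch, assuming a POSIX shell and a logged-in `oc` session. The `render_subscription` helper and the `CHANNEL` variable are hypothetical names introduced here; the manifest fields are the same ones shown in the Subscription CR above.

```shell
#!/bin/sh
# Hypothetical helper: print the Subscription manifest for the channel in $1.
render_subscription() {
  channel="$1"
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: network-operator
  namespace: network-operator
spec:
  channel: "${channel}"
  installPlanApproval: Manual
  name: network-operator
  sourceNamespace: openshift-marketplace
EOF
}

# Usage (requires a logged-in oc session):
#   CHANNEL="$(oc get packagemanifest network-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}')"
#   render_subscription "$CHANNEL" > network-operator-sub.yaml
#   oc create -f network-operator-sub.yaml
```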
Change to the network-operator project:
$ oc project network-operator
To verify that the operator deployment is successful, run:
$ oc get pods
Example Output:
NAME                                                          READY   STATUS    RESTARTS   AGE
nvidia-network-operator-controller-manager-8f8ccf45c-zgfsq   2/2     Running   0          1
A successful deployment shows a Running status.
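The status check can also be automated instead of read by eye. The following is a minimal sketch, assuming a POSIX shell with `awk` and the default column layout of `oc get pods` (STATUS in the third column). The `all_pods_running` helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get pods` output on stdin. Succeeds only if
# at least one pod is listed and every listed pod has STATUS "Running".
all_pods_running() {
  awk 'NR > 1 && NF { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Usage (requires a logged-in oc session):
#   oc get pods -n network-operator | all_pods_running && echo "operator is running"
```

The helper treats an empty pod list as a failure, so it does not report success before the operator pod has been scheduled.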