Install NVIDIA Network Operator (Optional)

NVIDIA AI Enterprise 2.0 or later

Next, install the NVIDIA Network Operator. This step applies only if your worker nodes have NVIDIA Networking. The Network Operator installs the host networking components required to enable RDMA and GPUDirect in a Kubernetes cluster. It does so by configuring a high-speed data path for I/O-intensive workloads on a secondary network in each cluster node.

  1. Select Operators > OperatorHub, and search for the NVIDIA Network Operator.

  2. Select the NVIDIA Network Operator, then click Install on the first screen and again on the next one.

    Note

    For additional information, see the Red Hat OpenShift Container Platform Documentation.


The NVIDIA Network Operator can also be installed using the CLI. The steps are provided for informational purposes.

  1. Create a namespace for the Network Operator.

    Create the following Namespace custom resource (CR) that defines the network-operator namespace, and then save the YAML in the network-operator-namespace.yaml file:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: network-operator


    Create the namespace by running the following command:

    $ oc create -f network-operator-namespace.yaml


  2. Install the Network Operator in the namespace you created in the previous step by creating the following objects.

    Run the following command to get the channel value required for the next step:

    $ oc get packagemanifest network-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'


  3. Example Output:

    stable


  4. Create the following Subscription CR, and save the YAML in the network-operator-sub.yaml file:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: network-operator
      namespace: network-operator
    spec:
      channel: "stable"
      installPlanApproval: Manual
      name: network-operator
      sourceNamespace: openshift-marketplace


  5. Create the subscription object by running the following command:

    $ oc create -f network-operator-sub.yaml


  6. Change to the network-operator project:

    $ oc project network-operator


    To verify that the operator deployment is successful, run:

    $ oc get pods


    Example Output:

    NAME                                                         READY   STATUS    RESTARTS   AGE
    nvidia-network-operator-controller-manager-8f8ccf45c-zgfsq   2/2     Running   0          1


  7. A successful deployment shows a Running status.
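
The CLI steps above can be collected into one script. The following is a minimal sketch, not an official installer: it writes the two manifests exactly as shown in the steps and calls `oc` only when asked to. The `CHANNEL` and `APPLY` environment variables are hypothetical names introduced for this example. `CHANNEL` defaults to `stable`, the example output of the `packagemanifest` query, and should be set to whatever that query returns on your cluster.

```shell
#!/bin/sh
# Sketch: generate the Namespace and Subscription manifests from the steps
# above, then optionally apply them with oc.
set -eu

# Hypothetical override; "stable" is only the example channel shown above.
CHANNEL="${CHANNEL:-stable}"

# Step 1: Namespace manifest.
cat > network-operator-namespace.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: network-operator
EOF

# Step 4: Subscription manifest, with the channel substituted in.
cat > network-operator-sub.yaml <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: network-operator
  namespace: network-operator
spec:
  channel: "${CHANNEL}"
  installPlanApproval: Manual
  name: network-operator
  sourceNamespace: openshift-marketplace
EOF

# Apply only when explicitly requested (APPLY=1) and oc is on the PATH.
if [ "${APPLY:-0}" = "1" ] && command -v oc >/dev/null 2>&1; then
  oc create -f network-operator-namespace.yaml
  oc create -f network-operator-sub.yaml
  oc get pods -n network-operator
else
  echo "Manifests written; oc was not run."
fi
```

Running the script with no variables set only writes the files, which makes it easy to review them first. Running it with `APPLY=1` on a machine that is logged in to the cluster performs the same `oc create` calls as the manual steps. Because the Subscription uses `installPlanApproval: Manual`, the operator pod may not appear until the resulting install plan has been approved.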

© Copyright 2022-2023, NVIDIA. Last updated on Jan 9, 2023.