Installing NVIDIA NIM Operator on Red Hat OpenShift

Prerequisites

- A Red Hat OpenShift Container Platform cluster and the cluster-admin role. Refer to Platform Support for information about supported operating systems and Kubernetes platforms.
- An installation of the Operator SDK, with the operator-sdk command in your path. Refer to Installation in the Operator SDK documentation for more information.
- The OpenShift CLI (oc). Refer to Installing the OpenShift CLI in the OpenShift documentation for more information.
- NVIDIA A100 80 GB, H100, or L40S GPUs on one or more nodes. Refer to Platform Support for information about supported models and the GPU model and GPU count that each one requires. A large model that exceeds the memory capacity of one GPU needs additional GPUs. When you deploy a pipeline, you can specify more than one GPU for a workload.
- An NGC CLI API key. Pods use the API key as an image pull secret to download container images and models from NVIDIA NGC. Refer to Generating Your NGC API Key in the NVIDIA NGC User Guide for more information.
- An active subscription to an NVIDIA AI Enterprise product or membership in the NVIDIA Developer Program. Access to the containers and models for NVIDIA NIM microservices is restricted.
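
Before you continue, you can check the command-line prerequisites from a shell. The following is a minimal sketch, not part of the official procedure: missing_tools is a hypothetical helper name, and it only checks that the named commands can be found in your path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each required tool that is not found in PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example use (commented out so that sourcing this file has no side effects):
#   missing_tools oc operator-sdk || echo "Install the tools listed above."
#   oc auth can-i '*' '*' --all-namespaces   # prints "yes" for a cluster-admin user
```

The oc auth can-i command in the comment asks the cluster whether the current user can perform any action on any resource, which is a quick way to confirm cluster-admin access after you log in.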

Installing GPU Operator

Use the NVIDIA GPU Operator to install, configure, and manage the NVIDIA GPU driver and NVIDIA container runtime on the Kubernetes nodes.

Install the Node Feature Discovery Operator.
- Refer to Node Feature Discovery Operator in the OpenShift Container Platform documentation for installation information.
- Refer to Installing the Node Feature Discovery Operator on OpenShift for information about creating a node feature discovery instance.

Install the GPU Operator.
- Refer to Installing the NVIDIA GPU Operator on OpenShift for information about installing the Operator and creating a cluster policy instance.
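
After both Operators are installed, one way to confirm that the GPUs are visible to the cluster is to list the allocatable nvidia.com/gpu resource on each node. This is a sketch under the assumption that the GPU Operator advertises GPUs under that resource name; list_gpu_nodes is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: show every node with its allocatable GPU count.
# Nodes that do not advertise nvidia.com/gpu show <none> in the GPUS column.
list_gpu_nodes() {
    # The backslash escapes the dot inside the resource name nvidia.com/gpu.
    oc get nodes \
        "-o=custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu"
}

# Example use (commented out so that sourcing this file has no side effects):
#   list_gpu_nodes
```

If a GPU node shows no count, check the cluster policy status and the GPU Operator pods before you install the NIM Operator.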

Installing NIM Operator

Create the Operator namespace:

$ oc create namespace nvidia-nim-operator

Add a Docker registry secret that the Operator uses for pulling containers and models from NGC:

$ oc create secret -n nvidia-nim-operator docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password=<ngc-api-key>
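
If you prefer not to type the API key on the command line, you can read it from an environment variable instead. The following sketch runs the same oc create secret command as above. The variable name NGC_API_KEY and the function name create_ngc_secret are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the ngc-secret pull secret from $NGC_API_KEY.
# The first argument is the namespace (default: nvidia-nim-operator).
create_ngc_secret() {
    local namespace="${1:-nvidia-nim-operator}"
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set" >&2
        return 1
    fi
    # The user name is the literal string $oauthtoken, so it stays in single quotes.
    oc create secret -n "$namespace" docker-registry ngc-secret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password="$NGC_API_KEY"
}

# Example use (commented out so that sourcing this file has no side effects):
#   export NGC_API_KEY=<ngc-api-key>
#   create_ngc_secret
```

The function stops early when the variable is empty, which avoids creating a secret with a blank password.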

Install the Operator:

$ operator-sdk run bundle ghcr.io/nvidia/k8s-nim-operator:bundle-latest-main --namespace nvidia-nim-operator

Optional: Confirm the controller pod is running:

$ oc get pods -n nvidia-nim-operator

Example Output

NAME                                                              READY   STATUS      RESTARTS   AGE
ec60a4439c710b89fc2582f5384382b4241f9aee62bb3182b8d128e69d4jqfm   0/1     Completed   0          74m
ghcr-io-nvidia-k8s-nim-operator-bundle-latest-main                1/1     Running     0          75m
k8s-nim-operator-77bf775c88-bscjg                                 2/2     Running     0          74m
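
In the example output, the bundle unpack pod is Completed and the other pods are Running. To check this in a script rather than by eye, you can filter the oc get pods output. The following is a minimal sketch that assumes the default column layout shown above, with STATUS as the third column. The function name unhealthy_pods is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `oc get pods` output on stdin and print the name
# of every pod whose STATUS is neither Running nor Completed.
unhealthy_pods() {
    # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example use (commented out so that sourcing this file has no side effects):
#   oc get pods -n nvidia-nim-operator | unhealthy_pods
```

No output means that every pod in the namespace is either Running or Completed.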

Next Steps

Refer to Caching Models to download and cache inference and embedding models. The sample commands on that page use kubectl. On OpenShift, you can run them with either the oc or the kubectl command.
To uninstall the Operator, run operator-sdk cleanup -n nvidia-nim-operator nim-operator-certified.