Installing NVIDIA NIM Operator on VMware vSphere with Tanzu

Prerequisites

  • A TKG cluster and the cluster-admin role. Refer to Platform Support for information about supported operating systems and Kubernetes platforms.

  • A persistent volume provisioner that supports network access, such as vSAN.

  • Kubernetes CLI tools for VMware vSphere. Refer to Download and Install the Kubernetes CLI Tools for vSphere in the VMware vSphere documentation for more information.

  • NVIDIA A100 80 GB, H100, or L40S GPUs on one or more nodes. Refer to Platform Support for information about the supported models and the GPU model and GPU count that each one requires. A model that exceeds the memory capacity of one GPU requires additional GPUs. When you deploy a pipeline, you can specify more than one GPU for a workload.

  • An NGC CLI API key. Pods use the API key as an image pull secret to download container images and models from NVIDIA NGC. Refer to Generating Your NGC API Key in the NVIDIA NGC User Guide for more information.

  • An active subscription to an NVIDIA AI Enterprise product or membership in the NVIDIA Developer Program. Access to the containers and models for NVIDIA NIM microservices is restricted.
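
Before you continue, it can help to confirm that the command-line tools used on this page are available. The following is a minimal sketch, not part of the official procedure. The function name check_tools is hypothetical, and the sketch assumes a POSIX shell.

```shell
# Hypothetical helper: report each named CLI tool that is missing from PATH.
# Returns 0 when every tool is found and 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, run check_tools kubectl helm to check for the two tools that the later steps call. After you log in to the cluster, kubectl auth can-i '*' '*' --all-namespaces prints yes if your account can perform any action cluster-wide, which is one way to check for the cluster-admin role.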

Installing GPU Operator

Use the NVIDIA GPU Operator to install, configure, and manage the NVIDIA GPU driver and NVIDIA container runtime on the Kubernetes nodes.

  1. Create the Operator namespace and label it with the privileged Pod Security Standard so that the admission controller does not block the privileged pods that the Operator creates:

    $ kubectl create namespace gpu-operator
    $ kubectl label --overwrite ns gpu-operator pod-security.kubernetes.io/warn=privileged pod-security.kubernetes.io/enforce=privileged
    
  2. Add the Helm repository for NVIDIA:

    $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
       && helm repo update
    
  3. Install the Operator:

    $ helm install --wait --generate-name \
       -n gpu-operator \
       nvidia/gpu-operator
    

For more information or to adjust the configuration, refer to Installing the NVIDIA GPU Operator in the NVIDIA GPU Operator documentation.
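
After the installation completes, you might want to confirm that the Operator pods started and that the nodes advertise GPUs. The following is a minimal sketch under stated assumptions: the function name verify_gpu_operator is hypothetical, and the sketch assumes that the device plugin deployed by the Operator exposes GPUs under the nvidia.com/gpu resource name.

```shell
# Hypothetical helper: list the GPU Operator pods, then show how many GPUs
# each node reports as allocatable. The namespace defaults to gpu-operator.
verify_gpu_operator() {
    ns="${1:-gpu-operator}"
    kubectl get pods -n "$ns" || return 1
    # The dot in nvidia.com is escaped so that kubectl treats the whole
    # resource name as a single key.
    kubectl get nodes \
        -o 'custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

Run verify_gpu_operator with no arguments to use the namespace created in step 1. A node that shows <none> in the GPUS column does not yet advertise any GPUs.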

Installing NIM Operator

  1. Create the Operator namespace:

    $ kubectl create namespace nim-operator
    
  2. Add a Docker registry secret that the Operator uses for pulling containers and models from NGC:

    $ kubectl create secret -n nim-operator docker-registry ngc-secret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password=<ngc-api-key>
    
  3. Install the Operator:

    $ helm install nim-operator nvidia/k8s-nim-operator -n nim-operator
    
  4. Optional: Confirm the controller pod is running:

    $ kubectl get pods -n nim-operator
    

    Example Output

    NAME                                            READY   STATUS    RESTARTS      AGE
    nim-operator-k8s-nim-operator-6b546f57d5-g4zgg  2/2     Running   0             35h
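
The secret in step 2 can also be created from an environment variable instead of pasting the key into the command. The following is a minimal sketch, not the documented procedure: the function name create_ngc_secret and the variable name NGC_API_KEY are assumptions chosen for illustration.

```shell
# Hypothetical helper: create the ngc-secret image pull secret from the
# NGC_API_KEY environment variable. The namespace defaults to nim-operator.
create_ngc_secret() {
    ns="${1:-nim-operator}"
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set" >&2
        return 1
    fi
    # $oauthtoken is a literal user name, so it stays in single quotes
    # to keep the shell from expanding it.
    kubectl create secret -n "$ns" docker-registry ngc-secret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password="$NGC_API_KEY"
}
```

Set NGC_API_KEY in your shell session, then run create_ngc_secret. The function stops before it calls kubectl if the variable is empty, which avoids creating a secret with a blank password.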
    

Next Steps

  • Refer to Caching Models to download and cache inference and embedding models.