Installing the NVIDIA Enterprise RAG LLM Operator

  • A Kubernetes cluster and the cluster-admin role. Refer to Platform Support for information about supported operating systems and Kubernetes platforms.

  • NVIDIA A100 80 GB, H100, or L40S GPUs on one or more nodes. Refer to Platform Support for information about supported models and the GPU model and GPU count that each one requires. Large models that exceed the memory capacity of one GPU require additional GPUs. When you deploy a Helm pipeline, you can specify more than one GPU for a workload.

  • An NGC CLI API key. Pods use the API key as an image pull secret to download container images that are available to early access customers only. Refer to Generating Your NGC API Key in the NVIDIA NGC User Guide for more information.
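Before continuing, it can help to confirm the first and last prerequisites from a shell. The following is a minimal sketch and not part of the official procedure: `has_cluster_admin` and `has_ngc_key` are hypothetical helper names, and `NGC_CLI_API_KEY` is an assumed environment variable that holds your key.

```shell
# Hypothetical pre-flight helpers. The function names and the
# NGC_CLI_API_KEY variable are illustrative assumptions.

# Succeeds when the current kubeconfig user may perform any verb on any
# resource in every namespace, which is what cluster-admin grants.
has_cluster_admin() {
  [ "$(kubectl auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]
}

# Succeeds when the assumed NGC_CLI_API_KEY variable is set and non-empty.
has_ngc_key() {
  [ -n "${NGC_CLI_API_KEY:-}" ]
}
```

For example, `has_cluster_admin && has_ngc_key && echo ready` prints `ready` only when both checks pass.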

Tanzu Kubernetes Grid Service enables the PodSecurityPolicy Admission Controller in Tanzu Kubernetes clusters. The admission controller enforces the pod security policy for pods created with a service account. Because the Operator uses a service account, you must label the Operator namespace so that the policy is not enforced on it.

On Tanzu Kubernetes clusters, enter the following commands before installing the Operator:

$ kubectl create namespace rag-operator
$ kubectl label --overwrite ns rag-operator pod-security.kubernetes.io/warn=privileged pod-security.kubernetes.io/enforce=privileged
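To confirm that the label took effect, you can inspect the namespace labels. The helper below is a small sketch; `ns_enforce_is_privileged` is a hypothetical name, and it relies only on `kubectl get namespace --show-labels`.

```shell
# Hypothetical helper: succeeds when the namespace given as $1 carries the
# pod-security.kubernetes.io/enforce=privileged label.
ns_enforce_is_privileged() {
  kubectl get namespace "$1" --show-labels --no-headers 2>/dev/null \
    | grep -qF 'pod-security.kubernetes.io/enforce=privileged'
}
```

For example, `ns_enforce_is_privileged rag-operator && echo labelled` prints `labelled` when the label is present.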

Use the NVIDIA GPU Operator to install, configure, and manage the NVIDIA GPU driver and NVIDIA container runtime on the Kubernetes node.

  1. Add the NVIDIA Helm repository:

    $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
        && helm repo update

  2. Install the Operator:

    $ helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator

For more information or to adjust the configuration, refer to Install NVIDIA GPU Operator in the NVIDIA GPU Operator documentation.
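After the GPU Operator finishes installing, one way to check that the nodes advertise their GPUs is to read the allocatable `nvidia.com/gpu` resource on each node. This is a sketch under that assumption; `list_node_gpus` and `total_gpus` are hypothetical helper names.

```shell
# Hypothetical helper: prints each node name with the number of GPUs the
# node advertises as allocatable (<none> when no GPUs are advertised).
list_node_gpus() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}

# Hypothetical helper: sums the GPUS column of the table read from stdin,
# skipping the header row and non-numeric entries such as <none>.
total_gpus() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ { sum += $2 } END { print sum + 0 }'
}
```

Running `list_node_gpus | total_gpus` prints the cluster-wide GPU count, and `kubectl get pods -n gpu-operator` shows whether the GPU Operator pods themselves are running.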

  1. Add the Enterprise RAG LLM Operator repository:

    $ helm repo add rag-operator https://helm.ngc.nvidia.com/ohlfw0olaadg/ea-participants \
        --username "\$oauthtoken" --password <ngc-cli-api-key>

  2. Create the RAG Operator namespace if it does not already exist:

    $ kubectl create namespace rag-operator

  3. Add a Docker registry secret that the Operator uses for pulling containers from NGC:

    $ kubectl create secret -n rag-operator docker-registry ngc-secret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password=<ngc-cli-api-key>

  4. Install the Operator:

    $ helm install rag-operator rag-operator/rag-operator -n rag-operator

  5. Optional: Confirm the controller pod is running:

    $ kubectl get pods -n rag-operator

    Example Output

    NAME                                                   READY   STATUS    RESTARTS   AGE
    k8s-rag-operator-controller-manager-6b546f57d5-g4zgg   2/2     Running   0          35h
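The five steps above can also be collected into one shell function, which is convenient for repeatable installs. The following is a sketch rather than an official script. It assumes the NGC CLI API key is exported as `NGC_CLI_API_KEY` (an assumed variable name), uses the hypothetical function name `install_rag_operator`, skips namespace creation when the namespace already exists, and adds `--wait` to `helm install`, which the original step 4 does not use.

```shell
# Sketch of steps 1 through 5 as one function. NGC_CLI_API_KEY and the
# function name are assumptions, not part of the Operator documentation.
install_rag_operator() {
  if [ -z "${NGC_CLI_API_KEY:-}" ]; then
    echo "NGC_CLI_API_KEY is not set" >&2
    return 1
  fi

  # Step 1: add the Operator chart repository. The user name is the
  # literal string $oauthtoken, so it is single-quoted.
  helm repo add rag-operator https://helm.ngc.nvidia.com/ohlfw0olaadg/ea-participants \
    --username '$oauthtoken' --password "$NGC_CLI_API_KEY" || return 1

  # Step 2: create the namespace only when it does not exist yet.
  if ! kubectl get namespace rag-operator >/dev/null 2>&1; then
    kubectl create namespace rag-operator || return 1
  fi

  # Step 3: create the image pull secret for nvcr.io.
  kubectl create secret -n rag-operator docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password="$NGC_CLI_API_KEY" || return 1

  # Step 4: install the Operator and wait for its resources to be ready.
  helm install --wait rag-operator rag-operator/rag-operator -n rag-operator || return 1

  # Step 5: list the controller pod.
  kubectl get pods -n rag-operator
}
```

Call it as `NGC_CLI_API_KEY=<ngc-cli-api-key> install_rag_operator`. The function stops at the first failing step; for example, it stops at step 3 if a secret named `ngc-secret` already exists in the namespace.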

© Copyright 2024, NVIDIA Corporation. Last updated on May 21, 2024.