NVIDIA Enterprise RAG LLM Operator

Installing the NVIDIA Enterprise RAG LLM Operator

  • A Kubernetes cluster and the cluster-admin role. Refer to Platform Support for information about supported operating systems and Kubernetes platforms.

  • A total of three NVIDIA A100 80 GB, H100, or L40S GPUs on one or more nodes. For large models that exceed the memory capacity of one GPU, you need to add more GPUs. When you deploy a Helm pipeline, you can specify more than one GPU for a workload.

  • An NGC API key. The API key is used as an image pull secret to download container images that are available to early access customers only. Refer to Generating Your NGC API Key in the NVIDIA NGC User Guide for more information.
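
Two of the prerequisites above can be checked from a shell before you start. The following is a minimal sketch, not part of the Operator tooling: the function names and the `NGC_API_KEY` environment variable are assumptions made for this example, and the check does not verify GPU models or counts.

```shell
# Preflight sketch. NGC_API_KEY and all function names are assumptions made
# for this example; they are not defined by the Operator documentation.

# Succeeds when the current kubeconfig user may perform any verb on any
# resource in all namespaces, which is what the cluster-admin role grants.
has_cluster_admin() {
  [ "$(kubectl auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]
}

# Succeeds when the NGC API key has been exported as NGC_API_KEY.
has_ngc_key() {
  [ -n "${NGC_API_KEY:-}" ]
}

# Runs both checks and reports each missing prerequisite on stderr.
preflight() {
  rc=0
  has_cluster_admin || { echo "missing: cluster-admin access" >&2; rc=1; }
  has_ngc_key || { echo "missing: NGC_API_KEY is not set" >&2; rc=1; }
  return "$rc"
}
```

After sourcing the functions, `preflight && echo "prerequisites look OK"` gives a quick pass or fail. Keeping the key in an environment variable also lets you pass `"${NGC_API_KEY}"` to later commands instead of typing the key inline.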

Tanzu Kubernetes Grid Service enables the PodSecurityPolicy Admission Controller in Tanzu Kubernetes clusters. The admission controller enforces the pod security policy for pods created with a service account. Because the Operator uses a service account, you must label the Operator namespace as privileged so that the policy is not enforced on the Operator pods.

On Tanzu Kubernetes clusters, run the following commands before you install the Operator:

$ kubectl create namespace rag-operator
$ kubectl label --overwrite ns rag-operator \
    pod-security.kubernetes.io/warn=privileged \
    pod-security.kubernetes.io/enforce=privileged
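
To confirm that the label took effect, you can read it back from the namespace. This is a small sketch under the assumption that the namespace is named `rag-operator` as in the commands above; the helper function names are hypothetical.

```shell
# Prints the value of the pod-security enforce label on a namespace
# (prints nothing if the label is not set).
ns_enforce_level() {
  kubectl get namespace "$1" \
    -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}'
}

# Succeeds only when the namespace is labeled enforce=privileged.
ns_is_privileged() {
  [ "$(ns_enforce_level "$1")" = "privileged" ]
}
```

For example, `ns_is_privileged rag-operator && echo "namespace is labeled"` prints the message only when the label is present with the expected value.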

Use the NVIDIA GPU Operator to install, configure, and manage the NVIDIA GPU driver and NVIDIA container runtime on the Kubernetes nodes.

  1. Add the NVIDIA Helm repository:

    $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
        && helm repo update

  2. Install the GPU Operator:

    $ helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator

For more information or to adjust the configuration, refer to Install NVIDIA GPU Operator in the NVIDIA GPU Operator documentation.
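
Once the GPU Operator is running, nodes typically advertise their GPUs as the `nvidia.com/gpu` allocatable resource, so the three-GPU prerequisite can be checked from the cluster. The sketch below assumes that resource name and uses hypothetical helper names; it counts GPUs only and does not check the GPU model.

```shell
# Sums one GPU count per input line; blank lines (nodes without GPUs)
# count as 0.
sum_gpus() {
  awk '{ total += $1 } END { print total + 0 }'
}

# Prints the allocatable nvidia.com/gpu count of each node, one per line.
list_node_gpus() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
}

# Succeeds when the cluster advertises at least the given number of GPUs.
has_min_gpus() {
  [ "$(list_node_gpus | sum_gpus)" -ge "$1" ]
}
```

For example, `has_min_gpus 3 || echo "fewer than three GPUs are allocatable"` flags a cluster that does not yet meet the prerequisite.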

  1. Add the Enterprise RAG LLM Operator repository:

    $ helm repo add rag-operator https://helm.ngc.nvidia.com/ohlfw0olaadg/ea-rag-examples \
        --username "\$oauthtoken" --password <ngc-api-key>

  2. Install the Operator:

    $ helm install rag-operator rag-operator/rag-operator \
        -n rag-operator --create-namespace \
        --set images.registry.imagePullSecret.password=<ngc-api-key>

  3. Optional: Confirm the controller pod is running:

    $ kubectl get pods -n rag-operator

    Example Output

    NAME                                                   READY   STATUS    RESTARTS   AGE
    k8s-rag-operator-controller-manager-6b546f57d5-g4zgg   2/2     Running   0          35h
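
If you want to script the confirmation step rather than read the output by eye, the pod listing can be parsed. The following is a rough sketch with a hypothetical function name; it assumes the default column order of `kubectl get pods` (NAME, READY, STATUS).

```shell
# Reads `kubectl get pods --no-headers` output on stdin and succeeds only if
# there is at least one pod and every pod is Running with all of its
# containers ready (READY column of the form N/N).
all_pods_ready() {
  awk '
    {
      seen = 1
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}
```

For example, `kubectl get pods -n rag-operator --no-headers | all_pods_ready && echo "controller is ready"`. As an alternative that blocks until the pods are ready, `kubectl wait --for=condition=Ready pod --all -n rag-operator --timeout=300s` achieves a similar result without parsing.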

© Copyright 2024, NVIDIA. Last updated on Mar 21, 2024.