Deploy NVIDIA RAG Blueprint on Kubernetes with Helm from the repository

Use the following documentation to deploy the NVIDIA RAG Blueprint by using the Helm chart from the repository.

To deploy the Helm chart with MIG support, refer to RAG Deployment with MIG Support.
To deploy with Helm from the repository, refer to Deploy Helm from the repository.
For other deployment options, refer to Deployment Options.

The following are the core services that you install:

RAG server
Ingestor server
NV-Ingest

Prerequisites

Verify that you meet the prerequisites specified in prerequisites.
Clone the RAG Blueprint Git repository to get access to the Helm chart source files.

Verify that you have installed the NVIDIA NIM Operator. If not, install it by running the following code:

```
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
  --username='$oauthtoken' \
  --password=$NGC_API_KEY
helm repo update
helm install nim-operator nvidia/k8s-nim-operator -n nim-operator --create-namespace
```

For more details, see instructions here.
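Before you run the Helm commands on this page, a quick preflight check can catch a missing API key or a missing CLI early. The following is a minimal sketch, not part of the blueprint: the function names `check_ngc_key` and `missing_tools` are hypothetical helpers. Note that the user name `'$oauthtoken'` in the commands is a literal string, and the single quotes stop the shell from expanding it.

```shell
# Hypothetical helper: returns 0 when NGC_API_KEY is set and non-empty, 1 otherwise.
check_ngc_key() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set; export it before running helm repo add" >&2
    return 1
  fi
}

# Hypothetical helper: prints each named CLI that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, run `check_ngc_key && missing_tools helm kubectl`. No output and a zero exit status mean that the key is set and both tools are available.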

Important

Consider the following before you deploy the RAG Blueprint:

Ensure that each node has at least 200 GB of available disk space for NIM model caches.
Expect the first deployment to take 60-70 minutes, because the models download without visible progress indicators.

For monitoring commands, refer to Deploy on Kubernetes with Helm - Prerequisites.
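The 200 GB requirement can be checked with a small helper around `df`. The following is a rough sketch under stated assumptions: `has_free_gb` is a hypothetical function, it inspects only the machine where it runs (so it must be run on each node), and the path that backs the NIM model cache depends on your storage configuration.

```shell
# Hypothetical helper: returns 0 if the filesystem that holds the given path
# has at least the given number of gigabytes available.
has_free_gb() {
  path=$1
  min_gb=$2
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb / 200 || echo "less than 200 GB free"`, where `/` is a placeholder for the path that actually backs the model cache on the node.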

Deploy the RAG Helm chart from the repository

If you are working directly with the source Helm chart, and you want to customize components individually, use the following procedure.

Change directory to deploy/helm/ by running the following code.
```
cd deploy/helm/
```
Create a namespace for the deployment by running the following code.
```
kubectl create namespace rag
```

Add the required Helm repositories by editing and then running the following code.

```
helm repo add nvidia-nim https://helm.ngc.nvidia.com/nim/nvidia/ --username='$oauthtoken' --password=$NGC_API_KEY
helm repo add nim https://helm.ngc.nvidia.com/nim/ --username='$oauthtoken' --password=$NGC_API_KEY
helm repo add nemo-microservices https://helm.ngc.nvidia.com/nvidia/nemo-microservices --username='$oauthtoken' --password=$NGC_API_KEY
helm repo add baidu-nim https://helm.ngc.nvidia.com/nim/baidu --username='$oauthtoken' --password=$NGC_API_KEY
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add elastic https://helm.elastic.co
helm repo add otel https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo add zipkin https://zipkin.io/zipkin-helm
helm repo add prometheus https://prometheus-community.github.io/helm-charts
```
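Because the dependency update in the next step fails if a repository is missing, it can help to confirm that every repository was added. The following is a small sketch: `missing_repos` is a hypothetical helper that reads the table printed by `helm repo list` on standard input and prints each expected repository name that does not appear in it. The expected names are taken from the `helm repo add` commands in this step.

```shell
# Repository names used by the helm repo add commands in this step.
EXPECTED_REPOS="nvidia-nim nim nemo-microservices baidu-nim bitnami elastic otel zipkin prometheus"

# Hypothetical helper: reads `helm repo list` output on stdin and prints
# each expected repository that is absent from the NAME column.
missing_repos() {
  # Skip the header row and keep only the NAME column.
  present=$(awk 'NR > 1 { print $1 }')
  for repo in $EXPECTED_REPOS; do
    printf '%s\n' "$present" | grep -qxF -- "$repo" || printf '%s\n' "$repo"
  done
}
```

For example, `helm repo list | missing_repos` prints nothing when all nine repositories are configured.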

Update Helm chart dependencies by running the following code.
```
helm dependency update nvidia-blueprint-rag
```

Install the chart by running the following code.

```
helm upgrade --install rag -n rag nvidia-blueprint-rag/ \
  --set imagePullSecret.password=$NGC_API_KEY \
  --set ngcApiSecret.password=$NGC_API_KEY
```
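Because the models download without visible progress indicators, one way to follow the installation is to list the pods that are not ready yet. The following is a minimal sketch: `pending_pods` is a hypothetical helper that filters the default table printed by `kubectl get pods` (columns NAME, READY, STATUS) and keeps the pods that are neither completed nor fully ready.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the name, status, and ready count of each pod that is not yet ready.
pending_pods() {
  awk 'NR > 1 {
    # READY looks like "1/1"; split it into ready and total container counts.
    split($2, ready, "/")
    if ($3 == "Completed") next
    if ($3 == "Running" && ready[1] == ready[2]) next
    print $1, $3, $2
  }'
}
```

For example, `kubectl get pods -n rag | pending_pods` prints nothing once every pod in the `rag` namespace is running with all of its containers ready.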

Important

For NVIDIA RTX6000 Pro Deployments:

If you are deploying on NVIDIA RTX6000 Pro GPUs (instead of H100 GPUs), you need to configure the NIM LLM model profile. The required configuration is already present but commented out in the values.yaml file.

Uncomment and modify the following section under nimOperator.nim-llm.model in the values.yaml:

```
model:
  engine: tensorrt_llm
  precision: "fp8"
  qosProfile: "throughput"
  tensorParallelism: "1"
  gpus:
    - product: "rtx6000_blackwell_sv"
```

Then install using the modified values.yaml:

```
helm upgrade --install rag -n rag nvidia-blueprint-rag/ \
  --set imagePullSecret.password=$NGC_API_KEY \
  --set ngcApiSecret.password=$NGC_API_KEY \
  -f nvidia-blueprint-rag/values.yaml
```
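As an alternative to editing the chart's values.yaml in place, the same settings could be kept in a separate override file. The following is a sketch under assumptions, not the documented procedure: the file name `rtx6000-overrides.yaml` is hypothetical, and the nesting assumes that the `model` block sits directly under `nimOperator` and then `nim-llm`, as the path nimOperator.nim-llm.model suggests. Verify the nesting against the commented-out section in values.yaml before you rely on it.

```shell
# Hypothetical file name; any path works.
OVERRIDES_FILE="${OVERRIDES_FILE:-rtx6000-overrides.yaml}"

# Write only the keys that differ from the chart defaults.
# Assumption: the model block nests as nimOperator -> nim-llm -> model.
cat > "$OVERRIDES_FILE" <<'EOF'
nimOperator:
  nim-llm:
    model:
      engine: tensorrt_llm
      precision: "fp8"
      qosProfile: "throughput"
      tensorParallelism: "1"
      gpus:
        - product: "rtx6000_blackwell_sv"
EOF
```

To use the file, pass it to the install command with `-f rtx6000-overrides.yaml` in place of `-f nvidia-blueprint-rag/values.yaml`. Helm merges the file over the chart's own values.yaml, so the chart source stays unmodified.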

Note

To use a non-default NIM LLM profile, refer to NIM Model Profile Configuration.

Follow the remaining instructions in Deploy on Kubernetes with Helm.
