Minimum System Requirements for NVIDIA RAG Blueprint#

This page lists the minimum system requirements for the NVIDIA RAG Blueprint.

Important

You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Some requirements differ depending on your target system and deployment method.

Operating System#

The RAG Blueprint requires the following operating system:

  • Ubuntu 22.04
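
The operating system requirement can be checked before deployment. The following is a minimal sketch, not part of the RAG Blueprint itself. It assumes the host exposes the standard `/etc/os-release` file, and the function names `parse_os_release` and `is_supported_os` are hypothetical helpers introduced only for this example.

```python
from pathlib import Path


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_supported_os(fields):
    """Return True only for Ubuntu 22.04, the operating system listed above."""
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == "22.04"


if __name__ == "__main__":
    path = Path("/etc/os-release")  # assumption: standard location on Linux
    if path.exists():
        print("Supported OS:", is_supported_os(parse_os_release(path.read_text())))
    else:
        print("No /etc/os-release found; cannot determine the OS.")
```

The check is deliberately strict: any other distribution or Ubuntu release is reported as unsupported, because this page lists only Ubuntu 22.04.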

Driver Versions#

The RAG Blueprint requires the following GPU driver and CUDA versions:

  • GPU Driver - 560 or later

  • CUDA version - 12.9 or later

For details, see NVIDIA NIM for LLMs Software.
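
The version minimums above can be compared against what a host reports. The sketch below is illustrative only. It assumes that `nvidia-smi` is on the `PATH` and that its banner contains the `Driver Version:` and `CUDA Version:` labels. The CUDA version in that banner is the highest version the driver supports, which may differ from an installed CUDA toolkit. The helper names are hypothetical.

```python
import re
import shutil
import subprocess

MIN_DRIVER = "560"  # minimum GPU driver from the list above
MIN_CUDA = "12.9"   # minimum CUDA version from the list above


def version_tuple(version):
    """Turn '560.35.03' into (560, 35, 3) for numeric comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(found, minimum):
    """Compare dotted versions numerically, so '12.10' ranks above '12.9'."""
    a, b = version_tuple(found), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so '560' and '560.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def parse_smi_header(text):
    """Extract the driver and CUDA versions from the nvidia-smi banner text."""
    found = {}
    for key, label in (("driver", "Driver Version"), ("cuda", "CUDA Version")):
        match = re.search(label + r":\s*([0-9][0-9.]*)", text)
        found[key] = match.group(1) if match else None
    return found


def check_versions(text):
    """Return a pass/fail flag for each requirement; a missing value fails."""
    found = parse_smi_header(text)
    return {
        "driver": found["driver"] is not None
        and meets_minimum(found["driver"], MIN_DRIVER),
        "cuda": found["cuda"] is not None
        and meets_minimum(found["cuda"], MIN_CUDA),
    }


if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
        print(check_versions(result.stdout))
    else:
        print("nvidia-smi not found; cannot check driver or CUDA versions.")
```

Versions are compared as integer tuples rather than strings, because a plain string comparison would rank `12.10` below `12.9`.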

Hardware Requirements (Docker)#

By default, the RAG Blueprint deploys the NIM microservices locally (self-hosted). You need one of the following:

  • 2 x H100

  • 3 x B200

  • 3 x A100 SXM

  • 2 x RTX PRO 6000
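
The list above can be turned into a quick inventory check. This is a rough sketch under stated assumptions: it treats each entry as a minimum GPU count, and it matches models by looking for the model string inside the names printed by `nvidia-smi --query-gpu=name --format=csv,noheader`. The exact name strings vary by GPU variant, so the substring patterns are assumptions, and `ACCEPTED` and `matching_configs` are hypothetical names.

```python
import shutil
import subprocess

# Accepted configurations from the list above: (model substring, required count).
ACCEPTED = [
    ("H100", 2),
    ("B200", 3),
    ("A100 SXM", 3),
    ("RTX PRO 6000", 2),
]


def normalize(name):
    """Uppercase and replace hyphens so 'A100-SXM4-80GB' can match 'A100 SXM'."""
    return name.upper().replace("-", " ")


def matching_configs(gpu_names):
    """Return the accepted configurations that the given GPU names satisfy."""
    normalized = [normalize(name) for name in gpu_names]
    satisfied = []
    for model, required in ACCEPTED:
        count = sum(1 for name in normalized if model in name)
        # Assumption: having more GPUs than listed also meets the requirement.
        if count >= required:
            satisfied.append(f"{required} x {model}")
    return satisfied


if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
        print("Satisfied configurations:", matching_configs(names) or "none")
    else:
        print("nvidia-smi not found; cannot list GPUs.")
```

An empty result means that none of the listed configurations is met by the local GPUs. It does not account for GPU memory size or other variant differences.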

Tip

You can also modify the RAG Blueprint to use NVIDIA-hosted NIM microservices.

Hardware Requirements (Kubernetes)#

To install the RAG Blueprint on Kubernetes, you need one of the following:

Hardware requirements for self-hosting all NVIDIA NIM microservices#

The following are requirements and recommendations for the individual components of the RAG Blueprint:

  • Pipeline operation: 1x L40 GPU or similar is recommended. The Milvus vector database needs this GPU because GPU acceleration is enabled by default.

  • LLM NIM (llama-3.3-nemotron-super-49b-v1.5): Refer to the Support Matrix.

  • Embedding NIM (Llama-3.2-NV-EmbedQA-1B-v2): Refer to the Support Matrix.

  • Reranking NIM (llama-3_2-nv-rerankqa-1b-v2): Refer to the Support Matrix.

  • NVIDIA NIM for Image OCR (baidu/paddleocr): Refer to the Support Matrix.

  • NeMo Retriever OCR: Refer to the Support Matrix.

  • NVIDIA NIMs for Object Detection:

Tip

To switch between Paddle OCR and NeMo Retriever OCR, see NeMo Retriever OCR.