Minimum System Requirements for NVIDIA RAG Blueprint#
This page lists the minimum system requirements for the NVIDIA RAG Blueprint.
Important
You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Some requirements differ depending on your target system and deployment method.
Operating System#
For the RAG Blueprint you need the following operating system:
Ubuntu 22.04
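As a quick pre-flight check, the operating system requirement can be verified from the contents of `/etc/os-release`. The following is a minimal sketch, not part of the RAG Blueprint: the function names `parse_os_release` and `is_supported_os` are hypothetical, and it assumes the standard `ID` and `VERSION_ID` fields are present in that file.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_supported_os(text: str) -> bool:
    """Return True when the os-release text describes Ubuntu 22.04."""
    fields = parse_os_release(text)
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == "22.04"
```

To use it, read `/etc/os-release` on the target machine and pass the text to `is_supported_os`.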
Driver Versions#
For the RAG Blueprint you need the following drivers:
GPU Driver - 560 or later
CUDA version - 12.9 or later
For details, see NVIDIA NIM for LLMs Software.
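The version minimums above can be checked programmatically. The sketch below is an illustration under stated assumptions, not an official tool: it assumes you pass in the banner text printed by `nvidia-smi`, which normally includes `Driver Version:` and `CUDA Version:` fields, and all function and constant names are hypothetical. The CUDA version in that banner is the highest version the installed driver supports, not necessarily the toolkit installed on the host.

```python
import re

MIN_DRIVER_MAJOR = 560  # "GPU Driver - 560 or later"
MIN_CUDA = (12, 9)      # "CUDA version - 12.9 or later"


def version_tuple(version: str) -> tuple[int, ...]:
    """Convert a dotted version such as '560.35.03' into (560, 35, 3)."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_ok(driver_version: str) -> bool:
    """Compare only the major driver branch against the minimum."""
    return version_tuple(driver_version)[0] >= MIN_DRIVER_MAJOR


def cuda_ok(cuda_version: str) -> bool:
    """Compare the (major, minor) CUDA version against the minimum."""
    return version_tuple(cuda_version)[:2] >= MIN_CUDA


def versions_from_smi(banner: str) -> tuple[str, str]:
    """Extract (driver, cuda) version strings from nvidia-smi banner text."""
    driver = re.search(r"Driver Version:\s*(\d+(?:\.\d+)*)", banner)
    cuda = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", banner)
    if driver is None or cuda is None:
        raise ValueError("could not find driver or CUDA version in the text")
    return driver.group(1), cuda.group(1)
```

One way to obtain the banner is `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`; the parsing is kept separate so it can be tested on a machine without a GPU.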
Hardware Requirements (Docker)#
By default, the RAG Blueprint deploys the NIM microservices locally (self-hosted). You need one of the following:
2 x H100
3 x B200
3 x A100 SXM
2 x RTX PRO 6000
Tip
You can also modify the RAG Blueprint to use NVIDIA-hosted NIM microservices.
Hardware Requirements (Kubernetes)#
To install the RAG Blueprint on Kubernetes, you need one of the following:
8 x H100-80GB
9 x B200
9 x A100-80GB SXM
8 x RTX PRO 6000
3 x H100 (with Multi-Instance GPU / DRA with NIM Operator)
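The Docker and Kubernetes minimums listed on this page can be captured in a small lookup table for use in provisioning scripts. This is a minimal sketch: the dictionary keys, the `"H100 (MIG)"` label, and the function name `meets_minimum` are hypothetical conventions chosen for illustration, and the counts are copied from the lists above.

```python
# Minimum GPU counts per deployment method, copied from the lists on this page.
# The key strings are illustrative labels, not identifiers used by any tool.
MIN_GPUS = {
    "docker": {
        "H100": 2,
        "B200": 3,
        "A100 SXM": 3,
        "RTX PRO 6000": 2,
    },
    "kubernetes": {
        "H100-80GB": 8,
        "B200": 9,
        "A100-80GB SXM": 9,
        "RTX PRO 6000": 8,
        # Multi-Instance GPU / DRA with NIM Operator
        "H100 (MIG)": 3,
    },
}


def meets_minimum(method: str, gpu: str, count: int) -> bool:
    """Return True when `count` GPUs of type `gpu` satisfy the listed minimum."""
    try:
        required = MIN_GPUS[method][gpu]
    except KeyError:
        raise ValueError(
            f"no listed requirement for {gpu!r} with {method!r}"
        ) from None
    return count >= required
```

For example, `meets_minimum("kubernetes", "H100-80GB", 8)` checks a cluster of eight H100-80GB GPUs against the Kubernetes list. Unknown combinations raise `ValueError` rather than returning `False`, so a typo in a GPU label is not mistaken for insufficient hardware.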
Hardware Requirements for Self-Hosting All NVIDIA NIM Microservices#
The following are requirements and recommendations for the individual components of the RAG Blueprint:
Pipeline operation: 1 x L40 GPU or similar is recommended. The Milvus vector database needs it because GPU acceleration is enabled by default.
LLM NIM (llama-3.3-nemotron-super-49b-v1.5): Refer to the Support Matrix.
Embedding NIM (Llama-3.2-NV-EmbedQA-1B-v2): Refer to the Support Matrix.
Reranking NIM (llama-3_2-nv-rerankqa-1b-v2): Refer to the Support Matrix.
NVIDIA NIM for Image OCR (baidu/paddleocr): Refer to the Support Matrix.
NeMo Retriever OCR: Refer to the Support Matrix.
NVIDIA NIMs for Object Detection:
NeMo Retriever Page Elements v2 Support Matrix
NeMo Retriever Graphic Elements v1 Support Matrix
NeMo Retriever Table Structure v1 Support Matrix
Tip
To switch between Paddle OCR and NeMo Retriever OCR, see NeMo Retriever OCR.