Minimum System Requirements for NVIDIA RAG Blueprint#

This page lists the minimum system requirements for the NVIDIA RAG Blueprint.

Important

You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Some requirements differ depending on your target system and deployment method.

Disk Space Requirements#

Important

Ensure that you have at least 200 GB of available disk space before you deploy the RAG Blueprint. This space is required for the following:

  • NIM model downloads and caching (largest component, ~100-150 GB)

  • Container images (~20-30 GB)

  • Vector database data and indices

  • Application logs and temporary files

Insufficient disk space causes deployment failures during model downloads or runtime operations.
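As a pre-deployment check, the free-space requirement above can be sketched in a few lines of Python. This is a minimal illustration, not part of the blueprint: the function names are hypothetical, the path to check is an assumption (pass whichever filesystem holds your container images and model cache), and "200 GB" is interpreted here as decimal gigabytes.

```python
import shutil

REQUIRED_GB = 200  # minimum free space stated in this section


def free_space_gb(path="/"):
    """Return free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_space(path="/", required_gb=REQUIRED_GB):
    """True if the filesystem containing `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    # "/" is only a stand-in; check the filesystem your deployment writes to.
    free = free_space_gb("/")
    status = "OK" if free >= REQUIRED_GB else "INSUFFICIENT"
    print(f"{free:.0f} GB free on /: {status}")
```

If images and model caches live on different filesystems, call `has_enough_space` once per path, because a single total does not guarantee that each location has room.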

Operating System#

The RAG Blueprint requires the following operating system:

  • Ubuntu 22.04
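One way to verify the operating system programmatically is to read the standard `/etc/os-release` file. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the `ID` and `VERSION_ID` fields that Ubuntu writes to that file.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_supported_os(info):
    """True for Ubuntu 22.04, the operating system listed in this section."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "22.04"


if __name__ == "__main__":
    try:
        with open("/etc/os-release") as f:
            print("Supported OS:", is_supported_os(parse_os_release(f.read())))
    except FileNotFoundError:
        print("/etc/os-release not found; cannot determine the OS")
```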

Driver Versions#

The RAG Blueprint requires the following driver and CUDA versions:

  • GPU driver: 560 or later

  • CUDA version: 12.9 or later

For details, see NVIDIA NIM for LLMs Software.
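The version minimums above can be checked with a small comparison helper. The following is a sketch under stated assumptions: `version_at_least` and `installed_driver_versions` are hypothetical names, the driver version is read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, and the CUDA version is assumed to be supplied by you (for example, from the `nvidia-smi` header), since this sketch does not query it.

```python
import subprocess

MIN_DRIVER = (560,)   # GPU driver 560 or later
MIN_CUDA = (12, 9)    # CUDA 12.9 or later


def version_at_least(version, minimum):
    """Compare a dotted version string such as '560.35.03' against a tuple minimum."""
    parts = tuple(int(p) for p in version.strip().split("."))
    return parts >= minimum


def installed_driver_versions():
    """Return the driver version that nvidia-smi reports for each GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for version in installed_driver_versions():
            ok = version_at_least(version, MIN_DRIVER)
            print(f"driver {version}: {'OK' if ok else 'too old'}")
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi is not available; is the NVIDIA driver installed?")
```

Versions are compared as integer tuples rather than strings so that, for example, CUDA 12.10 correctly ranks above 12.9.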

Hardware Requirements (Docker)#

By default, the RAG Blueprint deploys the NIM microservices locally (self-hosted). You need one of the following:

  • 2 x H100

  • 2 x B200

  • 3 x A100 SXM

  • 2 x RTX PRO 6000
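The GPU options above can be expressed as a simple lookup. This is a rough sketch, not an official check: `DOCKER_MINIMUMS` and `meets_docker_minimum` are hypothetical names, and matching on substrings of the GPU name is an assumption about how `nvidia-smi` labels each model. It also does not distinguish A100 SXM from A100 PCIe.

```python
from collections import Counter

# Minimum GPU counts for the default self-hosted Docker deployment,
# taken from the list in this section. Keys are substrings assumed to
# appear in the GPU name reported by nvidia-smi.
DOCKER_MINIMUMS = {"H100": 2, "B200": 2, "A100": 3, "RTX PRO 6000": 2}


def meets_docker_minimum(gpu_names, minimums=DOCKER_MINIMUMS):
    """True if any single GPU model is present in at least its minimum count."""
    counts = Counter()
    for name in gpu_names:
        for model in minimums:
            if model in name:
                counts[model] += 1
    return any(counts[model] >= needed for model, needed in minimums.items())
```

The list of names can be obtained with `nvidia-smi --query-gpu=name --format=csv,noheader`, one GPU per line. Mixed configurations (for example, one H100 plus one B200) are not counted as meeting the minimum in this sketch.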

Tip

You can also modify the RAG Blueprint to use NVIDIA-hosted NIM microservices.

Tip

No GPU available? Try the Containerless Deployment (Lite Mode), which requires no GPU hardware and uses NVIDIA cloud APIs for all processing.

Hardware Requirements (Kubernetes)#

To install the RAG Blueprint on Kubernetes, you need one of the following:

  • 8 x H100-80GB

  • 8 x B200

  • 9 x A100-80GB SXM

  • 8 x RTX PRO 6000

  • 3 x H100 (with Multi-Instance GPU)
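To see how many GPUs a cluster actually exposes, you can sum the allocatable GPU resources across nodes. The sketch below is an illustration with assumptions: `allocatable_gpus` is a hypothetical helper, the input is the JSON text printed by `kubectl get nodes -o json`, and GPUs are assumed to be advertised under the `nvidia.com/gpu` resource name. With Multi-Instance GPU, the resource names can differ depending on how MIG is configured, so pass the name your cluster uses.

```python
import json


def allocatable_gpus(nodes_json, resource="nvidia.com/gpu"):
    """Sum the allocatable count of `resource` across all nodes in the cluster."""
    nodes = json.loads(nodes_json)
    total = 0
    for node in nodes.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports extended resource quantities as strings, e.g. "8".
        total += int(allocatable.get(resource, 0))
    return total
```

Compare the result with the count listed for your GPU model, for example 8 for H100-80GB or 9 for A100-80GB SXM. Nodes without GPUs contribute zero.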

Hardware Requirements for Self-Hosting All NVIDIA NIM Microservices#

The following are requirements and recommendations for the individual components of the RAG Blueprint:

  • Pipeline operation: 1 x L40 GPU or similar recommended. This is needed for the Milvus vector database, because GPU acceleration is enabled by default.

  • LLM NIM (llama-3.3-nemotron-super-49b-v1.5): Refer to the Support Matrix.

  • Embedding NIM (Llama-3.2-NV-EmbedQA-1B-v2): Refer to the Support Matrix.

  • Reranking NIM (llama-3_2-nv-rerankqa-1b-v2): Refer to the Support Matrix.

  • NeMo Retriever OCR (Default): Refer to the Support Matrix.

  • NVIDIA NIM for Image OCR (baidu/paddleocr - Legacy): Refer to the Support Matrix.

  • NVIDIA NIMs for Object Detection:

Tip

NeMo Retriever OCR is now the default OCR service. To use the legacy Paddle OCR instead, see the OCR Configuration Guide.