NVIDIA RAG Blueprint Documentation

Welcome to the NVIDIA RAG Blueprint documentation. It explains how to get started with the RAG Blueprint, how to customize it, and how to troubleshoot it.

Release Notes

For the release notes, refer to Release Notes.

Support Matrix

For hardware requirements and other information, refer to the Support Matrix.

Get Started With RAG Blueprint

  • Follow the procedures in Get Started to get up and running quickly with the NVIDIA RAG Blueprint.

  • Experiment and test in the Web User Interface.

  • Use the Python Package to interact with the RAG system directly from Python code.

  • Explore the notebooks that demonstrate how to use the APIs. For details, refer to Notebooks.

Deployment Options for RAG Blueprint

You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Use the following documentation to deploy the blueprint.

Important

Before you deploy, consider the following:

  • Self-hosted deployments require approximately 200 GB of free disk space for model downloads and caching.

  • First-time deployments take 15-30 minutes (Docker) or 60-70 minutes (Kubernetes) because large models must be downloaded.

  • Model downloads do not show progress bars; see the deployment guides for monitoring commands.

  • Subsequent deployments are much faster (2-15 minutes) because models are already cached.

For detailed requirements, refer to Support Matrix.
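The disk-space requirement above can be checked before you start a self-hosted deployment. The following is a minimal sketch that uses only the Python standard library and is not part of the RAG Blueprint. The function name `has_enough_space`, the 200 GB threshold interpreted as decimal gigabytes, and the choice of the current directory as the location to check are all assumptions made for illustration. Point it at whichever volume holds your model cache.

```python
import shutil

# Approximate requirement taken from the list above; treated here as
# decimal gigabytes (10**9 bytes), which is an assumption.
REQUIRED_FREE_GB = 200


def has_enough_space(free_bytes, required_gb=REQUIRED_FREE_GB):
    """Return True if free_bytes covers required_gb decimal gigabytes."""
    return free_bytes >= required_gb * 10**9


def free_gb(path="."):
    """Return the free space on the volume containing path, in decimal GB."""
    return shutil.disk_usage(path).free / 10**9


if __name__ == "__main__":
    # "." is a placeholder; replace it with the volume used for model caching.
    available = shutil.disk_usage(".").free
    status = "OK" if has_enough_space(available) else "insufficient"
    print(f"Free space: {available / 10**9:.1f} GB ({status})")
```

Running a check like this first avoids discovering a full disk partway through a long first-time model download.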

Alternative Deployment Options:

Developer Guide

After you deploy the RAG Blueprint, you can customize it for your use cases.

Troubleshoot RAG Blueprint

Reference

Blog Posts