NVIDIA-RAG-blueprint

API - RAG Server Schema

This documentation contains the OpenAPI reference for the RAG server.
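Because the reference is an OpenAPI document, its endpoints can be enumerated programmatically by walking the standard `paths` object. The following is a minimal sketch under stated assumptions, not the RAG server's actual schema: `SAMPLE_SPEC`, its `/example/...` paths, and the `list_operations` helper are hypothetical and exist only to illustrate the OpenAPI layout.

```python
import json

# Hypothetical, heavily trimmed OpenAPI document used only for illustration.
# The real RAG server schema defines its own paths and operations.
SAMPLE_SPEC = json.loads("""
{
  "openapi": "3.1.0",
  "info": {"title": "Example API", "version": "0.0.0"},
  "paths": {
    "/example/health": {
      "get": {"summary": "Example health check"}
    },
    "/example/items": {
      "get": {"summary": "List example items"},
      "post": {"summary": "Create an example item"},
      "parameters": []
    }
  }
}
""")

# Keys of a path item that denote operations; other keys
# (such as "parameters" or "summary") are not HTTP methods.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return (METHOD, path, summary) tuples for every operation in an OpenAPI document."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys on the path item
            operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


for method, path, summary in list_operations(SAMPLE_SPEC):
    print(f"{method:6} {path}  {summary}")
```

To apply the same helper to the real schema, load the server's OpenAPI JSON into a dictionary and pass it in place of `SAMPLE_SPEC`.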

Tip

To view this documentation on docs.nvidia.com, browse to https://docs.nvidia.com/rag/latest/api-rag.

Related Topics

  • API - Ingestor Server Schema

  • NVIDIA RAG Blueprint Documentation


Copyright © 2025, NVIDIA Corporation.