Notebooks for NVIDIA RAG Blueprint#

This section contains Jupyter notebooks that demonstrate how to use the NVIDIA RAG Blueprint APIs and advanced development features.

Set Up the Notebook Environment#

To run a notebook in a Python virtual environment, use the following procedure.

Create and activate a virtual environment.

python3 -m virtualenv venv
source venv/bin/activate

Ensure that you have JupyterLab and required dependencies installed.
```
pip3 install jupyterlab
```

Run the following command to start JupyterLab and allow access from any IP.

jupyter lab --allow-root --ip=0.0.0.0 --NotebookApp.token='' --port=8889 --no-browser

Set-up Notes#

Ensure that API keys and credentials are correctly set up before you run a notebook.
Modify endpoints or request parameters as necessary to match your specific use case.
For the custom VDB operator notebook, ensure that Docker is available for running OpenSearch services.

Run a Notebook#

After your notebook environment is set up, follow these steps to run a notebook.

Access JupyterLab by opening a browser and navigating to http://<your-server-ip>:8889.
Navigate to the notebook and run the cells sequentially.

Beginner Notebooks#

Start with the following notebooks to learn basic API interactions.

ingestion_api_usage.ipynb – Demonstrates how to interact with the NVIDIA RAG ingestion service, including how to upload and process documents for retrieval-augmented generation (RAG).
retriever_api_usage.ipynb – Demonstrates how to use the NVIDIA RAG retriever service, including different query techniques and retrieval strategies.
image_input.ipynb – Demonstrates multimodal query support, enabling you to query documents using both text and images. Covers VLM embeddings, image extraction, and the search/generate APIs with visual queries.

Experiment with the Ingestion API Usage Notebook#

After the RAG Blueprint is deployed, you can use the Ingestion API Usage notebook to start experimenting with it.

Download and install Git LFS by following the installation instructions.
Initialize Git LFS in your environment.
```
git lfs install
```
Pull the dataset into the current repo.
```
git lfs pull
```
Install jupyterlab.
```
pip install jupyterlab
```
Use this command to run Jupyter Lab so that you can execute this IPython notebook.
```
jupyter lab --allow-root --ip=0.0.0.0 --NotebookApp.token='' --port=8889
```
Run the ingestion_api_usage notebook. Follow the cells in the notebook to ingest the PDF files from the data/dataset folder into the vector store.

Intermediate Notebooks#

Use the following notebooks to learn comprehensive Python client usage, metadata, summarization, and other features.

summarization.ipynb – Demonstrates document summarization customization including page filtering, fast shallow extraction, and multiple summarization strategies. Covers both Library Mode and Docker Mode API usage.
evaluation_01_ragas.ipynb – Evaluate your RAG system using three key metrics with the Ragas library.
evaluation_02_recall.ipynb – Evaluate retrieval performance using the recall metric, which measures the fraction of relevant documents successfully retrieved at various top-k thresholds.
nb_metadata.ipynb – Demonstrates metadata features including metadata ingestion, filtering, and extraction. Includes step-by-step examples of how to use metadata for enhanced document retrieval and Q&A capabilities. This notebook is for users who want to implement sophisticated metadata-based filtering in their RAG applications.
rag_library_usage.ipynb – Demonstrates native usage of the NVIDIA RAG Python client, including environment setup, document ingestion, collection management, and querying. This notebook provides end-to-end API usage examples for interacting directly with the RAG system from Python, covering both ingestion and retrieval workflows.
rag_library_lite_usage.ipynb – Demonstrates containerless deployment of the NVIDIA RAG Python package in lite mode. Uses Milvus Lite (embedded vector database) and NeMo Retriever Library subprocess mode for a simplified setup without Docker containers. Leverages NVIDIA cloud APIs for embeddings, ranking, and LLM inference. Note: This mode does not support image/table/chart citations or document summarization.
langchain_nvidia_retriever.ipynb – Showcases LangChain integration with the NVIDIA RAG Blueprint. Run ingestion_api_usage.ipynb first to ingest documents, then use NVIDIARAGRetriever for retrieval (sync/async), custom parameters, error handling, and optional RAG chaining with ChatNVIDIA.

Advanced Notebooks#

Use the following notebooks to learn how to how to extend the system with custom vector database implementations.

building_rag_vdb_operator.ipynb – Demonstrates how to create and integrate custom vector database (VDB) operators with the NVIDIA RAG blueprint. This notebook builds a complete OpenSearch VDB operator from scratch by using the VDBRag base class architecture. This notebook is for developers who want to extend NVIDIA RAG with their own vector database implementations.
mcp_server_usage.ipynb – Demonstrates how to use the NVIDIA RAG MCP server via MCP transports (SSE, streamable-http, and stdio) instead of REST APIs. Covers launching the server, connecting with the MCP Python client, listing tools, and calling Ingestor tools (create_collection, list_collections, upload_documents, get_documents, update_documents, delete_documents, update_collection_metadata, update_document_metadata, delete_collections) and RAG tools (generate, search, get_summary).
nat_mcp_integration.ipynb – Demonstrates integration of NeMo Agent Toolkit (NAT) with the NVIDIA RAG MCP server. Shows how to build intelligent agents that leverage enterprise knowledge through RAG, configure NAT workflows using YAML, and enable natural language interactions with document collections. Covers end-to-end setup including MCP server configuration, collection management, and agent-based querying.
image_input.ipynb – Demonstrates how to use the NVIDIA RAG retriever APIs with multimodal queries (text + images). Covers deploying VLM and multimodal embedding NIMs, ingesting documents with image extraction, and using the search and generate APIs with visual inputs for use cases like querying product catalogs with images.

Deployment Notebooks#

Use the following notebook for cloud deployment scenarios.

launchable.ipynb – A deployment-ready notebook intended to run in a Brev environment. To learn more about Brev, refer to Brev. Follow the instructions for running Jupyter notebooks in a cloud-based environment based on the hardware requirements specified in the launchable.