Context-Aware RAG#

The Context-Aware RAG (CA-RAG) in VSS is responsible for video search and summarization based on the dense captions generated by the data processing pipeline. It implements the following data pipelines:

- Data processing
- Data ingestion
- Data retrieval

All of these pipelines are implemented for the features below (see the structural sketch after the list):

- Summarization
- Q&A
- Alerts
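
The sketch below shows, under assumed names, how the three stages might compose for each feature. The `DataPipeline` protocol and its method names are hypothetical illustrations, not CA-RAG's actual interfaces.

```python
from typing import Protocol

class DataPipeline(Protocol):
    """Hypothetical three-stage interface; names are illustrative only."""

    def process(self, captions: list[str]) -> list[str]:
        """Data processing: transform dense VLM captions (e.g., chunking)."""
        ...

    def ingest(self, records: list[str]) -> None:
        """Data ingestion: store processed records in the backing store."""
        ...

    def retrieve(self, query: str) -> list[str]:
        """Data retrieval: fetch the records most relevant to a query."""
        ...

def run(pipeline: DataPipeline, captions: list[str], query: str) -> list[str]:
    """Chain the three stages, as summarization, Q&A, and alerts each do."""
    pipeline.ingest(pipeline.process(captions))
    return pipeline.retrieve(query)
```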
Summarization#
CA-RAG provides the following methods for summarizing content. By default, the batch summarization method is enabled.

- Refine: This recursive approach generates the summary incrementally by incorporating new captions into the previous summary. It is recommended for processing streaming captions.
- Batch: This method performs summarization in two stages and is ideal for handling large videos (see the sketch after this list):
  - Batching: Divides content into smaller chunks (defined by `batch_size`) and generates a summary for each chunk.
  - Aggregation: Combines the batch summaries using a secondary prompt (`summary_aggregation`).
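
A minimal sketch of the two strategies, assuming a generic `llm(prompt) -> str` completion helper; the actual CA-RAG prompts, function names, and batching logic are not shown in this document.

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to the configured LLM."""
    raise NotImplementedError

def refine_summarize(captions: list[str]) -> str:
    """Refine: fold each new caption into the running summary."""
    summary = ""
    for caption in captions:
        summary = llm(
            f"Current summary:\n{summary}\n\n"
            f"New caption:\n{caption}\n\n"
            "Update the summary to incorporate the new caption."
        )
    return summary

def batch_summarize(captions: list[str], batch_size: int) -> str:
    """Batch: summarize fixed-size chunks, then aggregate the chunk summaries."""
    batch_summaries = []
    for i in range(0, len(captions), batch_size):
        chunk = "\n".join(captions[i : i + batch_size])
        batch_summaries.append(llm(f"Summarize these captions:\n{chunk}"))
    # Aggregation stage: a secondary prompt (cf. `summary_aggregation`)
    # combines the per-batch summaries into the final summary.
    return llm(
        "Combine these partial summaries into one summary:\n"
        + "\n".join(batch_summaries)
    )
```

Refine maintains a single running summary, which fits streaming input; batch processes chunks independently, which is why it scales well to large videos.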
Question and Answer#
CA-RAG supports Question-Answering (QnA) functionality via `VectorRAG` and `GraphRAG`. `VectorRAG` is the only supported method for live stream processing; `GraphRAG` is specifically designed for video-based queries. `GraphRAG` or `VectorRAG` can be selected through the CA-RAG Configuration.
**VectorRAG for Live Streaming**

- Captions generated by the Vision-Language Model (VLM), along with their embeddings, are stored in Milvus DB. Embeddings are created using `nvidia/nv-embedqa-e5-v5`.
- For a query, the top five most similar chunks are retrieved, re-ranked using `nvidia/nv-rerankqa-mistral-4b-v3`, and passed to a Large Language Model (LLM) to generate the final answer, as sketched below.
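
An illustrative sketch of that retrieve, re-rank, and answer flow follows. The `embed`, `rerank_score`, and `llm` helpers are hypothetical stand-ins for the embedder, reranker, and LLM named above, and an in-memory cosine search stands in for the actual Milvus query.

```python
import numpy as np

def embed(text: str) -> np.ndarray: ...                    # stand-in for nv-embedqa-e5-v5
def rerank_score(query: str, passage: str) -> float: ...   # stand-in for nv-rerankqa-mistral-4b-v3
def llm(prompt: str) -> str: ...                           # stand-in for the configured LLM

def answer(query: str, captions: list[str], embeddings: np.ndarray) -> str:
    # Stage 1: cosine-similarity search for the top five chunks.
    q = embed(query)
    sims = embeddings @ q / (np.linalg.norm(embeddings, axis=1) * np.linalg.norm(q))
    top5 = [captions[i] for i in np.argsort(sims)[::-1][:5]]
    # Stage 2: re-rank the candidates with a cross-encoder style scorer.
    reranked = sorted(top5, key=lambda c: rerank_score(query, c), reverse=True)
    # Stage 3: pass the re-ranked context to the LLM for the final answer.
    context = "\n".join(reranked)
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```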
**GraphRAG for Videos**

- Graph Extraction: Entities and relationships are extracted from the VLM captions using an LLM and stored in a GraphDB. Captions and embeddings, generated with `nvidia/nv-embedqa-e5-v5`, are also linked to these entities.
- Graph Retrieval: For a given query, relevant entities, relationships, and captions are retrieved from the GraphDB and passed to an LLM to generate the final answer (see the sketch after this list).
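
A toy sketch of the two stages, assuming a hypothetical `llm` helper and a plain dictionary in place of a real GraphDB; the JSON triple format and the keyword-based retrieval are illustrative simplifications (embedding lookups are omitted).

```python
import json

def llm(prompt: str) -> str: ...  # placeholder for the configured LLM

def extract_graph(captions: list[str]) -> dict:
    """Graph Extraction: ask the LLM for (subject, relation, object) triples."""
    graph = {"triples": [], "captions_by_entity": {}}
    for caption in captions:
        reply = llm(
            "Extract entities and relationships as JSON triples "
            f'[["subject", "relation", "object"], ...] from:\n{caption}'
        )
        for subj, rel, obj in json.loads(reply):
            graph["triples"].append((subj, rel, obj))
            # Link the source caption to every entity it mentions.
            for entity in (subj, obj):
                graph["captions_by_entity"].setdefault(entity, []).append(caption)
    return graph

def graph_answer(query: str, graph: dict) -> str:
    """Graph Retrieval: gather triples and captions whose entities appear in the query."""
    hits = [t for t in graph["triples"] if t[0] in query or t[2] in query]
    captions = {
        c
        for s, _, o in hits
        for c in graph["captions_by_entity"][s] + graph["captions_by_entity"][o]
    }
    context = "\n".join(f"{s} {r} {o}" for s, r, o in hits) + "\n" + "\n".join(captions)
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```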
Alerts#
The Alerts feature allows event-based notifications to be configured in `config/config.yaml`. For each VLM caption, an LLM evaluates the defined event criteria and generates an alert whenever one is met, as illustrated in the sketch below.
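
As a hedged illustration of that per-caption check, assuming a hypothetical `llm` helper and a plain list of event names in place of the actual `config/config.yaml` alert schema:

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to the configured LLM."""
    raise NotImplementedError

def check_alerts(caption: str, events: list[str]) -> list[str]:
    """Return the configured events the LLM judges to be present in a caption."""
    triggered = []
    for event in events:
        verdict = llm(
            f"Caption: {caption}\n"
            f'Does this caption describe the event "{event}"? Answer yes or no.'
        )
        if verdict.strip().lower().startswith("yes"):
            triggered.append(event)
    return triggered

# Example (hypothetical events): check one incoming VLM caption.
# check_alerts("A person climbs over the fence", ["intrusion", "fire"])
```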