Overview of NVIDIA NeMo Retriever
NVIDIA NeMo Retriever is a collection of microservices, built with NVIDIA NIM, for creating and scaling multimodal data extraction, embedding, and reranking pipelines with high accuracy and maximum data privacy.
NeMo Retriever provides the following:
- Multimodal Data Extraction: Quickly extract content at scale from documents that include text, tables, charts, and infographics.
- Embedding + Indexing: Embed all text extracted from text chunks and images, and then insert the embeddings into Milvus, accelerated with NVIDIA cuVS.
- Retrieval: Use semantic and hybrid search for high-accuracy retrieval with the embedding and reranking NIM microservices.
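The three stages above can be illustrated with a small, self-contained sketch. It does not use NeMo Retriever, NIM, or Milvus APIs: the bag-of-words `embed` function stands in for an embedding model, the in-memory list stands in for a vector database, and the `alpha`-weighted rerank stands in for a reranking model. All function names and parameters here are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def build_index(chunks):
    """Embedding + indexing: store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, query, top_k=2, candidates=10, alpha=0.5):
    """Retrieve by vector similarity, then rerank with a hybrid score."""
    qv = embed(query)
    # Stage 1: similarity search over the stored vectors.
    stage1 = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    stage1 = stage1[:candidates]

    # Stage 2: rerank candidates by blending vector and keyword scores.
    def hybrid(item):
        chunk, vec = item
        return alpha * cosine(qv, vec) + (1 - alpha) * keyword_score(query, chunk)

    reranked = sorted(stage1, key=hybrid, reverse=True)
    return [chunk for chunk, _ in reranked[:top_k]]


# Chunks as they might come out of an extraction step (made-up examples).
chunks = [
    "quarterly revenue table for 2023",
    "chart of GPU throughput over time",
    "employee onboarding checklist",
]
index = build_index(chunks)
results = search(index, "GPU throughput chart")
print(results[0])
```

In a real deployment, each stand-in would be replaced by the corresponding service; the point of the sketch is only the ordering of the stages: extract, embed and index, retrieve candidates, then rerank.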
Enterprise-Ready Features
NVIDIA NeMo Retriever comes with enterprise-ready features, including the following:
- High Accuracy: NeMo Retriever retrieves content from enterprise documents with a high level of accuracy across modalities.
- High Throughput: NeMo Retriever can extract, embed, index, and retrieve across hundreds of thousands of documents with high throughput.
- Decomposable/Customizable: NeMo Retriever consists of modules that can be used and deployed separately in your own environment.
- Enterprise-Grade Security: NeMo Retriever NIMs come with security features such as the use of safetensors, continuous patching of CVEs, and more.
Applications
The following are some applications that use NVIDIA NeMo Retriever:
- Document Research Assistant for Blog Creation (LlamaIndex Jupyter Notebook)
- Digital Human for Customer Service (NVIDIA AI Blueprint)
- AI Virtual Assistant for Customer Service (NVIDIA AI Blueprint)
- Building Code Documentation Agents with CrewAI (CrewAI Demo)
- Video Search and Summarization (NVIDIA AI Blueprint)