
Overview of NVIDIA NeMo Retriever

NVIDIA NeMo Retriever is a collection of microservices, built with NVIDIA NIM, for building and scaling multimodal data extraction, embedding, and reranking pipelines with high accuracy and maximum data privacy.

NeMo Retriever provides the following:

  • Multimodal Data Extraction — Quickly extract content at scale from documents that include text, tables, charts, and infographics.
  • Embedding + Indexing — Embed the text extracted from text chunks and images, and then insert the embeddings into Milvus, accelerated with NVIDIA cuVS.
  • Retrieval — Use semantic and hybrid search for high-accuracy retrieval with the embedding and reranking NIM microservices.
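
The three stages above (extract, embed and index, retrieve and rerank) can be sketched conceptually in plain Python. This is only an illustration of the data flow, not the NeMo Retriever API: every function name here is hypothetical, the bag-of-words "embedding" and term-overlap "reranker" are toy stand-ins for the NIM models, and an in-memory list stands in for Milvus.

```python
import math
from collections import Counter

# All names below are hypothetical stand-ins, not NeMo Retriever calls.

def embed(text):
    """Toy bag-of-words vector (stand-in for an embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks):
    """Embed each extracted chunk and store it (stand-in for a vector DB)."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query, top_k=2):
    """First-stage retrieval: rank stored chunks by embedding similarity."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def rerank(query, candidates):
    """Second stage: reorder candidates by the share of query terms they contain."""
    q_terms = set(query.lower().split())
    def score(chunk):
        return len(q_terms & set(chunk.lower().split())) / len(q_terms)
    return sorted(candidates, key=score, reverse=True)

# Chunks as they might look after extraction from documents (made-up text).
chunks = [
    "revenue grew in the third quarter",
    "the chart shows gpu throughput",
    "table of employee headcount by region",
]
index = build_index(chunks)
query = "gpu throughput chart"
results = rerank(query, retrieve(index, query, top_k=2))
print(results[0])
```

The two-stage shape is the point of the sketch: a cheap similarity search narrows the candidate set, and a separate scoring pass reorders only those candidates.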

Overview diagram

Enterprise-Ready Features

NVIDIA NeMo Retriever comes with enterprise-ready features, including the following:

  • High Accuracy — NeMo Retriever retrieves with high accuracy across the modalities found in enterprise documents.
  • High Throughput — NeMo Retriever can extract, embed, index, and retrieve across hundreds of thousands of documents with high throughput.
  • Decomposable/Customizable — NeMo Retriever consists of modules that you can use and deploy separately in your own environment.
  • Enterprise-Grade Security — NeMo Retriever NIMs come with security features such as the use of safetensors, continuous patching of CVEs, and more.

Applications

The following are some applications that use NVIDIA NeMo Retriever: