Support Matrix for NeMo Retriever Library
Before you begin using NeMo Retriever Library, ensure that you have the hardware for your use case.
Note
NVIDIA Ingest (nv-ingest) has been renamed NeMo Retriever Library.
Core and Advanced Pipeline Features
The Nemo Retriever Library extraction core pipeline features run on a single A10G or better GPU. The core pipeline features include the following:
- llama-nemotron-embed-1b-v2 — Embedding model for converting text chunks into vectors.
- nemotron-page-elements-v3 — Detects and classifies images on a page as a table, chart or infographic.
- nemotron-table-structure-v1 — Detects rows, columns, and cells within a table to preserve table structure and convert to Markdown format.
- nemotron-graphic-elements-v1 — Detects graphic elements within chart images such as titles, legends, axes, and numerical values.
- nemotron-ocr-v1 — Image OCR model to detect and extract text from images.
- retrieval — Enables embedding and indexing into Milvus.
Advanced features require additional GPU support and disk space. This includes the following:
- Audio extraction - parakeet-1-1b-ctc-en-us — Use the Parakeet CTC English (en-US) ASR NIM (
nvcr.io/nim/nvidia/parakeet-1-1b-ctc-en-us) for processing audio files. For more information, refer to Audio Processing. - Advanced visual parsing — Use nemotron-parse, which adds state-of-the-art text and table extraction. For more information, refer to Advanced Visual Parsing .
-
VLM — Use nemotron-nano-12b-v2-vl for experimental image captioning of unstructured images.
Note
While nemotron-nano-12b-v2-vl is the default VLM, you can configure and use other vision language models for image captioning based on your specific use case requirements. For more information, refer to Extract Captions from Images.
-
Reranker — Use llama-3.2-nv-rerankqa-1b-v2 for improved retrieval accuracy.
Hardware Requirements
NeMo Retriever Library supports the following GPU hardware.
- RTX Pro 6000 Blackwell Server Edition
- DGX B200
- H200 NVL
- H100 Tensor Core GPU
- A100 Tensor Core GPU
- A10G Tensor Core GPU
- L40S
- RTX PRO 4500 Blackwell
The following are the hardware requirements to run NeMo Retriever Library.
| Feature | GPU Option | RTX Pro 6000 | B200 | H200 NVL | H100 | A100 80GB | A100 40GB | A10G | L40S | RTX PRO 4500 Blackwell |
|---|---|---|---|---|---|---|---|---|---|---|
| GPU | Memory | 96GB | 180GB | 141GB | 80GB | 80GB | 40GB | 24GB | 48GB | 32GB GDDR7 (GB203) |
| Core Features | Total GPUs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Core Features | Total Disk Space | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB | ~150GB |
| Audio (parakeet-1-1b-ctc-en-us) | Additional Dedicated GPUs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1¹ |
| Audio (parakeet-1-1b-ctc-en-us) | Additional Disk Space | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB | ~37GB¹ |
| nemotron-parse | Additional Dedicated GPUs | Not supported | Not supported | Not supported | 1 | 1 | 1 | 1 | 1 | Not supported² |
| nemotron-parse | Additional Disk Space | Not supported | Not supported | Not supported | ~16GB | ~16GB | ~16GB | ~16GB | ~16GB | Not supported² |
| VLM | Additional Dedicated GPUs | 1 | 1 | 1 | 1 | 1 | Not supported | Not supported | 1 | Not supported³ |
| VLM | Additional Disk Space | ~16GB | ~16GB | ~16GB | ~16GB | ~16GB | Not supported | Not supported | ~16GB | Not supported³ |
| Reranker | With Core Pipeline | Yes | Yes | Yes | Yes | Yes | No* | No* | No* | No* |
| Reranker | Standalone (recall only) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
¹ Audio runs but requires runtime engine build — no pre-defined model profile.
² Nemotron Parse fails to start on 32GB, pending engineering investigation.
³ VLM fails to load on 32GB, 32GB is below the minimum threshold.
* GPUs with less than 80GB VRAM cannot run the reranker concurrently with the core pipeline. To perform recall testing with the reranker on these GPUs, shut down the core pipeline NIM microservices and run only the embedder, reranker, and your vector database.