Support Matrix for NeMo Retriever Extraction
Before you begin using NeMo Retriever extraction, ensure that your hardware meets the requirements for your use case.
Hardware
GPU | Family | Memory | Minimum GPUs |
---|---|---|---|
H100 | SXM or PCIe | 80GB | 1 |
A100 | SXM or PCIe | 80GB | 1 |
A10G | — | 24GB | 1 |
L40S | — | 48GB | 1 |
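
As a rough illustration of how the table above could be checked programmatically, the following sketch tests a GPU name and its reported memory against the supported models. The function name `is_supported_gpu` and the 10% memory tolerance are assumptions made for this example and are not part of NeMo Retriever extraction. The name and memory values could come, for instance, from `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits`, which reports memory in MiB.

```python
# Nominal memory (GB) per supported GPU model, copied from the table above.
SUPPORTED_GPUS = {"H100": 80, "A100": 80, "A10G": 24, "L40S": 48}

# Assumption: drivers usually report slightly less than the nominal size,
# so accept anything within 10% of it.
TOLERANCE = 0.9


def is_supported_gpu(name: str, memory_mib: int) -> bool:
    """Return True if the GPU name and memory match a row of the table."""
    upper_name = name.upper()
    for model, nominal_gb in SUPPORTED_GPUS.items():
        if model in upper_name:
            # Convert nominal GB to MiB (treating 1 GB as 1024 MiB here).
            return memory_mib >= TOLERANCE * nominal_gb * 1024
    return False
```

With this sketch, an 80GB A100 passes while a 40GB A100 does not, because only the 80GB variant appears in the table.
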
The core pipeline requires approximately 150GB of disk space. To run the core pipeline and all optional features, you need approximately 210GB of disk space.
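
The disk figures above can be checked before installation with a short script. This is a minimal sketch using only the Python standard library. The helper names (`required_disk_gb`, `has_enough_disk`, `check_path`) are hypothetical, and the script assumes decimal gigabytes (10^9 bytes), which the source does not specify.

```python
import shutil

GB = 10**9  # assumption: decimal gigabytes

CORE_DISK_GB = 150          # core pipeline only
ALL_FEATURES_DISK_GB = 210  # core pipeline plus all optional features


def required_disk_gb(all_features: bool = False) -> int:
    """Return the approximate disk requirement in GB."""
    return ALL_FEATURES_DISK_GB if all_features else CORE_DISK_GB


def has_enough_disk(free_bytes: int, all_features: bool = False) -> bool:
    """Compare a free-space figure in bytes against the requirement."""
    return free_bytes >= required_disk_gb(all_features) * GB


def check_path(path: str = ".", all_features: bool = False) -> bool:
    """Check the free space on the filesystem that contains `path`."""
    free_bytes = shutil.disk_usage(path).free
    return has_enough_disk(free_bytes, all_features)
```

For example, `check_path("/var/lib/docker", all_features=True)` would check the volume that holds that directory, assuming that is where container images and model files are stored on your system.
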
Advanced Feature Support
Some advanced features, such as VLM integrations and audio extraction, require additional GPU support and disk space. For more information, refer to Profile Information.
VLM Integrations
- nemoretriever-parse VLM NIM: NeMo Retriever is compatible with the nemoretriever-parse VLM NIM, which adds state-of-the-art text and table extraction. To integrate this NIM into the nv-ingest pipeline, you need 1 additional GPU (H100, A100, A10G, or L40S). nemoretriever-parse requires approximately 16GB of additional disk space.
- Llama3.2 Vision VLM NIMs: NeMo Retriever is compatible with the Llama3.2 Vision VLM NIMs for image captioning. To integrate these NIMs into the nv-ingest pipeline, you need 1 additional GPU (H100, A100, A10G, or L40S). Image captioning requires approximately 16GB of additional disk space.
Audio Extraction (Early Access)
- RIVA NIM: NeMo Retriever can retrieve across audio files by using the RIVA NIMs. To integrate this capability into the nv-ingest pipeline, you need 1 additional GPU (H100, A100, A10G, or L40S). Audio extraction requires approximately 37GB of additional disk space.
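
To plan capacity for a particular combination of advanced features, the per-feature figures in this section can be added up. The sketch below does that arithmetic. The feature keys and the `additional_requirements` function are hypothetical labels chosen for this example, and the sketch assumes that each enabled feature needs its own additional GPU and that the disk figures are additive, which the source implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    extra_gpus: int
    extra_disk_gb: int  # approximate, as stated in this section


# Hypothetical keys; the values are taken from the feature descriptions above.
FEATURES = {
    "nemoretriever-parse": Feature(extra_gpus=1, extra_disk_gb=16),
    "llama3.2-vision": Feature(extra_gpus=1, extra_disk_gb=16),
    "audio": Feature(extra_gpus=1, extra_disk_gb=37),
}


def additional_requirements(selected):
    """Return (extra GPUs, extra disk in GB) for the selected features."""
    gpus = sum(FEATURES[name].extra_gpus for name in selected)
    disk_gb = sum(FEATURES[name].extra_disk_gb for name in selected)
    return gpus, disk_gb
```

Enabling all three features under these assumptions adds 3 GPUs and about 69GB of disk space on top of the core pipeline. That is slightly more than the difference between the 150GB and 210GB figures in the Hardware section, so treat all of these numbers as approximate.
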