Release Notes for NeMo Retriever Library
This documentation contains the release notes for NeMo Retriever Library.
26.05 Release Notes (26.5.0)
NVIDIA® NeMo Retriever Library version 26.05 builds on the 26.03 foundation with a graph-based ingest architecture, expanded multimodal and tabular capabilities, production-oriented service deployment, and documentation aligned to a Helm-first supported path.
To upgrade the Helm charts for this release, refer to the NeMo Retriever Library Helm Charts.
Highlights for the 26.05 release include:
Upgrade notes
- Text splitting for graph and library ingest moved into
.extract(split_config=...)instead of standalone.split()on the graph ingest path (the service ingestor API may still expose.split()separately) - Direct
Retriever(...)construction usesvdb_kwargs,embed_kwargs, andrerankinstead of flatlancedb_uri,lancedb_table,embedder,embedding_endpoint,local_query_embed_backend, andrerankerarguments - For Helm audio and video extraction, set
service.installFfmpeg: trueinvalues.yaml(or pass--set service.installFfmpeg=true) when images no longer bundleffmpegandffprobeby default nemo_retrieverrequires Python 3.12
Pipeline and ingestion
- Legacy
nv-ingestcode paths removed;graph_pipelineand the graph stage registry are the canonical ingestion path - Manifest-based ingest routing replaces input-type routing;
retriever ingestis input-aware for PDF, image, audio, video, text, HTML, DOCX/PPTX, SVG, and related types allow_no_gpuoption to skip GPU requirement during ingest for CPU-only experimentation
CLI
- Root CLI adds
retriever ingestandretriever querywith NIM URL flags, batch tuning, and LanceDB overwrite/append controls, plusretriever pipelinefor graph execution - For product use, only
retriever ingest,retriever query, andretriever pipeline(for exampleretriever pipeline run) are supported; other top-level subcommands—includingpdf,html,eval,benchmark,harness,online,compare,image, andskill-eval—are development and experimental
Retriever Service and deployment
- Retriever Service v2 adds a scalable multi-pod architecture with gateway, process isolation, and VectorDB integration
- OpenTelemetry basic support for pipeline and service observability
- Expanded air-gapped deployment guidance in deployment options and the Helm chart README
Models, OCR, and captioning
- Nemotron OCR v2 is the default OCR engine for HuggingFace, with CLI language selectors and unified OCR actors. For Helm NIM deployments, Nemotron OCR v1 is the default.
- Nemotron Parse is available as an alternate PDF extraction method (v1.2 HTTP interface; optional Helm NIM; local inference via vLLM where configured)
- VLM image captioning via vLLM (including Omni caption model profiles) addresses the capability deferred in 26.03
- vLLM-backed text and vision-language embedders, multimodal VL reranker, and torch 2.11 for local GPU installs
Multimodal extraction
- Video retrieval pipeline with frame extraction, OCR, audio-visual fusion, and text deduplication
- Long-audio Parakeet chunking with time-aligned segments; punctuation-based audio segmenting; ASR batch/streaming improvements
Retrieval and RAG
- Live RAG SDK with
Retriever.retrieve(), reference answer generationRetriever.answer(), and optional batch operator graphs via LiteLLM ([llm]extra)
Vector database
- Vector database operators integrated directly in the pipeline; custom metadata support; LanceDB hybrid search guidance updated
- LanceDB is documented as the first-party vector path for new deployments; Milvus/MinIO guidance removed from the primary extraction doc set
Evaluation
-
BEIR-centric evaluation overhaul and
retriever skill-evalbenchmark CLI for the NeMo Retriever skill (experimental) -
Text-to-SQL agent graph and tabular tooling for structured data retrieval, including tabular data ingestion
Packaging and platform
- Optional install extras (
[local],[multimedia],[llm],[tabular],[nemotron-parse],[service], and others), including slim remote/NIM-only installs on Mac and Windows
Helm chart
- Helm chart refresh under
nemo_retriever/helm/with GA VL embedder defaults and optional Nemotron Parse and Omni caption NIMs
Documentation
- Documentation aligned to a Helm-first supported path; Docker Compose for local development documented as unsupported developer tooling (not a production NIM deployment path)
- Documentation consolidates extraction concepts, ingest workflow, embeddings, audio/video guides, prerequisites and support matrix, and UDF/custom stages in the graph README
Release Notes for Previous Versions
| 26.03 | 26.1.2 | 26.1.1 | 25.9.0 | 25.6.3 | 25.6.2 | 25.4.2 | 25.3.0 | 24.12.1 | 24.12.0