Release Notes for NeMo Retriever Library

This documentation contains the release notes for NeMo Retriever Library.

26.05 Release Notes (26.5.0)

NVIDIA® NeMo Retriever Library version 26.05 builds on the 26.03 foundation with a graph-based ingest architecture, expanded multimodal and tabular capabilities, production-oriented service deployment, and documentation aligned to a Helm-first supported path.

To upgrade the Helm charts for this release, refer to the NeMo Retriever Library Helm Charts.

Highlights for the 26.05 release include:

Upgrade notes

  • Text splitting for graph and library ingest moved into .extract(split_config=...) instead of standalone .split() on the graph ingest path (the service ingestor API may still expose .split() separately)
  • Direct Retriever(...) construction uses vdb_kwargs, embed_kwargs, and rerank instead of flat lancedb_uri, lancedb_table, embedder, embedding_endpoint, local_query_embed_backend, and reranker arguments
  • For Helm audio and video extraction, set service.installFfmpeg: true in values.yaml (or pass --set service.installFfmpeg=true), because images no longer bundle ffmpeg and ffprobe by default
  • nemo_retriever requires Python 3.12
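
As a rough illustration of the Retriever(...) argument change noted above, the following sketch regroups the old flat keyword arguments into the new grouped form. Only the parameter names vdb_kwargs, embed_kwargs, and rerank and the six flat argument names come from these release notes. The keys used inside each group (uri, table_name, model_name, endpoint, backend), the pass-through of the reranker value to rerank, and the helper itself are assumptions made for illustration, so check the Retriever API reference before relying on them.

```python
from typing import Any

# Flat pre-26.05 argument -> (26.05 group, key inside that group).
# The group names come from the release notes; the inner keys are
# HYPOTHETICAL placeholders, not confirmed library API.
_FLAT_TO_GROUPED = {
    "lancedb_uri": ("vdb_kwargs", "uri"),
    "lancedb_table": ("vdb_kwargs", "table_name"),
    "embedder": ("embed_kwargs", "model_name"),
    "embedding_endpoint": ("embed_kwargs", "endpoint"),
    "local_query_embed_backend": ("embed_kwargs", "backend"),
}


def migrate_retriever_kwargs(old: dict[str, Any]) -> dict[str, Any]:
    """Return a new kwargs dict that uses the grouped 26.05 layout."""
    new: dict[str, Any] = {}
    for name, value in old.items():
        if name in _FLAT_TO_GROUPED:
            group, key = _FLAT_TO_GROUPED[name]
            # Create the group dict on first use, then store the value in it.
            new.setdefault(group, {})[key] = value
        elif name == "reranker":
            # Assumption: the old value can be handed to rerank unchanged.
            new["rerank"] = value
        else:
            # Arguments unrelated to the rename pass through untouched.
            new[name] = value
    return new


if __name__ == "__main__":
    legacy = {
        "lancedb_uri": "./lancedb",
        "lancedb_table": "docs",
        "embedding_endpoint": "http://localhost:8000/v1",
        "reranker": "example-reranker",
    }
    print(migrate_retriever_kwargs(legacy))
```

A helper like this is mainly useful for finding call sites that still pass the flat arguments. The input dict is left unmodified, so it can be run against existing configuration without side effects.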

Pipeline and ingestion

  • Legacy nv-ingest code paths removed; graph_pipeline and the graph stage registry are the canonical ingestion path
  • Manifest-based ingest routing replaces input-type routing; retriever ingest is input-aware for PDF, image, audio, video, text, HTML, DOCX/PPTX, SVG, and related types
  • allow_no_gpu option to skip GPU requirement during ingest for CPU-only experimentation

CLI

  • Root CLI adds retriever ingest and retriever query with NIM URL flags, batch tuning, and LanceDB overwrite/append controls, plus retriever pipeline for graph execution
  • For product use, only retriever ingest, retriever query, and retriever pipeline (for example, retriever pipeline run) are supported; other top-level subcommands (including pdf, html, eval, benchmark, harness, online, compare, image, and skill-eval) are for development and experimental use only
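
The supported/experimental split above can be enforced in automation with a small guard. This is a minimal sketch, not part of the library: the subcommand names are taken from these release notes, while the helper function and its return labels are hypothetical. Because the notes list the experimental subcommands with "including", the list may not be exhaustive, so anything unrecognized is reported as unknown rather than supported.

```python
# Subcommand names come from the release notes; the helper is hypothetical.
SUPPORTED = frozenset({"ingest", "query", "pipeline"})
EXPERIMENTAL = frozenset({
    "pdf", "html", "eval", "benchmark", "harness",
    "online", "compare", "image", "skill-eval",
})


def classify_subcommand(argv: list[str]) -> str:
    """Classify a `retriever ...` command line by its top-level subcommand."""
    # Expect argv such as ["retriever", "ingest", ...]; the subcommand is
    # the second token. Anything else cannot be classified.
    if len(argv) < 2 or argv[0] != "retriever":
        return "unknown"
    sub = argv[1]
    if sub in SUPPORTED:
        return "supported"
    if sub in EXPERIMENTAL:
        return "experimental"
    return "unknown"


if __name__ == "__main__":
    for cmd in (["retriever", "pipeline", "run"], ["retriever", "benchmark"]):
        print(" ".join(cmd), "->", classify_subcommand(cmd))
```

A CI job could call classify_subcommand on each scripted command and fail the build when a production script depends on anything other than a supported subcommand.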

Retriever Service and deployment

  • Retriever Service v2 adds a scalable multi-pod architecture with gateway, process isolation, and VectorDB integration
  • Basic OpenTelemetry support for pipeline and service observability
  • Expanded air-gapped deployment guidance in deployment options and the Helm chart README

Models, OCR, and captioning

  • Nemotron OCR v2 is the default OCR engine for HuggingFace, with CLI language selectors and unified OCR actors. For Helm NIM deployments, Nemotron OCR v1 is the default.
  • Nemotron Parse is available as an alternate PDF extraction method (v1.2 HTTP interface; optional Helm NIM; local inference via vLLM where configured)
  • VLM image captioning via vLLM (including Omni caption model profiles) addresses the capability deferred in 26.03
  • vLLM-backed text and vision-language embedders, multimodal VL reranker, and torch 2.11 for local GPU installs

Multimodal extraction

  • Video retrieval pipeline with frame extraction, OCR, audio-visual fusion, and text deduplication
  • Long-audio Parakeet chunking with time-aligned segments; punctuation-based audio segmenting; ASR batch/streaming improvements

Retrieval and RAG

  • Live RAG SDK with Retriever.retrieve(), reference answer generation via Retriever.answer(), and optional batch operator graphs via LiteLLM ([llm] extra)

Vector database

  • Vector database operators integrated directly in the pipeline; custom metadata support; LanceDB hybrid search guidance updated
  • LanceDB is documented as the first-party vector path for new deployments; Milvus/MinIO guidance removed from the primary extraction doc set

Evaluation

  • BEIR-centric evaluation overhaul and retriever skill-eval benchmark CLI for the NeMo Retriever skill (experimental)

  • Text-to-SQL agent graph and tabular tooling for structured data retrieval, including tabular data ingestion

Packaging and platform

  • Optional install extras ([local], [multimedia], [llm], [tabular], [nemotron-parse], [service], and others), including slim remote/NIM-only installs on Mac and Windows

Helm chart

  • Helm chart refresh under nemo_retriever/helm/ with GA VL embedder defaults and optional Nemotron Parse and Omni caption NIMs

Documentation

  • Documentation aligned to a Helm-first supported path; Docker Compose for local development documented as unsupported developer tooling (not a production NIM deployment path)
  • Documentation consolidates extraction concepts, ingest workflow, embeddings, audio/video guides, prerequisites and support matrix, and UDF/custom stages in the graph README

Release Notes for Previous Versions

| 26.03 | 26.1.2 | 26.1.1 | 25.9.0 | 25.6.3 | 25.6.2 | 25.4.2 | 25.3.0 | 24.12.1 | 24.12.0