Vector databases

Use this documentation to learn how NeMo Retriever Library stores extracted embeddings and uploads data to vector databases.

Overview

NeMo Retriever Library extracts text representations from various forms of content and ingests them into a vector database. LanceDB is the vector database backend for storing and retrieving the extracted embeddings.

The data upload task (vdb_upload) pulls extraction results to the Python client, and then pushes them to LanceDB (embedded, in-process).

The vector database stores only the extracted text representations of ingested data. It does not store the embeddings for images.

Storing Extracted Images

To persist extracted images, tables, and chart renderings to disk or object storage, use the store task in addition to vdb_upload. The store task supports any fsspec-compatible backend (local filesystem, S3, GCS, and other object stores). For details, refer to Store Extracted Images.

NeMo Retriever Library supports uploading data by using the Ingestor.vdb_upload API. Currently, data upload is not supported through the CLI.

Why LanceDB?

LanceDB is optimized for low-latency retrieval in this stack:

  • Lance columnar format: Data is stored in Lance files, an Arrow/Parquet-style analytics layout optimized for fast local scans and indexed retrieval. This reduces serialization overhead compared with a separate database server.
  • IVF_HNSW_SQ index: Vectors are scalar-quantized (SQ) within an IVF-HNSW index, which compresses them for faster search at a lower memory bandwidth cost.
  • Embedded runtime: LanceDB runs in-process, so you do not run extra vector-database containers for the default path. This leaves fewer moving parts to start, configure, and maintain.

This combination of file format, index strategy, and in-process runtime supports the latency characteristics described in benchmarks.
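To make the scalar quantization (SQ) idea concrete, the following is a minimal sketch in plain Python. It is a simplified illustration, not LanceDB's implementation: the helper names sq_encode and sq_decode are hypothetical, and the sketch assumes 8-bit codes scaled to a single vector's own min/max range, whereas a real index typically learns its ranges from the stored data.

```python
from typing import List, Tuple


def sq_encode(vector: List[float]) -> Tuple[bytes, float, float]:
    """Map each float to an 8-bit code over the vector's min/max range."""
    lo, hi = min(vector), max(vector)
    # A constant vector has zero range; fall back to 1.0 to avoid dividing by zero.
    scale = (hi - lo) / 255 or 1.0
    codes = bytes(round((x - lo) / scale) for x in vector)
    return codes, lo, scale


def sq_decode(codes: bytes, lo: float, scale: float) -> List[float]:
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


vector = [0.12, -0.53, 0.98, 0.0, -1.0, 0.47]
codes, lo, scale = sq_encode(vector)
restored = sq_decode(codes, lo, scale)

# Each component now takes one byte; the reconstruction error is at most half a step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(len(codes), max_error <= scale / 2 + 1e-9)
```

The trade-off shown here is the general one behind SQ: each component is stored in one byte instead of a wider float, at the cost of a small, bounded reconstruction error.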

Upload to LanceDB

Uploads to LanceDB use the LanceDB operator class from the client library. You configure the operator through the Python API.

Programmatic API (Python)

Pass vdb_op="lancedb" to vdb_upload, or construct a LanceDB instance and pass it as vdb_op:

from nemo_retriever.vdb.lancedb import LanceDB

vdb = LanceDB(
    uri="./lancedb_data",    # Path to LanceDB database directory
    table_name="nemo-retriever",  # Table name
    index_type="IVF_HNSW_SQ",  # Index type (default)
)

# Ingest: `results` holds the extraction results pulled to the Python client
vdb.run(results)

# Retrieve: `queries` holds precomputed query vectors
docs = vdb.retrieval(queries, top_k=10)

Query ingested tables with LanceDB.retrieval() (precomputed vectors) or with Retriever.query (embeds the query string for you). Optional where predicates and client-side filters are documented under Metadata and filtering.

When you use the Ingestor with vdb_upload, pass vdb_op="lancedb" or a LanceDB instance so that uploads target LanceDB. If you omit vdb_op, the ingestion Python client still defaults the string argument to "milvus" for backward compatibility, and that default does not select the LanceDB operator. Always pass vdb_op="lancedb" when you intend LanceDB.
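One way to avoid relying on the backward-compatible default is to validate the argument before it reaches vdb_upload. The sketch below is an assumption-laden illustration: require_explicit_vdb_op is a hypothetical helper, not part of NeMo Retriever Library, and it only checks the value you are about to pass.

```python
from typing import Any, Optional


def require_explicit_vdb_op(vdb_op: Optional[Any] = None) -> Any:
    """Hypothetical guard: reject an omitted or non-LanceDB string vdb_op."""
    if vdb_op is None:
        raise ValueError('vdb_op was omitted; pass vdb_op="lancedb" or a LanceDB instance')
    if isinstance(vdb_op, str) and vdb_op != "lancedb":
        raise ValueError(f'expected vdb_op="lancedb", got {vdb_op!r}')
    # Operator instances (for example, a LanceDB object) pass through unchanged.
    return vdb_op


checked = require_explicit_vdb_op("lancedb")
print(checked)
```

You would then pass the checked value as the vdb_op argument of vdb_upload, so that a forgotten argument fails loudly instead of silently selecting the "milvus" default.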

Semantic retrieval

Semantic retrieval uses dense embeddings to find content that is similar in meaning to a query. In NeMo Retriever Library, the default vector path is LanceDB. Use the following resource together with the sections on this page:

  • Evaluation: For evaluation and metrics, refer to Evaluate on your data.
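As a rough illustration of what "similar in meaning" means for dense embeddings, the sketch below ranks a tiny corpus by cosine similarity to a query vector. It is a brute-force example with made-up three-dimensional vectors and hypothetical document IDs. It does not use the IVF_HNSW_SQ index, and the choice of cosine similarity as the metric is an assumption for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], corpus: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings; real embeddings have hundreds or thousands of dimensions.
corpus = {
    "gpu-guide": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.2, 0.9],
    "cuda-notes": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

hits = top_k(query, corpus, k=2)
print([doc_id for doc_id, _ in hits])
```

An approximate index trades a small amount of recall for not having to score every stored vector, which is what this brute-force loop does.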

Metadata and filtering

This page covers LanceDB upload and retrieval. Metadata and filtering guidance is not duplicated here.

LanceDB deployment characteristics

Aspect              LanceDB
Runtime model       Embedded (in-process)
External services   None for the vector store itself
Helm / extra stack  Not required for LanceDB (default path)
Index type          IVF_HNSW_SQ (default)
Persistence         Lance files on disk under your configured URI

Upload to a Custom Data Store

You can ingest to other data stores by using the Ingestor.vdb_upload method; however, you must configure those data stores and their connections yourself. NeMo Retriever Library does not provide connections to other data stores.

Vector database partners

NeMo Retriever Library integrates with vector databases used for RAG collections. The sections above focus on LanceDB as the shipped backend. This section lists that backend and how partner or custom VDB subclasses plug into graph operators. For chunking behavior, see Chunking.

Backends with VDB implementations (retriever adapters)

NeMo Retriever graph operators IngestVdbOperator and RetrieveVdbOperator wrap concrete classes that implement the VDB interface (run for ingest, retrieval for search). The library ships one first-party backend:

Backend   Project                   Implementation
LanceDB   LanceDB · documentation   lancedb.py; pass vdb_op="lancedb" (recommended).

On the ingestion Python client's Ingestor.vdb_upload, omitting vdb_op does not select LanceDB; see Upload to LanceDB.

Pass vdb_op="lancedb" or a LanceDB instance. To integrate another vector database, subclass VDB and pass your operator instance as vdb_op (see Build a Custom Vector Database Operator).
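The shape of a custom operator can be sketched as follows. This is a stand-in, not the library's code: the VDB class here is a local abstract class that mirrors only the two methods this page names (run for ingest, retrieval for search), and the record layout with "text" and "vector" keys, the InMemoryVDB class, and the dot-product scoring are all assumptions made for illustration. The real abstract class may require additional methods or different signatures, so follow Build a Custom Vector Database Operator for the actual contract.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class VDB(ABC):
    """Local stand-in for the VDB interface (assumed shape, not the real class)."""

    @abstractmethod
    def run(self, records: List[Dict[str, Any]]) -> None:
        """Ingest records into the store."""

    @abstractmethod
    def retrieval(self, queries: List[List[float]], top_k: int = 10) -> List[List[Dict[str, Any]]]:
        """Return the top_k records for each query vector."""


class InMemoryVDB(VDB):
    """Hypothetical backend that keeps records in a Python list."""

    def __init__(self) -> None:
        self._rows: List[Dict[str, Any]] = []

    def run(self, records: List[Dict[str, Any]]) -> None:
        self._rows.extend(records)

    def retrieval(self, queries: List[List[float]], top_k: int = 10) -> List[List[Dict[str, Any]]]:
        def dot(a: List[float], b: List[float]) -> float:
            return sum(x * y for x, y in zip(a, b))

        # One ranked result list per query, highest dot product first.
        return [
            sorted(self._rows, key=lambda row: dot(query, row["vector"]), reverse=True)[:top_k]
            for query in queries
        ]


store = InMemoryVDB()
store.run([
    {"text": "alpha", "vector": [1.0, 0.0]},
    {"text": "beta", "vector": [0.0, 1.0]},
])
hits = store.retrieval([[0.9, 0.1]], top_k=1)
print(hits[0][0]["text"])
```

Keeping ingest and search behind the same two methods is what lets one operator instance be swapped for another without changing the calling code.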

RAG Blueprint and partner vector stores

Some deployments use a different vector store than the default LanceDB path on this page—for example the NVIDIA RAG Blueprint (Docker Compose or Helm) or a partner package that subclasses the same VDB interface. Use the following public references when you wire those stacks to ingestion and retrieval:

Vector store    Where to configure or implement
Elasticsearch   Configure Elasticsearch as Your Vector Database for NVIDIA RAG Blueprint: compose profiles, environment variables, and Helm notes for the RAG Blueprint.
Pinecone        Customize your vector database (Pinecone + NVIDIA RAG) in the pinecone-io/nvidia-pinecone-rag repository.
Teradata        TeradataVDB (NVIDIA NIM Ingest integration): teradatagenai.vector_store.teradataVDB.TeradataVDB implements the NeMo Retriever ingestion VDB abstract class for Teradata Vector Store.

Testing and release cadence for these integrations follow the owning project (RAG Blueprint, Pinecone sample repo, or Teradata Generative AI package), not the first-party LanceDB operator validated for NeMo Retriever Library on this page.

More information (embeddings & custom VDB)

Important

NVIDIA documents and validates the first-party LanceDB operator for this library. If you integrate a different vector store, you are responsible for testing and maintaining that integration.

To implement a custom operator, follow the VDB abstract interface described in Build a Custom Vector Database Operator.