Enable Hybrid Search Support for NVIDIA RAG Blueprint#

You can enable hybrid search for NVIDIA RAG Blueprint. Hybrid search enables higher accuracy for documents having more domain specific technical jargons. It combines sparse and dense representations to leverage the strengths of both retrieval methods—sparse models (e.g., BM25) excel at keyword matching, while dense embeddings (e.g., vector-based search) capture semantic meaning. This allows hybrid search to retrieve relevant documents even when technical jargon or synonyms are used.

After you have deployed the blueprint, to enable hybrid search support for Milvus Vector Database, developers can follow below steps:

Steps#

Set the search type to hybrid

export APP_VECTORSTORE_SEARCHTYPE="hybrid"

Relaunch the rag and ingestion services

docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
docker compose -f deploy/compose/docker-compose-rag-server.yaml up -d

Weighted Hybrid Search#

By default, hybrid search uses RRF (Reciprocal Rank Fusion) for combining results. You can also use weighted hybrid search, which allows you to control the relative importance of dense (semantic) and sparse (keyword) search results.

Docker Compose#

To enable weighted hybrid search with Docker Compose:

Set the ranker type to weighted and configure the weights:

export APP_VECTORSTORE_SEARCHTYPE="hybrid"
export APP_VECTORSTORE_RANKER_TYPE="weighted"
export APP_VECTORSTORE_DENSE_WEIGHT="0.5"   # Weight for semantic search (default: 0.5)
export APP_VECTORSTORE_SPARSE_WEIGHT="0.5"  # Weight for keyword search (default: 0.5)

Relaunch the rag and ingestion services:

docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
docker compose -f deploy/compose/docker-compose-rag-server.yaml up -d

Note

You can adjust the weights based on your use case. For example, if your queries rely more on semantic understanding, increase the dense weight (e.g., 0.7 dense, 0.3 sparse). For more keyword-based search, increase the sparse weight.

Helm#

To enable hybrid search using Helm deployment:

Basic Hybrid Search (RRF)#

Modify values.yaml to enable hybrid search with RRF (Reciprocal Rank Fusion):

# Environment variables for rag-server
envVars:
  # ... existing configurations ...
  
  # === Hybrid Search Configuration ===
  APP_VECTORSTORE_SEARCHTYPE: "hybrid"

# Configure ingestor-server for hybrid search
ingestor-server:
  envVars:
    # ... existing configurations ...
    
    APP_VECTORSTORE_SEARCHTYPE: "hybrid"

Weighted Hybrid Search with Helm#

To use weighted hybrid search instead of RRF, modify values.yaml:

# Environment variables for rag-server
envVars:
  # ... existing configurations ...
  
  # === Weighted Hybrid Search Configuration ===
  APP_VECTORSTORE_SEARCHTYPE: "hybrid"
  APP_VECTORSTORE_RANKER_TYPE: "weighted"
  APP_VECTORSTORE_DENSE_WEIGHT: "0.5"   # Weight for semantic search (default: 0.5)
  APP_VECTORSTORE_SPARSE_WEIGHT: "0.5"  # Weight for keyword search (default: 0.5)

# Configure ingestor-server for weighted hybrid search
ingestor-server:
  envVars:
    # ... existing configurations ...
    
    APP_VECTORSTORE_SEARCHTYPE: "hybrid"
    APP_VECTORSTORE_RANKER_TYPE: "weighted"
    APP_VECTORSTORE_DENSE_WEIGHT: "0.5"
    APP_VECTORSTORE_SPARSE_WEIGHT: "0.5"

Apply the Configuration#

After modifying values.yaml, apply the changes as described in Change a Deployment.

For detailed HELM deployment instructions, see Helm Deployment Guide.

Note

Elasticsearch RRF Limitation: RRF (Reciprocal Rank Fusion) is not supported in the open-source version of Elasticsearch. If you’re using open-source Elasticsearch with hybrid search, you must use the weighted ranker type instead of rrf. Attempting to use RRF with open-source Elasticsearch will result in the following error: RRF (Reciprocal Rank Fusion) requires an Elasticsearch commercial license.

Note

Preexisting collections in Milvus created using search type dense won’t work, when the search type is changed to hybrid. If you are switching the search type, ensure you are creating new collection and re-uploading documents before doing retrieval.