Enable query rewriting support for NVIDIA RAG Blueprint#

You can enable query rewriting for the NVIDIA RAG Blueprint. Query rewriting improves accuracy for multi-turn queries by making an additional LLM call that decontextualizes the incoming question, that is, rewrites it as a standalone question that does not depend on earlier turns, before the question is sent to the retrieval pipeline.

After you deploy the blueprint, you can enable query rewriting in one of the following ways:

Using cloud hosted model#

  1. Set the server URL to an empty string to point to the cloud-hosted model.

    export APP_QUERYREWRITER_SERVERURL=""
    
  2. Enable the query rewriter and relaunch the rag-server container.

    export ENABLE_QUERYREWRITER="True"
    docker compose -f deploy/compose/docker-compose-rag-server.yaml up -d
    

Tip

To use an externally hosted LLM instead, set the model endpoint and model name with the following two environment variables, and then restart the RAG services.

    export APP_QUERYREWRITER_SERVERURL="<llm_nim_http_endpoint_url>"
    export APP_QUERYREWRITER_MODELNAME="<model_name>"
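
The environment variables in this section can be sanity-checked before you relaunch the container. The following is a minimal sketch, not part of the blueprint: the function name `describe_queryrewriter_config` is hypothetical, and the check for the literal value `True` only mirrors the value used in this section; the server may accept other spellings.

```shell
# Hypothetical pre-flight helper; it only inspects the current shell environment.
describe_queryrewriter_config() {
  # This section sets the flag to the literal string "True", so check for that value.
  if [ "${ENABLE_QUERYREWRITER:-}" != "True" ]; then
    echo "query rewriter: disabled"
    return 1
  fi

  if [ -z "${APP_QUERYREWRITER_SERVERURL+x}" ]; then
    # Unset is not the same as empty: whatever default the compose file defines would apply.
    echo "query rewriter: enabled, server URL unset"
  elif [ -z "${APP_QUERYREWRITER_SERVERURL}" ]; then
    # An explicitly empty URL points to the cloud-hosted model.
    echo "query rewriter: enabled, cloud-hosted model"
  else
    echo "query rewriter: enabled, endpoint ${APP_QUERYREWRITER_SERVERURL}"
  fi
}
```

Run `describe_queryrewriter_config` in the same shell where you exported the variables, just before the `docker compose` command. A nonzero exit status means the query rewriter flag is not set.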

Using Helm Chart (on-prem only)#

This section describes how to enable query rewriting when you deploy with Helm and use an on-prem deployment of the LLM.

Note

Only on-prem deployment of the LLM is supported. The model must be deployed separately using the NIM LLM Helm chart.

1. Enable Query Rewriter in rag-server Helm deployment#

  1. In the envVars section of the values.yaml file, set the following values.

       envVars:
          ##===Query Rewriter Model specific configurations===
          APP_QUERYREWRITER_MODELNAME: "nvidia/llama-3.3-nemotron-super-49b-v1.5"
          APP_QUERYREWRITER_SERVERURL: "nim-llm:8000"  # Fully qualified service name
          ENABLE_QUERYREWRITER: "True"
    

Follow the steps in Deploy with Helm, and then use the following command to deploy the chart.

    helm install rag -n rag https://helm.ngc.nvidia.com/0648981100760671/charts/nvidia-blueprint-rag-v2.4.0-dev.tgz \
      --username '$oauthtoken' \
      --password "${NGC_API_KEY}" \
      --set imagePullSecret.password=$NGC_API_KEY \
      --set ngcApiSecret.password=$NGC_API_KEY \
      -f deploy/helm/nvidia-blueprint-rag/values.yaml
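
As an alternative to editing values.yaml in place, the three Query Rewriter settings from step 1 could be kept in a small override file. The following is a sketch under two assumptions: the file name `query-rewriter-values.yaml` is hypothetical, and `envVars` is assumed to sit at the top level of the chart values, as the excerpt in step 1 suggests. Check the chart's values.yaml for the actual nesting before you rely on it.

```shell
# Hypothetical file name; any path works.
OVERRIDES_FILE="query-rewriter-values.yaml"

# The quoted heredoc delimiter writes the content literally, with no shell expansion.
cat > "$OVERRIDES_FILE" <<'EOF'
envVars:
  APP_QUERYREWRITER_MODELNAME: "nvidia/llama-3.3-nemotron-super-49b-v1.5"
  APP_QUERYREWRITER_SERVERURL: "nim-llm:8000"
  ENABLE_QUERYREWRITER: "True"
EOF
```

To apply the file, add `-f query-rewriter-values.yaml` to the `helm install` command after the main values file. When several values files are given, Helm gives precedence to the last one, so the overrides win over the defaults.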