Milvus Configuration for NVIDIA RAG Blueprint#

This topic describes how to configure Milvus for your NVIDIA RAG Blueprint deployment, including switching between GPU and CPU modes, using a custom Milvus endpoint, and enabling authentication.

GPU to CPU Mode Switch#

Milvus uses GPU acceleration by default for vector operations. Switch to CPU mode if you encounter either of the following:

  • GPU memory constraints

  • Development environments without GPU support

Docker Compose#

Configuration Steps#

1. Update Docker Compose Configuration (vectordb.yaml)#

First, you need to modify the deploy/compose/vectordb.yaml file to disable GPU usage:

Step 1: Comment Out GPU Reservations

Comment out the entire deploy section that reserves GPU resources:

# deploy:
#   resources:
#     reservations:
#       devices:
#         - driver: nvidia
#           capabilities: ["gpu"]
#           # count: ${INFERENCE_GPU_COUNT:-all}
#           device_ids: ['${VECTORSTORE_GPU_DEVICE_ID:-0}']

Step 2: Change the Milvus Docker Image

# Change this line:
image: milvusdb/milvus:v2.6.2-gpu # milvusdb/milvus:v2.6.2 for CPU

# To this:
image: milvusdb/milvus:v2.6.2 # milvusdb/milvus:v2.6.2-gpu for GPU

2. Set Environment Variables#

Before starting any services, you must set these environment variables in your terminal. These variables tell the ingestor server to use CPU mode:

# Set these environment variables BEFORE starting the ingestor server
export APP_VECTORSTORE_ENABLEGPUSEARCH=False
export APP_VECTORSTORE_ENABLEGPUINDEX=False
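
As a quick sanity check (a minimal sketch; the variable names come from the step above), you can confirm the flags are visible to child processes before bringing anything up:

```shell
# Export the CPU-mode flags, then confirm they are set correctly
export APP_VECTORSTORE_ENABLEGPUSEARCH=False
export APP_VECTORSTORE_ENABLEGPUINDEX=False
for v in APP_VECTORSTORE_ENABLEGPUSEARCH APP_VECTORSTORE_ENABLEGPUINDEX; do
  if [ "$(printenv "$v")" = "False" ]; then
    echo "$v=False"
  else
    echo "WARNING: $v is not set to False"
  fi
done
```

Because `docker compose` inherits the environment of the shell it runs in, run this in the same terminal you use for the restart commands below.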

3. Restart Services#

After making the configuration changes and setting environment variables, restart the services:

# 1. Stop existing services
docker compose -f deploy/compose/vectordb.yaml down

# 2. Start Milvus and dependencies
docker compose -f deploy/compose/vectordb.yaml up -d

# 3. Now start the ingestor server
docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
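
Before starting the ingestor server, you can confirm the Milvus container is actually up. This is a hedged sketch that polls the standalone health endpoint (port 9091 is Milvus's default metrics port; adjust the host and port if your deployment differs):

```shell
# Poll the Milvus health endpoint; prints a status line either way
MILVUS_HOST="${MILVUS_HOST:-localhost}"
if curl -fsS --max-time 5 "http://${MILVUS_HOST}:9091/healthz" 2>/dev/null | grep -q OK; then
  echo "Milvus is healthy"
else
  echo "Milvus is not reachable yet; check the container logs"
fi
```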

Switching Milvus to CPU Mode using Helm#

To configure Milvus to run in CPU mode when deploying with Helm:

  1. Disable GPU search and indexing by editing values.yaml.

    A. In the envVars and ingestor-server.envVars sections, set the following environment variables:

     ```yaml
     envVars:
       APP_VECTORSTORE_ENABLEGPUSEARCH: "False"
     ingestor-server:
       envVars:
         APP_VECTORSTORE_ENABLEGPUSEARCH: "False"
         APP_VECTORSTORE_ENABLEGPUINDEX: "False"
     ```
    

    B. Also, change the image under milvus.image.all to remove the -gpu tag.

     ```yaml
     milvus:
       image:
         all:
           repository: milvusdb/milvus
           tag: v2.5.17  # instead of v2.5.17-gpu
     ```
    

    C. (Optional) Remove or set GPU resource requests/limits to zero in the milvus.standalone.resources block.

     ```yaml
     milvus:
       standalone:
         resources:
           limits:
             nvidia.com/gpu: 0
     ```
    
  2. After you modify values.yaml, apply the changes as described in Change a Deployment.
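
Before redeploying, a quick check can confirm the flags actually landed in values.yaml. This is a sketch under the assumption that the chart lives at the path used elsewhere on this page:

```shell
# Sanity-check that the CPU-mode flags are present in values.yaml
f="deploy/helm/nvidia-blueprint-rag/values.yaml"
if [ -f "$f" ] && grep -q 'APP_VECTORSTORE_ENABLEGPUSEARCH: "False"' "$f"; then
  echo "CPU-mode flags found in $f"
else
  echo "CPU-mode flags not found; re-check $f"
fi
```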

(Optional) Customize the Milvus Endpoint#

To use a custom Milvus endpoint, follow this procedure.

  1. Update the APP_VECTORSTORE_URL and MINIO_ENDPOINT variables in both the RAG server and the ingestor server sections in values.yaml. Your changes should look similar to the following.

    env:
      # ... existing code ...
      APP_VECTORSTORE_URL: "http://your-custom-milvus-endpoint:19530"
      MINIO_ENDPOINT: "http://your-custom-minio-endpoint:9000"
      # ... existing code ...
    
    ingestor-server:
      env:
        # ... existing code ...
        APP_VECTORSTORE_URL: "http://your-custom-milvus-endpoint:19530"
        MINIO_ENDPOINT: "http://your-custom-minio-endpoint:9000"
        # ... existing code ...
    
    nv-ingest:
      envVars:
        # ... existing code ...
        MINIO_INTERNAL_ADDRESS: "http://your-custom-minio-endpoint:9000"
        # ... existing code ...
    
  2. Disable the Milvus deployment. Set milvusDeployed: false in the nv-ingest section to prevent deploying the default Milvus instance. Your changes should look like the following.

     nv-ingest:
       # ... existing code ...
       milvusDeployed: false
       # ... existing code ...
    
  3. Redeploy the Helm chart by running the following code.

    helm upgrade rag https://helm.ngc.nvidia.com/0648981100760671/charts/nvidia-blueprint-rag-v2.4.0-dev-dev.tgz -f nvidia-blueprint-rag/values.yaml -n rag
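
After redeploying, you can verify that the custom endpoints resolve and accept connections. This is a hedged sketch using bash's /dev/tcp; the hostnames below are the same placeholders used above, so substitute your actual endpoints:

```shell
# Check TCP reachability of the custom Milvus (19530) and MinIO (9000) endpoints
for ep in "your-custom-milvus-endpoint:19530" "your-custom-minio-endpoint:9000"; do
  host="${ep%%:*}"; port="${ep##*:}"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${ep} reachable"
  else
    echo "${ep} NOT reachable"
  fi
done
```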
    

Milvus Authentication#

Enable authentication for Milvus to secure your vector database.

Docker Compose#

1. Configure Milvus Authentication#

Extract the default Milvus configuration:

docker cp milvus-standalone:/milvus/configs/milvus.yaml ./deploy/compose/

Edit deploy/compose/milvus.yaml to enable authentication:

security:
  authorizationEnabled: true
  defaultRootPassword: "your-secure-password"

Mount the configuration file in deploy/compose/vectordb.yaml by uncommenting the volume mount:

volumes:
  - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
  - ${MILVUS_CONFIG_FILE:-./milvus.yaml}:/milvus/configs/milvus.yaml

2. Start Services#

Start Milvus with authentication:

docker compose -f deploy/compose/vectordb.yaml up -d

Set authentication credentials and start RAG services:

export APP_VECTORSTORE_USERNAME="root"
export APP_VECTORSTORE_PASSWORD="your-secure-password"

docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
docker compose -f deploy/compose/docker-compose-rag-server.yaml up -d
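
The credentials must be exported in the same shell that runs docker compose. A minimal check (the values mirror the step above; use your actual password):

```shell
# Export the Milvus credentials, then confirm both are visible to child processes
export APP_VECTORSTORE_USERNAME="root"
export APP_VECTORSTORE_PASSWORD="your-secure-password"
for v in APP_VECTORSTORE_USERNAME APP_VECTORSTORE_PASSWORD; do
  if [ -n "$(printenv "$v")" ]; then
    echo "$v is set"
  else
    echo "WARNING: $v is empty"
  fi
done
```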

Helm Chart#

1. Configure Milvus Authentication in Helm:#


Edit deploy/helm/nvidia-blueprint-rag/files/milvus.yaml to enable authentication:

security:
  authorizationEnabled: true
  defaultRootPassword: "your-secure-password"

Create a ConfigMap from the milvus.yaml file:

kubectl create configmap milvus-config --from-file=milvus.yaml=deploy/helm/nvidia-blueprint-rag/files/milvus.yaml

Configure Volume Mounting

The values.yaml file includes the necessary volume configuration:

milvus:
  standalone:
    extraVolumes:
      - name: milvus-config
        configMap:
          name: milvus-config
    extraVolumeMounts:
      - name: milvus-config
        mountPath: /milvus/configs/milvus.yaml
        subPath: milvus.yaml

2. Configure username and password in deploy/helm/nvidia-blueprint-rag/values.yaml:#

rag-server:
  envVars:
    APP_VECTORSTORE_USERNAME: "root"
    APP_VECTORSTORE_PASSWORD: "your-secure-password"

ingestor-server:
  envVars:
    APP_VECTORSTORE_USERNAME: "root"
    APP_VECTORSTORE_PASSWORD: "your-secure-password"

3. Deploy with Helm:#

helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/0648981100760671/charts/nvidia-blueprint-rag-v2.4.0-dev-dev-rc2.tgz \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--set imagePullSecret.password=$NGC_API_KEY \
--set ngcApiSecret.password=$NGC_API_KEY \
-f deploy/helm/nvidia-blueprint-rag/values.yaml

For detailed Helm deployment instructions, see Helm Deployment Guide.

Troubleshooting#

GPU_CAGRA Error#

If you encounter GPU_CAGRA errors that persist after switching to CPU mode, try the following:

  1. Stop all running services:

    docker compose -f deploy/compose/vectordb.yaml down
    docker compose -f deploy/compose/docker-compose-ingestor-server.yaml down
    
  2. Delete the Milvus volumes directory:

    rm -rf deploy/compose/volumes
    
  3. Restart the services:

    docker compose -f deploy/compose/vectordb.yaml up -d
    docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
    

Note

This will delete all existing vector data, so ensure you have backups if needed.
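
If you want a backup before deleting, one possible approach (a sketch; adjust the path to match your checkout) is to archive the volumes directory first:

```shell
# Archive the Milvus volumes directory before removing it
backup="milvus-volumes-$(date +%Y%m%d-%H%M%S).tar.gz"
if [ -d deploy/compose/volumes ]; then
  tar -czf "$backup" -C deploy/compose volumes && echo "Backed up to $backup"
else
  echo "No volumes directory at deploy/compose/volumes; nothing to back up"
fi
```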