Enable Text-Only Ingestion Support in Docker for NVIDIA RAG Blueprint
You can enable text-only ingestion for the NVIDIA RAG Blueprint. To ingest text-only files, you do not need to deploy the complete pipeline with all NIMs connected. If your use case requires only text extraction from files, follow the steps below to deploy just the necessary components.
Follow the deployment guide up to and including the step labelled “Start all required NIMs.”
Set the environment variables to enable text-only extraction mode:
export APP_NVINGEST_EXTRACTTEXT=True
export APP_NVINGEST_EXTRACTINFOGRAPHICS=False
export APP_NVINGEST_EXTRACTTABLES=False
export APP_NVINGEST_EXTRACTCHARTS=False
export COMPONENTS_TO_READY_CHECK=""
Important
When disabling nv-ingest dependent services, you must set COMPONENTS_TO_READY_CHECK="" to ensure the nv-ingest container reaches ready state. Without this setting, nv-ingest will wait indefinitely for the disabled components.
Then deploy the ingestor-server:
docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d ingestor-server
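Because a missed or mistyped export can leave nv-ingest waiting on disabled components, it may help to check the shell environment before starting the ingestor-server. The following is a minimal sketch, not part of the blueprint: the helper name `text_only_env_problems` is hypothetical, and the exact string comparison against `True` and `False` is an assumption (the services may accept other spellings).

```python
import os

# Expected values, copied from the export commands above.
TEXT_ONLY_ENV = {
    "APP_NVINGEST_EXTRACTTEXT": "True",
    "APP_NVINGEST_EXTRACTINFOGRAPHICS": "False",
    "APP_NVINGEST_EXTRACTTABLES": "False",
    "APP_NVINGEST_EXTRACTCHARTS": "False",
    "COMPONENTS_TO_READY_CHECK": "",
}


def text_only_env_problems(env):
    """Return one message per variable that is unset or differs from the expected value.

    Hypothetical helper: the strict string match is an assumption of this sketch.
    """
    problems = []
    for name, expected in TEXT_ONLY_ENV.items():
        actual = env.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected!r})")
        elif actual != expected:
            problems.append(f"{name}={actual!r} (expected {expected!r})")
    return problems


# Report any mismatch found in the current shell environment.
for message in text_only_env_problems(os.environ):
    print(message)
```

An empty result means all five variables match the values shown in this step. Note that an unset COMPONENTS_TO_READY_CHECK is reported separately from one that is set to an empty string.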
While deploying the NIMs in step 4, deploy only the NIMs required for the rag-server and for ingestion in text-only mode:
USERID=$(id -u) docker compose --profile rag -f deploy/compose/nims.yaml up -d
Confirm that the NIMs listed below are running and report a healthy state before you proceed. Make sure to allocate GPUs to nim-llm-ms according to your hardware (2xH100, 2xB200, or 4xA100, based on your deployment GPU profile), as stated in the quickstart guide.
watch -n 2 'docker ps --format "table {{.Names}}\t{{.Status}}"'
NAMES                        STATUS
nemoretriever-ranking-ms     Up 14 minutes (healthy)
nemoretriever-embedding-ms   Up 14 minutes (healthy)
nim-llm-ms                   Up 14 minutes (healthy)
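If you prefer a scripted check over watching the table by eye, the same `docker ps` output can be parsed. This is a minimal sketch under stated assumptions: the helper names `docker_ps_table` and `not_healthy` are hypothetical, and the check simply looks for the `(healthy)` marker that appears in the STATUS column shown above.

```python
import subprocess

# Container names taken from the expected output above.
REQUIRED_NIMS = (
    "nemoretriever-ranking-ms",
    "nemoretriever-embedding-ms",
    "nim-llm-ms",
)


def docker_ps_table():
    """Run the same docker ps command as above and return its text output."""
    result = subprocess.run(
        ["docker", "ps", "--format", "table {{.Names}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def not_healthy(ps_table, required=REQUIRED_NIMS):
    """Return the required containers that are missing or not marked (healthy)."""
    statuses = {}
    for line in ps_table.splitlines():
        parts = line.split(None, 1)  # first token is the name, the rest is the status
        if len(parts) == 2 and parts[0] != "NAMES":
            statuses[parts[0]] = parts[1].strip()
    return [name for name in required if "(healthy)" not in statuses.get(name, "")]
```

Calling `not_healthy(docker_ps_table())` returns an empty list once all three containers are healthy. A container that is still starting, unhealthy, or absent from the table is returned by name.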
Continue with the remaining steps in the deployment guide to deploy the rag-server containers.
Once the ingestion and RAG servers are deployed, open the ingestion notebook and follow its steps. When you try the Upload Document Endpoint, set the payload as follows:
data = {
    "vdb_endpoint": "http://milvus:19530",
    "collection_name": collection_name,
    "split_options": {
        "chunk_size": 1024,
        "chunk_overlap": 150
    }
}
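The payload above can also be built by a small helper, which makes it easier to reuse across collections. The sketch below is illustrative only: `build_upload_payload` and the collection name `text_only_demo` are hypothetical, and the sanity check that the overlap is smaller than the chunk size is an assumption of this sketch, not a documented server rule. Sending the payload is left to the ingestion notebook.

```python
import json


def build_upload_payload(collection_name, chunk_size=1024, chunk_overlap=150,
                         vdb_endpoint="http://milvus:19530"):
    """Build the text-only upload payload shown above (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Assumption of this sketch: an overlap at or above the chunk size is a mistake.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    return {
        "vdb_endpoint": vdb_endpoint,
        "collection_name": collection_name,
        "split_options": {
            "chunk_size": chunk_size,
            "chunk_overlap": chunk_overlap,
        },
    }


# "text_only_demo" is a placeholder collection name for illustration.
data = build_upload_payload("text_only_demo")
print(json.dumps(data, indent=2))
```

The defaults reproduce the values from the payload in this step, so only the collection name needs to be supplied.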
After ingestion completes, use the retrieval notebook to try queries relevant to the text in the documents.
Note
If you use cloud-hosted models and want to enable text-only mode, then in step 2, export the following environment variables:
export APP_EMBEDDINGS_SERVERURL=""
export APP_LLM_SERVERURL=""
export APP_RANKING_SERVERURL=""
export YOLOX_HTTP_ENDPOINT="https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-page-elements-v3"
export YOLOX_INFER_PROTOCOL="http"
Enable Text-Only Ingestion Support in Helm
To ingest text-only files, you do not need to deploy the complete pipeline with all NIMs connected. If your scenario requires only text extraction from files, use the following steps to deploy only the necessary components using Helm.
When you install the Helm chart, enable only the following services that are required for text ingestion:
rag-server
ingestor-server
nv-ingest
nvidia-nim-llama-32-nv-embedqa-1b-v2
text-reranking-nim
nim-llm
milvus
minio
Additionally, ensure that table extraction, chart extraction, and image extraction are disabled.
First, modify the environment variables in values.yaml to enable text-only extraction. In the nv-ingest.envVars section, set the following values:
nv-ingest:
  envVars:
    # ... existing configurations ...
    # === Text-Only Extraction Mode ===
    APP_NVINGEST_EXTRACTTEXT: "True"
    APP_NVINGEST_EXTRACTINFOGRAPHICS: "False"
    APP_NVINGEST_EXTRACTTABLES: "False"
    APP_NVINGEST_EXTRACTCHARTS: "False"
Then use the modified values.yaml file in your Helm upgrade command:
helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.4.0.tgz \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--values deploy/helm/nvidia-blueprint-rag/values.yaml \
--set nimOperator.nim-llm.enabled=true \
--set nimOperator.nvidia-nim-llama-32-nv-embedqa-1b-v2.enabled=true \
--set nimOperator.nvidia-nim-llama-32-nv-rerankqa-1b-v2.enabled=true \
--set ingestor-server.enabled=true \
--set nv-ingest.enabled=true \
--set nv-ingest.nimOperator.page_elements.enabled=false \
--set nv-ingest.nimOperator.graphic_elements.enabled=false \
--set nv-ingest.nimOperator.table_structure.enabled=false \
--set nv-ingest.nimOperator.nemoretriever_ocr_v1.enabled=false \
--set imagePullSecret.password=$NGC_API_KEY \
--set ngcApiSecret.password=$NGC_API_KEY
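When the list of toggles grows, it can be easier to keep them in one mapping and generate the `--set` arguments from it. The following is a minimal sketch, assuming you assemble the Helm command from a script: the helper `helm_set_args` is hypothetical, the keys are copied from the command above, and the chart URL, credentials, and `--values` flag are deliberately left out.

```python
# Service toggles copied from the helm upgrade command above.
TEXT_ONLY_TOGGLES = {
    "nimOperator.nim-llm.enabled": True,
    "nimOperator.nvidia-nim-llama-32-nv-embedqa-1b-v2.enabled": True,
    "nimOperator.nvidia-nim-llama-32-nv-rerankqa-1b-v2.enabled": True,
    "ingestor-server.enabled": True,
    "nv-ingest.enabled": True,
    "nv-ingest.nimOperator.page_elements.enabled": False,
    "nv-ingest.nimOperator.graphic_elements.enabled": False,
    "nv-ingest.nimOperator.table_structure.enabled": False,
    "nv-ingest.nimOperator.nemoretriever_ocr_v1.enabled": False,
}


def helm_set_args(toggles):
    """Turn a {key: bool} mapping into a flat list of --set arguments (hypothetical helper)."""
    args = []
    for key, enabled in toggles.items():
        value = "true" if enabled else "false"
        args += ["--set", f"{key}={value}"]
    return args


# Print one flag per line, in the same order as the command above.
set_args = helm_set_args(TEXT_ONLY_TOGGLES)
for flag, assignment in zip(set_args[0::2], set_args[1::2]):
    print(flag, assignment)
```

The resulting list can be appended to the rest of the `helm upgrade --install` arguments, for example when invoking Helm through `subprocess.run`.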
Important
Disabling NV-Ingest Components for GPU Resource Management:
If you disable any nv-ingest dependent services (such as table_structure, graphic_elements, nemoretriever_ocr_v1, etc.) to free up GPU resources for customization, you must set the COMPONENTS_TO_READY_CHECK parameter to an empty string in the nv-ingest.envVars section of your values.yaml file:
nv-ingest:
  envVars:
    COMPONENTS_TO_READY_CHECK: ""
This ensures the nv-ingest pod reaches ready state even when some dependent components are disabled. Without this setting, the nv-ingest pod will wait indefinitely for the disabled components to become ready.