NeMo Retriever OCR Configuration Guide for NVIDIA RAG Blueprint (Early Access)#
You can enable NeMo Retriever OCR for your NVIDIA RAG Blueprint. NeMo Retriever OCR is an optical character recognition service that provides enhanced text extraction for document processing workflows. It serves as a high-performance alternative to the default Paddle OCR service, offering significant improvements in speed and resource efficiency.
For more information about NeMo Retriever OCR, refer to NeMo Retriever OCR v1 Container.
Note
Early Access: Currently, the NeMo Retriever OCR v1 container is in early access preview.
Key Benefits of NeMo Retriever OCR#
Performance: The NeMo Retriever OCR-based pipeline is more than 2x faster than Paddle OCR for PDF ingestion tasks, which makes it suitable for high-volume document processing in batch ingestion workflows.
Considerations for NeMo Retriever OCR#
GPU Requirements: Check the NeMo Retriever OCR Support Matrix for detailed hardware requirements and supported GPUs.
GPU Memory: Requires approximately 4.1 GB of GPU memory (Triton server ~2.3 GB plus Python backend processes ~1.8 GB). Ensure sufficient GPU memory is available.
Quality Variance: Extraction quality may vary based on image quality and text complexity
Early Access: Currently in preview - monitor for updates and stability improvements
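The GPU memory consideration above can be checked before deployment. The following is a minimal sketch, not part of the official guide: it assumes `nvidia-smi` is installed on the host, and it treats the ~4.1 GB figure as GiB (about 4199 MiB), which is an assumption. The names `REQUIRED_MIB` and `has_enough_gpu_memory` are hypothetical.

```shell
#!/usr/bin/env bash
# Approximate footprint from this guide (~4.1 GB), expressed in MiB.
# ASSUMPTION: the guide's figure is read as GiB: 4.1 * 1024, rounded up.
REQUIRED_MIB=4199

# Succeeds when the given free-memory figure (in MiB) covers the requirement.
has_enough_gpu_memory() {
  free_mib="$1"
  [ "$free_mib" -ge "$REQUIRED_MIB" ]
}

# Report per GPU, only when nvidia-smi is available on this host.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits |
    while IFS=', ' read -r idx free; do
      if has_enough_gpu_memory "$free"; then
        echo "GPU ${idx}: ${free} MiB free, enough for NeMo Retriever OCR"
      else
        echo "GPU ${idx}: ${free} MiB free, below the ${REQUIRED_MIB} MiB estimate"
      fi
    done
fi
```

Other services sharing the same GPU also consume memory, so treat a passing result as a lower bound rather than a guarantee.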
How to Enable NeMo Retriever OCR#
Docker Compose Deployment for NeMo Retriever OCR#
Self-Hosted Deployment Configuration#
Prerequisites: Follow the deployment guide up to and including the step labelled “Start all required NIMs.”
Configure Environment Variables:
export OCR_GRPC_ENDPOINT=nemoretriever-ocr:8001
export OCR_HTTP_ENDPOINT=http://nemoretriever-ocr:8000/v1/infer
export OCR_INFER_PROTOCOL=grpc
export OCR_MODEL_NAME=scene_text_ensemble
Warning
Critical Health Check Requirement: Even when using the gRPC protocol (OCR_INFER_PROTOCOL=grpc), you must also export OCR_HTTP_ENDPOINT because the health check from nv-ingest uses HTTP.
Stop the Paddle OCR deployment if it is already running:
USERID=$(id -u) docker compose -f deploy/compose/nims.yaml down paddle
Deploy NeMo Retriever OCR Service:
USERID=$(id -u) docker compose -f deploy/compose/nims.yaml --profile nemoretriever-ocr up -d
Verify Service Status:
watch -n 2 'docker ps --format "table {{.Names}}\t{{.Status}}"'
Restart Ingestor Server:
docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d
Test Document Ingestion: Use the ingestion API usage notebook to verify functionality.
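Between deploying the OCR service and restarting the ingestor server in the steps above, it can help to wait until the service answers over HTTP instead of watching `docker ps` by hand. The following is a rough sketch under stated assumptions: `wait_until` and `ocr_ready` are hypothetical helpers, and `OCR_HEALTH_URL` is not defined by this guide; you would set it to whatever health URL your deployment exposes on the host.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command...>
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical readiness probe. OCR_HEALTH_URL is an ASSUMPTION, not a
# variable from this guide; set it to your deployment's health URL.
ocr_ready() {
  if [ -z "${OCR_HEALTH_URL:-}" ]; then
    echo "OCR_HEALTH_URL is not set" >&2
    return 1
  fi
  curl --silent --fail --output /dev/null "$OCR_HEALTH_URL"
}
```

For example, `wait_until 30 2 ocr_ready` polls up to 30 times, two seconds apart, and returns non-zero if the service never responds, so it can gate the ingestor restart with `&&`.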
NVIDIA-Hosted Deployment Configuration#
Prerequisites: Follow the deployment guide up to and including the step labelled “Start the vector db containers from the repo root.”
Configure API Endpoints:
export OCR_HTTP_ENDPOINT=https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-ocr
export OCR_INFER_PROTOCOL=http
export OCR_MODEL_NAME=scene_text_ensemble
Deploy Services: Continue with the remaining steps in the deployment guide to deploy ingestion-server and rag-server containers.
Test Document Ingestion: Use the ingestion API usage notebook to verify functionality.
Note
Default Behavior: Paddle OCR is the default OCR service and runs automatically when you start the NIMs. To use NeMo Retriever OCR instead, you must explicitly start it with the --profile nemoretriever-ocr flag.
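The two Docker Compose configurations above differ only in their environment variables, so switching between them can be wrapped in one helper. This is an illustrative sketch: the function name `set_ocr_env` is hypothetical, the values are copied from the export lines in this guide, and unsetting `OCR_GRPC_ENDPOINT` in the NVIDIA-hosted case is a choice made here to avoid a stale value, not something the guide prescribes.

```shell
#!/usr/bin/env bash
# Export the OCR settings listed in this guide for one deployment mode.
# Usage: set_ocr_env self-hosted|nvidia-hosted
set_ocr_env() {
  case "$1" in
    self-hosted)
      export OCR_GRPC_ENDPOINT=nemoretriever-ocr:8001
      export OCR_HTTP_ENDPOINT=http://nemoretriever-ocr:8000/v1/infer
      export OCR_INFER_PROTOCOL=grpc
      ;;
    nvidia-hosted)
      # ASSUMPTION: clear any leftover gRPC endpoint from a self-hosted run.
      unset OCR_GRPC_ENDPOINT
      export OCR_HTTP_ENDPOINT=https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-ocr
      export OCR_INFER_PROTOCOL=http
      ;;
    *)
      echo "usage: set_ocr_env self-hosted|nvidia-hosted" >&2
      return 1
      ;;
  esac
  # Both modes in this guide use the same model name.
  export OCR_MODEL_NAME=scene_text_ensemble
}
```

Source the file that defines the function (rather than executing it) so the exported variables reach the shell that later runs `docker compose`.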
Helm Deployment to Enable NeMo Retriever OCR#
To enable NeMo Retriever OCR with Helm, set the chart values that deploy NeMo Retriever OCR and disable Paddle OCR, which frees the resources Paddle OCR would otherwise use:
# Apply to a fresh deployment (recommended to uninstall existing deployments first)
# helm uninstall rag -n rag
helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/0648981100760671/charts/nvidia-blueprint-rag-v2.4.0-dev.tgz \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--set nv-ingest.paddleocr-nim.deployed=false \
--set nv-ingest.nemoretriever-ocr.deployed=true \
--set nv-ingest.envVars.OCR_MODEL_NAME="scene_text_ensemble" \
--set imagePullSecret.password=$NGC_API_KEY \
--set ngcApiSecret.password=$NGC_API_KEY
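The non-secret `--set` flags above can also be kept in a values file, which is easier to review and reuse. The sketch below is one possible arrangement, not the documented procedure: the file name `ocr-values.yaml` and the function `install_rag_with_ocr` are hypothetical, while the keys and the chart URL are taken from the command above. The two password values stay on the command line so the API key is not written to disk.

```shell
#!/usr/bin/env bash
# Hypothetical file name; any writable path works.
VALUES_FILE="${VALUES_FILE:-ocr-values.yaml}"

# YAML equivalent of the three non-secret --set flags in this guide.
cat > "$VALUES_FILE" <<'EOF'
nv-ingest:
  paddleocr-nim:
    deployed: false
  nemoretriever-ocr:
    deployed: true
  envVars:
    OCR_MODEL_NAME: "scene_text_ensemble"
EOF

# Defined but not called here; run it once NGC_API_KEY is exported.
install_rag_with_ocr() {
  helm upgrade --install rag -n rag \
    https://helm.ngc.nvidia.com/0648981100760671/charts/nvidia-blueprint-rag-v2.4.0-dev.tgz \
    --username '$oauthtoken' \
    --password "${NGC_API_KEY}" \
    -f "$VALUES_FILE" \
    --set imagePullSecret.password="${NGC_API_KEY}" \
    --set ngcApiSecret.password="${NGC_API_KEY}"
}
```

After calling `install_rag_with_ocr`, `kubectl get pods -n rag` lists the pods in the release namespace so you can confirm which OCR service was deployed.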
NeMo Retriever OCR Configuration Options#
Required Environment Variables#
| Variable | Description | Default | Required |
|---|---|---|---|
| OCR_GRPC_ENDPOINT | gRPC endpoint for OCR service | | Yes (on-premises) |
| OCR_HTTP_ENDPOINT | HTTP endpoint for OCR service | | Yes |
| OCR_INFER_PROTOCOL | Communication protocol | | Yes |
| OCR_MODEL_NAME | OCR model to use | | Yes |
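The requirements in the table can be checked before starting the ingestor server. The following sketch is an assumption-laden illustration rather than an official tool: `check_ocr_env` is a hypothetical function, and the rule that `OCR_INFER_PROTOCOL` must be `grpc` or `http` is inferred from the two values this guide uses.

```shell
#!/usr/bin/env bash
# Print each problem found in the current OCR_* settings.
# Returns non-zero when at least one problem is found.
check_ocr_env() {
  problems=0
  if [ -z "${OCR_HTTP_ENDPOINT:-}" ]; then
    echo "OCR_HTTP_ENDPOINT is required (the nv-ingest health check uses HTTP)" >&2
    problems=$((problems + 1))
  fi
  if [ -z "${OCR_MODEL_NAME:-}" ]; then
    echo "OCR_MODEL_NAME is required" >&2
    problems=$((problems + 1))
  fi
  case "${OCR_INFER_PROTOCOL:-}" in
    grpc)
      # The gRPC endpoint is only needed when gRPC is the chosen protocol.
      if [ -z "${OCR_GRPC_ENDPOINT:-}" ]; then
        echo "OCR_GRPC_ENDPOINT is required when OCR_INFER_PROTOCOL=grpc" >&2
        problems=$((problems + 1))
      fi
      ;;
    http) ;;
    *)
      # ASSUMPTION: only the two values shown in this guide are accepted.
      echo "OCR_INFER_PROTOCOL must be grpc or http" >&2
      problems=$((problems + 1))
      ;;
  esac
  [ "$problems" -eq 0 ]
}
```

Running `check_ocr_env && docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d` would then skip the restart whenever a required variable is missing.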
Available OCR Service Options#
The system supports two OCR service options:
NeMo Retriever OCR: Enhanced text extraction optimized for document processing
Paddle OCR: Default OCR service for general text extraction
Hardware Requirements and Support Matrix#
For detailed information about hardware requirements and supported GPUs, refer to the NeMo Retriever OCR Support Matrix.