Advanced Visual Parsing in NeMo Retriever Extraction
For scanned documents, or documents with complex layouts, we recommend that you use nemotron-parse, which provides higher-accuracy text extraction.
This documentation describes the following two methods to run NeMo Retriever extraction with nemotron-parse.
- Run the NIM locally by using Docker Compose
- Use NVIDIA Cloud Functions (NVCF) endpoints for cloud-based inference
Note
NeMo Retriever extraction is also known as NVIDIA Ingest and nv-ingest.
Limitations
Currently, the limitations to using nemotron-parse with NeMo Retriever Extraction are the following:
- Extraction with nemotron-parse only supports PDFs, not image files. For more information, refer to Troubleshoot Nemo Retriever Extraction.
- nemotron-parse is not supported on RTX Pro 6000, B200, or H200 NVL. For more information, refer to the Nemotron Parse Support Matrix.
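Because nemotron-parse extraction accepts only PDFs, it can help to separate PDF inputs from other files before you build an ingestion job. The following is a minimal sketch that uses only the Python standard library. The helper name split_pdf_inputs and the sample paths are hypothetical, and the check looks at the file extension only, not the file contents.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def split_pdf_inputs(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate PDF paths from everything else, based on file extension only.

    Hypothetical helper for illustration; it is not part of nv-ingest.
    """
    pdfs: List[str] = []
    others: List[str] = []
    for raw in paths:
        # Case-insensitive suffix check; file contents are not inspected.
        if Path(raw).suffix.lower() == ".pdf":
            pdfs.append(raw)
        else:
            others.append(raw)
    return pdfs, others


if __name__ == "__main__":
    # Sample paths are placeholders, not files shipped with nv-ingest.
    candidates = ["./data/report.pdf", "./data/scan.PNG", "./data/notes.PDF"]
    pdfs, skipped = split_pdf_inputs(candidates)
    print("use nemotron-parse for:", pdfs)
    print("handle another way:", skipped)
```

Files in the second list would need a different extraction method.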
Run the NIM Locally by Using Docker Compose
Use the following procedure to run the NIM locally.
Important
Due to limitations in available VRAM controls in the current release of nemotron-parse, it must run on a dedicated additional GPU. Edit docker-compose.yaml to set the device_ids value for nemotron-parse to a dedicated GPU, for example device_ids: ["1"] or higher.
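One way to confirm that the GPU is really dedicated is to check that no other service reserves the same device ID. The following sketch works on a Compose file that has already been parsed into a Python dictionary. It assumes the standard Compose GPU reservation layout (deploy, resources, reservations, devices, device_ids) and assumes the service is named nemotron-parse. The function names and the other service names in the example are hypothetical, and the actual docker-compose.yaml in your release might be organized differently.

```python
from typing import Dict, List, Set


def gpu_ids_by_service(compose: dict) -> Dict[str, Set[str]]:
    """Collect reserved GPU device_ids for each service in a parsed Compose mapping."""
    result: Dict[str, Set[str]] = {}
    for name, service in (compose.get("services") or {}).items():
        # Assumed layout: deploy.resources.reservations.devices[].device_ids
        devices = (
            (service or {})
            .get("deploy", {})
            .get("resources", {})
            .get("reservations", {})
            .get("devices", [])
        )
        ids: Set[str] = set()
        for device in devices:
            ids.update(str(i) for i in device.get("device_ids", []))
        if ids:
            result[name] = ids
    return result


def services_sharing_gpu(compose: dict, target: str = "nemotron-parse") -> List[str]:
    """Return other services that reserve any GPU also reserved by the target."""
    by_service = gpu_ids_by_service(compose)
    target_ids = by_service.get(target, set())
    return sorted(
        name
        for name, ids in by_service.items()
        if name != target and ids & target_ids
    )


if __name__ == "__main__":
    def gpu(ids):
        # Builds a hypothetical service entry that reserves the given GPUs.
        devices = [{"driver": "nvidia", "device_ids": ids, "capabilities": ["gpu"]}]
        return {"deploy": {"resources": {"reservations": {"devices": devices}}}}

    example = {"services": {"nemotron-parse": gpu(["1"]), "other-service": gpu(["0"])}}
    print(services_sharing_gpu(example))  # an empty list means the GPU is dedicated
```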
1. Start the nv-ingest services with the nemotron-parse profile. This profile includes the necessary components for extracting text and metadata from images. Use the following command.

       docker compose --profile nemotron-parse up

   - The --profile nemotron-parse flag ensures that vision-language retrieval services are launched. For more information, refer to Profile Information.

2. After the services are running, you can interact with nv-ingest by using Python.

   - The Ingestor object initializes the ingestion process.
   - The files method specifies the input files to process.
   - The extract method tells nv-ingest to use nemotron-parse for extracting text and metadata from images.
   - The document_type parameter is optional, because Ingestor should detect the file type automatically.

       from nv_ingest_client.client import Ingestor

       ingestor = (
           Ingestor()
           .files("./data/*.pdf")
           .extract(
               document_type="pdf",  # Ingestor should detect type automatically in most cases
               extract_method="nemotron_parse"
           )
       )

   Tip
   For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.
Using NVCF Endpoints for Cloud-Based Inference
Instead of running the nemotron-parse NIM locally, you can use NVCF to perform inference by using remote endpoints.
1. Set the authentication token in the .env file.

       NVIDIA_API_KEY=nvapi-...

2. Modify docker-compose.yaml to use the hosted nemotron-parse service.

       # build.nvidia.com hosted nemotron-parse
       - NEMOTRON_PARSE_HTTP_ENDPOINT=https://integrate.api.nvidia.com/v1/chat/completions
       #- NEMOTRON_PARSE_HTTP_ENDPOINT=http://nemotron-parse:8000/v1/chat/completions

3. Run inference by using Python.

   - The Ingestor object initializes the ingestion process.
   - The files method specifies the input files to process.
   - The extract method tells nv-ingest to use nemotron-parse for extracting text and metadata from images.
   - The document_type parameter is optional, because Ingestor should detect the file type automatically.

       from nv_ingest_client.client import Ingestor

       ingestor = (
           Ingestor()
           .files("./data/*.pdf")
           .extract(
               document_type="pdf",  # Ingestor should detect type automatically in most cases
               extract_method="nemotron_parse"
           )
       )

   Tip
   For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.
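Before you start the services against the hosted endpoint, a quick check of the two settings from this procedure can catch simple mistakes. The following sketch reads a .env file with a deliberately simple KEY=VALUE parser and reports likely problems. The helper names read_env_file and check_hosted_setup are hypothetical, and the nvapi- prefix check is an assumption based on the key format shown in this section. The check does not contact the service or validate the key itself.

```python
from pathlib import Path
from typing import Dict, List

# Endpoint values copied from the docker-compose.yaml lines in this section.
HOSTED_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
LOCAL_ENDPOINT = "http://nemotron-parse:8000/v1/chat/completions"


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes; no variable expansion is done.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_hosted_setup(env: Dict[str, str], endpoint: str) -> List[str]:
    """Return a list of likely problems; an empty list means the basics look right."""
    problems: List[str] = []
    key = env.get("NVIDIA_API_KEY", "")
    if not key:
        problems.append("NVIDIA_API_KEY is missing from the .env file")
    elif not key.startswith("nvapi-"):
        problems.append("NVIDIA_API_KEY does not start with 'nvapi-'")
    if endpoint != HOSTED_ENDPOINT:
        problems.append("NEMOTRON_PARSE_HTTP_ENDPOINT does not point to the hosted service")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    env = read_env_file(str(env_path)) if env_path.exists() else {}
    for problem in check_hosted_setup(env, HOSTED_ENDPOINT):
        print("warning:", problem)
```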