
Advanced Visual Parsing with Nemotron Parse

For scanned documents or documents with complex layouts, we recommend that you use nemotron-parse, which provides higher-accuracy text extraction.

This documentation describes the following three methods to run NeMo Retriever Library with nemotron-parse:

  • Run the NIM locally by using Docker Compose
  • Use NVIDIA Cloud Functions (NVCF) endpoints for cloud-based inference
  • Run the Ray batch pipeline with nemotron-parse (library mode)

Note

NVIDIA Ingest (nv-ingest) has been renamed NeMo Retriever Library.

Limitations

Currently, using nemotron-parse with NeMo Retriever Library has the following limitations:

Run the NIM Locally by Using Docker Compose

Use the following procedure to run the NIM locally.

Important

Due to limitations in the VRAM controls available in the current release of nemotron-parse, the NIM must run on a dedicated additional GPU. Edit docker-compose.yaml to set the device_ids value for the nemotron-parse service to a dedicated GPU, for example device_ids: ["1"] or higher.
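
For reference, a GPU reservation for the nemotron-parse service in docker-compose.yaml typically looks like the following sketch. The exact service name and surrounding keys depend on your compose file; this fragment only illustrates where device_ids is set.

```yaml
services:
  nemotron-parse:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]   # dedicated GPU, separate from the other NIMs
              capabilities: [gpu]
```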

  1. Start the retriever services with the nemotron-parse profile. This profile includes the necessary components for extracting text and metadata from images. Use the following command.

    • The --profile nemotron-parse flag ensures that vision-language retrieval services are launched. For more information, refer to Profile Information.
    docker compose --profile nemotron-parse up
    
  2. After the services are running, you can interact with the pipeline by using Python.

    • The Ingestor object initializes the ingestion process.
    • The files method specifies the input files to process.
    • The extract method tells the pipeline to use nemotron-parse for extracting text and metadata from images.
    • The document_type parameter is optional, because Ingestor should detect the file type automatically.
    # Assumes the NV-Ingest Python client package (nv-ingest-client) is installed.
    from nv_ingest_client.client import Ingestor

    ingestor = (
        Ingestor()
        .files("./data/*.pdf")
        .extract(
            document_type="pdf",  # Ingestor should detect type automatically in most cases
            extract_method="nemotron_parse"
        )
    )
    

    Tip

    For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.
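
The files method accepts glob patterns such as ./data/*.pdf. The following standard-library sketch, which does not require the ingestion services to be running, shows which files such a pattern selects:

```python
import glob
import os
import tempfile

# Create a scratch "data" directory with two PDFs and one non-PDF file.
root = tempfile.mkdtemp()
data_dir = os.path.join(root, "data")
os.makedirs(data_dir)
for name in ("report.pdf", "scan.pdf", "notes.txt"):
    open(os.path.join(data_dir, name), "w").close()

# The same kind of pattern passed to Ingestor.files(): only the PDFs match.
matched = sorted(glob.glob(os.path.join(data_dir, "*.pdf")))
print([os.path.basename(p) for p in matched])  # ['report.pdf', 'scan.pdf']
```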

Use NVCF Endpoints for Cloud-Based Inference

Instead of running the pipeline locally, you can use NVCF to perform inference by using remote endpoints.

  1. Set the authentication token in the .env file.

    NVIDIA_API_KEY=nvapi-...
    
  2. Modify docker-compose.yaml to use the hosted nemotron-parse service.

    # build.nvidia.com hosted nemotron-parse
    - NEMOTRON_PARSE_HTTP_ENDPOINT=https://integrate.api.nvidia.com/v1/chat/completions
    #- NEMOTRON_PARSE_HTTP_ENDPOINT=http://nemotron-parse:8000/v1/chat/completions
    
  3. Run inference by using Python.

    • The Ingestor object initializes the ingestion process.
    • The files method specifies the input files to process.
    • The extract method tells the pipeline to use nemotron-parse for extracting text and metadata from images.
    • The document_type parameter is optional, because Ingestor should detect the file type automatically.
    # Assumes the NV-Ingest Python client package (nv-ingest-client) is installed.
    from nv_ingest_client.client import Ingestor

    ingestor = (
        Ingestor()
        .files("./data/*.pdf")
        .extract(
            document_type="pdf",  # Ingestor should detect type automatically in most cases
            extract_method="nemotron_parse"
        )
    )
    

    Tip

    For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.
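
The pipeline selects the nemotron-parse backend through the NEMOTRON_PARSE_HTTP_ENDPOINT environment variable shown in step 2. The following sketch illustrates the toggle between the hosted and local endpoints; the fallback default here is an assumption, since the actual default is whatever docker-compose.yaml sets.

```python
import os

# Local NIM endpoint (the commented-out line in docker-compose.yaml).
LOCAL_ENDPOINT = "http://nemotron-parse:8000/v1/chat/completions"
# build.nvidia.com hosted endpoint.
HOSTED_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"

# Point at the hosted service; remove this line to fall back to the local NIM.
os.environ["NEMOTRON_PARSE_HTTP_ENDPOINT"] = HOSTED_ENDPOINT

endpoint = os.environ.get("NEMOTRON_PARSE_HTTP_ENDPOINT", LOCAL_ENDPOINT)
print(endpoint)
```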