Use NeMo Retriever Parse

Use NeMo Retriever Extraction with nemoretriever-parse

This documentation describes two methods to run NeMo Retriever extraction with nemoretriever-parse.

- Run the NIM locally by using Docker Compose
- Use NVIDIA Cloud Functions (NVCF) endpoints for cloud-based inference

Note

NeMo Retriever extraction is also known as NVIDIA Ingest and nv-ingest.

Currently, extraction with nemoretriever-parse supports only PDFs, not image files. For more information, see Troubleshoot NeMo Retriever Extraction.
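Because only PDFs are supported, it can help to filter an explicit list of input paths before handing it to the client. The following is a minimal sketch that uses only the Python standard library. The helper name `select_pdfs` is hypothetical and is not part of nv-ingest, and the check looks only at the file extension, not at the file contents.

```python
from pathlib import Path


def select_pdfs(paths):
    """Split paths into (pdfs, skipped) based on the file extension only.

    Hypothetical helper for illustration; not part of nv-ingest.
    """
    pdfs, skipped = [], []
    for path in map(Path, paths):
        # Case-insensitive suffix check; file contents are not inspected.
        if path.suffix.lower() == ".pdf":
            pdfs.append(str(path))
        else:
            skipped.append(str(path))
    return pdfs, skipped


pdfs, skipped = select_pdfs(["report.pdf", "scan.PNG", "notes.PDF"])
print("PDFs to submit:", pdfs)
print("Skipped (not PDF):", skipped)
```

The `pdfs` list can then be reviewed before submission, and the `skipped` list shows which inputs would need a different extraction method.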

Run the NIM Locally by Using Docker Compose

Use the following procedure to run the NIM locally.

Important

Due to limitations in the available VRAM controls in the current release of nemoretriever_parse, it must run on its own dedicated GPU. Edit docker-compose.yaml and set the device_ids value for nemoretriever_parse to a dedicated GPU, for example device_ids: ["1"] or higher.
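Before you edit the device setting, you might want to confirm that the host actually has a second GPU. The following sketch parses the output of `nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader`. The function names are hypothetical, and the sketch assumes that the GPU indices reported by nvidia-smi match the IDs that Docker uses, which you should verify on your system.

```python
import subprocess


def parse_gpu_list(csv_text):
    """Parse 'index, name, memory.total' rows into a list of dicts."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        index, name, memory = [field.strip() for field in line.split(",", 2)]
        gpus.append({"index": index, "name": name, "memory": memory})
    return gpus


def list_gpus():
    """Query nvidia-smi on the host; requires the NVIDIA driver tools."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_list(result.stdout)


def has_dedicated_gpu(gpus, device_id="1"):
    """True if more than one GPU exists and device_id is among them."""
    return len(gpus) > 1 and any(gpu["index"] == device_id for gpu in gpus)


# Illustrative sample text, not real output from any particular machine.
sample = "0, Example GPU A, 81920 MiB\n1, Example GPU B, 81920 MiB\n"
print(has_dedicated_gpu(parse_gpu_list(sample), device_id="1"))
```

On a real host, call `has_dedicated_gpu(list_gpus())` instead of using the sample text.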

Start the nv-ingest services with the nemoretriever-parse profile. This profile includes the necessary components for extracting text and metadata from images. Use the following command.
- The --profile nemoretriever-parse flag ensures that vision-language retrieval services are launched. For more information, refer to Profile Information.
- The --build flag ensures that any changes to the container images are applied before starting.
```
docker compose --profile nemoretriever-parse up --build
```
After the services are running, you can interact with nv-ingest by using Python.
- The Ingestor object initializes the ingestion process.
- The files method specifies the input files to process.
- The extract method tells nv-ingest to use nemoretriever-parse for extracting text and metadata from images.
- The document_type parameter is optional, because Ingestor should detect the file type automatically.
```
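# Import the Ingestor class from the nv-ingest Python client package.
from nv_ingest_client.client import Ingestor

# Configure extraction for all PDFs in ./data with nemoretriever-parse.
ingestor = (
    Ingestor()
    .files("./data/*.pdf")
    .extract(
        document_type="pdf",  # Ingestor should detect type automatically in most cases
        extract_method="nemoretriever_parse"
    )
)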
```
Tip

For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.

Use NVCF Endpoints for Cloud-Based Inference

Instead of running the nemoretriever-parse NIM locally, you can use NVCF to perform inference through remote endpoints.

Set the authentication token in the .env file.
```
NVIDIA_BUILD_API_KEY=nvapi-...
```
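To catch a missing or mistyped token before you start the services, you can check the .env file with a short script. The following is a simplified sketch: the helper names are hypothetical, the parser handles only plain `KEY=value` lines (Docker Compose accepts more syntax than this), and the `nvapi-` prefix check is an assumption based on the placeholder shown above.

```python
def read_env_value(env_text, key):
    """Return the value of key from .env-style text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip('"').strip("'")
    return None


def looks_like_api_key(value):
    """Assumption: a usable key starts with 'nvapi-' and has more after it."""
    return bool(value) and value.startswith("nvapi-") and len(value) > len("nvapi-")


# Placeholder text for illustration; read your real .env file instead.
example_env = "# credentials\nNVIDIA_BUILD_API_KEY=nvapi-example\n"
token = read_env_value(example_env, "NVIDIA_BUILD_API_KEY")
print("Token present and well-formed:", looks_like_api_key(token))
```

To check a real file, pass the result of `open(".env").read()` as `env_text`.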

Modify docker-compose.yaml to use the hosted nemoretriever-parse service.

```
# build.nvidia.com hosted nemoretriever-parse
- NEMORETRIEVER_PARSE_HTTP_ENDPOINT=https://integrate.api.nvidia.com/v1/chat/completions
#- NEMORETRIEVER_PARSE_HTTP_ENDPOINT=http://nemoretriever-parse:8000/v1/chat/completions
```
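The two endpoint values above differ in scheme, host, and port, while the request path stays the same. The following sketch uses the standard library to make that difference explicit. The function name `describe_endpoint` and the `hosted` flag are hypothetical and only illustrate how you might confirm which endpoint a configuration points to.

```python
from urllib.parse import urlparse

# Endpoint values copied from the docker-compose.yaml snippet above.
HOSTED_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
LOCAL_ENDPOINT = "http://nemoretriever-parse:8000/v1/chat/completions"


def describe_endpoint(url):
    """Break an endpoint URL into parts and flag the hosted service."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL has no explicit port
        "path": parts.path,
        "hosted": parts.hostname == "integrate.api.nvidia.com",
    }


for endpoint in (HOSTED_ENDPOINT, LOCAL_ENDPOINT):
    print(describe_endpoint(endpoint))
```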

Run inference by using Python.
- The Ingestor object initializes the ingestion process.
- The files method specifies the input files to process.
- The extract method tells nv-ingest to use nemoretriever-parse for extracting text and metadata from images.
- The document_type parameter is optional, because Ingestor should detect the file type automatically.
```
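# Import the Ingestor class from the nv-ingest Python client package.
from nv_ingest_client.client import Ingestor

# Configure extraction for all PDFs in ./data with nemoretriever-parse.
ingestor = (
    Ingestor()
    .files("./data/*.pdf")
    .extract(
        document_type="pdf",  # Ingestor should detect type automatically in most cases
        extract_method="nemoretriever_parse"
    )
)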
```
Tip

For more Python examples, refer to NV-Ingest: Python Client Quick Start Guide.