Environment Variables for NeMo Retriever Extraction
Use the following environment variables to configure NeMo Retriever extraction. You can set them in your .env file or directly in your environment.
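As a rough illustration of the .env option, the sketch below parses simple KEY=VALUE lines and copies them into an environment mapping. The helpers `parse_env_text` and `apply_env` are hypothetical names and are not part of NeMo Retriever extraction. Leaving already-set variables untouched is an assumption of this sketch, not documented behavior.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; skip blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def apply_env(values, environ=None):
    """Copy parsed values into environ without overriding existing keys.

    Assumption: variables that are already set win over the .env file.
    """
    environ = os.environ if environ is None else environ
    for key, value in values.items():
        environ.setdefault(key, value)
    return environ


# Hypothetical .env contents, using example values from this page.
sample = """
# Message broker settings
INGEST_LOG_LEVEL=INFO
MESSAGE_CLIENT_HOST=redis
MESSAGE_CLIENT_PORT=7670
"""

# A plain dict stands in for the real environment so the demo has no side effects.
env = apply_env(parse_env_text(sample), environ={"INGEST_LOG_LEVEL": "DEBUG"})
print(env)
```

Passing a plain dictionary instead of `os.environ` keeps the example free of side effects. Omit the `environ` argument to write into the real process environment.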
Note
NeMo Retriever extraction is also known as NVIDIA Ingest and nv-ingest.
General Environment Variables
| Name | Example | Description |
|---|---|---|
| DOWNLOAD_LLAMA_TOKENIZER | - | The Llama tokenizer is now pre-downloaded at build time. For details, refer to Token-Based Splitting. |
| HF_ACCESS_TOKEN | - | A token to access HuggingFace models. For details, refer to Token-Based Splitting. |
| INGEST_LOG_LEVEL | DEBUG, INFO, WARNING, ERROR, CRITICAL | The log level for the ingest service, which controls the verbosity of the logging output. |
| MESSAGE_CLIENT_HOST | redis, localhost, 192.168.1.10 | Specifies the hostname or IP address of the message broker used for communication between services. |
| MESSAGE_CLIENT_PORT | 7670, 6379 | Specifies the port number on which the message broker is listening. |
| MINIO_BUCKET | nv-ingest | Name of MinIO bucket, used to store image, table, and chart extractions. |
| NGC_API_KEY | nvapi-************* | An authorized NGC API key, used to interact with hosted NIMs. To create an NGC key, go to https://org.ngc.nvidia.com/setup/api-keys. |
| NIM_NGC_API_KEY | - | The key that NIM microservices inside docker containers use to access NGC resources. This is necessary only in some cases when it is different from NGC_API_KEY. If this is not specified, NGC_API_KEY is used to access NGC resources. |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://otel-collector:4317 | The endpoint for the OpenTelemetry exporter, used for sending telemetry data. |
| REDIS_INGEST_TASK_QUEUE | ingest_task_queue | The name of the task queue in Redis where tasks are stored and processed. |
| IMAGE_STORAGE_URI | s3://nv-ingest/artifacts/store/images | Default fsspec-compatible URI for the store task. Supports s3://, file://, gs://, etc. See Store Extracted Images. |
| IMAGE_STORAGE_PUBLIC_BASE_URL | https://assets.example.com/images | Optional HTTP(S) base URL for serving stored images. |
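
The sketch below shows one way a script might read a few of the general variables and apply the NIM_NGC_API_KEY fallback described in the table. The helper `read_general_settings` is a hypothetical name. The fallback values it uses come from the Example column and are assumptions, not documented defaults. Upper-casing the log level is also an assumption.

```python
import os

# Log levels listed in the Example column for INGEST_LOG_LEVEL.
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def read_general_settings(environ=None):
    """Collect a few general variables into a plain dict (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    # Assumption: fall back to example values when a variable is unset.
    log_level = environ.get("INGEST_LOG_LEVEL", "INFO").upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported INGEST_LOG_LEVEL: {log_level!r}")

    port_text = environ.get("MESSAGE_CLIENT_PORT", "7670")
    try:
        port = int(port_text)
    except ValueError:
        raise ValueError(f"Invalid MESSAGE_CLIENT_PORT: {port_text!r}") from None
    if not 0 < port < 65536:
        raise ValueError(f"MESSAGE_CLIENT_PORT out of range: {port}")

    ngc_key = environ.get("NGC_API_KEY")
    return {
        "log_level": log_level,
        "broker_host": environ.get("MESSAGE_CLIENT_HOST", "localhost"),
        "broker_port": port,
        "ngc_api_key": ngc_key,
        # NIM_NGC_API_KEY falls back to NGC_API_KEY when unset (or empty).
        "nim_ngc_api_key": environ.get("NIM_NGC_API_KEY") or ngc_key,
    }


# Demo with a plain dict instead of the real environment.
settings = read_general_settings(
    {"NGC_API_KEY": "nvapi-example", "MESSAGE_CLIENT_PORT": "6379"}
)
print(settings["broker_port"], settings["nim_ngc_api_key"])
```

Validating the values once at startup surfaces a mistyped log level or port before any service tries to connect to the message broker.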
Library Mode Environment Variables
These environment variables apply only when you run NeMo Retriever extraction in library mode.
| Name | Example | Description |
|---|---|---|
| NVIDIA_API_KEY | nvapi-************* | API key for NVIDIA-hosted NIM services. |
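
Before starting a library-mode run, a script might confirm that NVIDIA_API_KEY is present and avoid printing it in full. The sketch below is a minimal illustration. `require_nvidia_api_key` and `mask_key` are hypothetical helpers, not part of the library, and the six-character visible prefix is an arbitrary choice.

```python
import os


def require_nvidia_api_key(environ=None):
    """Return NVIDIA_API_KEY, or raise if it is missing or blank."""
    environ = os.environ if environ is None else environ
    key = environ.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NVIDIA_API_KEY is not set; export it or add it to your .env file."
        )
    return key


def mask_key(key, visible=6):
    """Hide all but the first few characters so the key is safe to log."""
    return key[:visible] + "*" * max(len(key) - visible, 0)


# Demo with a made-up key in a plain dict instead of the real environment.
key = require_nvidia_api_key({"NVIDIA_API_KEY": "nvapi-abcdefg"})
print(mask_key(key))
```

Failing early with a clear message is usually easier to debug than an authentication error returned later by a hosted service.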