Deploy Without Containers (Library Mode) for NeMo Retriever Library
Note
NVIDIA Ingest (nv-ingest) has been renamed NeMo Retriever Library.
Use the Quick Start for NeMo Retriever Library to set up and run the NeMo Retriever Library locally, so you can build a GPU‑accelerated, multimodal RAG ingestion pipeline that parses PDFs, HTML, text, audio, and video into LanceDB vector embeddings, integrates with Nemotron RAG models (locally or via NIM endpoints), which includes Ray‑based scaling with built‑in recall evaluation. Python 3.12 or later is required (see Prerequisites).
run_pipeline
The primary Python entry point for launching the Ray-based ingestion pipeline in library mode is run_pipeline in nv_ingest.framework.orchestration.ray.util.pipeline.pipeline_runners.
from nv_ingest.framework.orchestration.ray.util.pipeline.pipeline_runners import run_pipeline
Parameters
The following table matches the function signature in source (defaults and optionality). None of these parameters are required in the sense of having no default; omit them to use the defaults shown.
| Parameter | Required | Type (default) | Description |
|---|---|---|---|
pipeline_config |
No | Optional[PipelineConfigSchema] (None) |
Validated pipeline configuration. If None and libmode=True, the default library-mode pipeline is loaded automatically. If None and libmode=False, a ValueError is raised—you must pass a configuration. |
block |
No | bool (True) |
If True, the call blocks until the pipeline finishes. If False, returns immediately with a handle object (see Return type). |
disable_dynamic_scaling |
No | Optional[bool] (None) |
If set, overrides the same field from the pipeline configuration. |
dynamic_memory_threshold |
No | Optional[float] (None) |
If set, overrides the same field from the pipeline configuration. |
run_in_subprocess |
No | bool (False) |
If True, runs the pipeline in a separate Python subprocess (multiprocessing.Process). If False, runs in the current process. |
stdout |
No | Optional[TextIO] (None) |
When using a subprocess, optional stream for child stdout; if None, stdout is discarded. |
stderr |
No | Optional[TextIO] (None) |
When using a subprocess, optional stream for child stderr; if None, stderr is discarded. |
libmode |
No | bool (True) |
If True and pipeline_config is None, loads the default library-mode pipeline. If False, pipeline_config must be provided. |
quiet |
No | Optional[bool] (None) |
If True, reduces logging noise for library use. If None, defaults to True when libmode=True. |
Return type
run_pipeline returns a union of three possible types, depending on block and run_in_subprocess:
| Mode | Return type | Notes |
|---|---|---|
In-process, block=True |
float |
Elapsed time in seconds. |
In-process, block=False |
RayPipelineInterface |
Handle to control the in-process pipeline (defined in nv_ingest.framework.orchestration.ray.primitives.ray_pipeline). |
Subprocess, block=False |
RayPipelineSubprocessInterface |
Handle to control the subprocess-based pipeline (same module). This is not RayPipelineInterface; the two classes are separate implementations of PipelineInterface. Use isinstance(..., RayPipelineSubprocessInterface) when you launch with run_in_subprocess=True and block=False. |
Subprocess, block=True |
float |
Returns 0.0 when blocking in subprocess mode. |
For the authoritative contract (including raised exceptions), refer to the docstring on run_pipeline in src/nv_ingest/framework/orchestration/ray/util/pipeline/pipeline_runners.py.