nemo_retriever.model package#

Subpackages#

Submodules#

nemo_retriever.model.model module#

class nemo_retriever.model.model.BaseModel[source]#

Bases: ABC

Abstract base class for all models.

This class must never be instantiated directly.

abstract property input: Any#

Input schema or object.

abstract property input_batch_size: int#

Maximum or default input batch size.

abstract property model_name: str#

Human-readable model name.

abstract property model_runmode: Literal['local', 'NIM', 'build-endpoint']#

Execution mode: local, NIM, or build-endpoint.

abstract property model_type: str#

Model category/type (e.g. llm, vision, embedding).

abstract property output: Any#

Output schema or object.
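The abstract-property contract above can be illustrated with a minimal sketch. The BaseModel below is a stand-in that mirrors the six documented properties so the example runs without nemo_retriever installed; ToyEmbedder and all of its return values are hypothetical and not part of the package.

```python
from abc import ABC, abstractmethod
from typing import Any, Literal


class BaseModel(ABC):
    """Stand-in mirroring the documented abstract properties (not the real class)."""

    @property
    @abstractmethod
    def model_name(self) -> str: ...

    @property
    @abstractmethod
    def model_type(self) -> str: ...

    @property
    @abstractmethod
    def model_runmode(self) -> Literal["local", "NIM", "build-endpoint"]: ...

    @property
    @abstractmethod
    def input(self) -> Any: ...

    @property
    @abstractmethod
    def output(self) -> Any: ...

    @property
    @abstractmethod
    def input_batch_size(self) -> int: ...


class ToyEmbedder(BaseModel):
    """Hypothetical concrete model; every value here is illustrative."""

    model_name = property(lambda self: "toy-embedder")
    model_type = property(lambda self: "embedding")
    model_runmode = property(lambda self: "local")
    input = property(lambda self: "list[str]")
    output = property(lambda self: "list[list[float]]")
    input_batch_size = property(lambda self: 8)


# Instantiating the abstract base fails because its properties are unimplemented.
instantiation_error = None
try:
    BaseModel()
except TypeError as exc:
    instantiation_error = exc

toy = ToyEmbedder()  # works: all abstract properties are overridden
```

A subclass that leaves any one of the six properties unimplemented fails at instantiation in the same way, which is how the "never instantiated directly" rule is enforced by ABC.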

class nemo_retriever.model.model.HuggingFaceModel(model_id: str)[source]#

Bases: BaseModel

Abstract base class for all HuggingFace models.

abstract property input_shape: Tuple[int, int]#

Input shape.

abstract property model: nn.Module#

Model instance.

abstract property model_dir: str#

Model directory.

class nemo_retriever.model.model.NvidiaNIMModel(model_id: str)[source]#

Bases: BaseModel

Abstract base class for all Nvidia NIM models.

abstract property model_dir: str#

Model directory.

Module contents#

nemo_retriever.model.create_local_embedder(
model_name: str | None = None,
*,
backend: str = 'vllm',
device: str | None = None,
hf_cache_dir: str | None = None,
gpu_memory_utilization: float = 0.45,
enforce_eager: bool = False,
dimensions: int | None = None,
normalize: bool = True,
max_length: int = 8192,
query_max_length: int = 128,
) → Any[source]#

Create the appropriate local embedding model (VL or non-VL).

backend must be "vllm" or "hf".

For non-VL models:

  • backend="vllm" (default): vLLM via LlamaNemotronEmbed1BV2Embedder.

  • backend="hf": HuggingFace via LlamaNemotronEmbed1BV2HFEmbedder.

For VL models:

  • backend="vllm" (default): vLLM via LlamaNemotronEmbedVL1BV2VLLMEmbedder.

  • backend="hf": HuggingFace via LlamaNemotronEmbedVL1BV2Embedder.

device applies only to HuggingFace paths. On vLLM paths, device is still accepted for compatibility but is deprecated and ignored (vLLM placement is process-level); passing it emits a DeprecationWarning.

Note: gpu_memory_utilization, enforce_eager, dimensions, normalize, and max_length apply to vLLM paths only; the HF VL path ignores them.
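The dispatch rules described above can be summarized as a small lookup table. This is only a sketch of the documented behavior, not the package's implementation: pick_embedder_class is a hypothetical helper, the class names are returned as strings so the example runs without nemo_retriever installed, and the warning text is an assumption.

```python
import warnings
from typing import Optional

# (is_vl, backend) -> embedder class name, as listed in the documentation above.
_EMBEDDER_BY_KIND = {
    (False, "vllm"): "LlamaNemotronEmbed1BV2Embedder",
    (False, "hf"): "LlamaNemotronEmbed1BV2HFEmbedder",
    (True, "vllm"): "LlamaNemotronEmbedVL1BV2VLLMEmbedder",
    (True, "hf"): "LlamaNemotronEmbedVL1BV2Embedder",
}


def pick_embedder_class(is_vl: bool, backend: str = "vllm",
                        device: Optional[str] = None) -> str:
    """Hypothetical helper: return the class name the factory would dispatch to."""
    if backend not in ("vllm", "hf"):
        raise ValueError(f'backend must be "vllm" or "hf", got {backend!r}')
    if backend == "vllm" and device is not None:
        # Documented behavior: device is ignored on vLLM paths and deprecated.
        warnings.warn("device is ignored on vLLM paths",
                      DeprecationWarning, stacklevel=2)
    return _EMBEDDER_BY_KIND[(is_vl, backend)]
```

In the real factory, whether a model is VL is presumably decided from model_name (see is_vl_embed_model below); the boolean argument here just keeps the sketch self-contained.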

nemo_retriever.model.create_local_query_embedder(
model_name: str | None = None,
*,
backend: str = 'hf',
device: str | None = None,
hf_cache_dir: str | None = None,
gpu_memory_utilization: float = 0.45,
enforce_eager: bool = False,
dimensions: int | None = None,
normalize: bool = True,
max_length: int = 8192,
query_max_length: int = 128,
) → Any[source]#

Create a local embedder for query vectors in retrieval (Retriever / recall).

backend must be "hf" (default) or "vllm".

  • backend="hf": HuggingFace for both VL and non-VL models.

  • backend="vllm": vLLM for both VL and non-VL models.

nemo_retriever.model.create_local_reranker(
model_name: str | None = None,
*,
device: str | None = None,
hf_cache_dir: str | None = None,
backend: str = 'vllm',
gpu_memory_utilization: float = 0.5,
) → BaseModel[source]#

Create the appropriate local reranker model (VL or text-only).

Dispatches to NemotronRerankVLV2VLLM (default) or NemotronRerankVLV2 when model_name matches a VL reranker ID, depending on backend. Otherwise returns the text-only NemotronRerankV2.

Parameters:
  • backend – "vllm" (default) uses vLLM’s pooling runner for the VL reranker. "hf" uses HuggingFace AutoModelForSequenceClassification. Only affects VL reranker dispatch; the text-only reranker always uses HuggingFace.

  • gpu_memory_utilization – Fraction of GPU memory for the vLLM engine (only used when backend is "vllm").
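The reranker dispatch described above reduces to two decisions: whether the model is a VL reranker, and, only in that case, which backend was requested. The sketch below encodes just that; pick_reranker_class is a hypothetical helper that returns class names as strings, and backend validation is omitted because the text does not describe it for this function.

```python
def pick_reranker_class(is_vl: bool, backend: str = "vllm") -> str:
    """Hypothetical helper: return the reranker class name the factory would pick."""
    if not is_vl:
        # Text-only reranker: always HuggingFace, backend has no effect.
        return "NemotronRerankV2"
    # VL reranker: vLLM pooling runner by default, HuggingFace when backend="hf".
    return "NemotronRerankVLV2VLLM" if backend == "vllm" else "NemotronRerankVLV2"
```

Note that for a text-only model the result is the same for either backend value, which matches the statement that backend only affects VL reranker dispatch.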

nemo_retriever.model.is_vl_embed_model(model_name: str | None) → bool[source]#

Return True if model_name refers to the VL embedding model.

nemo_retriever.model.is_vl_rerank_model(model_name: str | None) → bool[source]#

Return True if model_name refers to the VL reranker model.

nemo_retriever.model.normalize_backend(
value: str | None,
valid: frozenset[str],
*,
field_name: str,
default: str,
) → str[source]#

Normalize value (strip + lowercase) and validate against valid.

Raises ValueError referencing field_name on invalid input. Falsy value is replaced by default before validation.
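The normalization contract above is small enough to sketch in full. The function below is an illustrative reimplementation written from the description, not the package's source: the name normalize_backend_sketch, the error message wording, and the choice to apply the default before stripping are all assumptions.

```python
from typing import FrozenSet, Optional


def normalize_backend_sketch(value: Optional[str], valid: FrozenSet[str], *,
                             field_name: str, default: str) -> str:
    """Illustrative version of the documented strip + lowercase + validate steps."""
    # Falsy input (None or "") falls back to the default before validation.
    candidate = value if value else default
    normalized = candidate.strip().lower()
    if normalized not in valid:
        # The error names the offending field, as the documentation requires.
        raise ValueError(
            f"{field_name} must be one of {sorted(valid)}, got {value!r}"
        )
    return normalized


BACKENDS = frozenset({"vllm", "hf"})
normalized = normalize_backend_sketch(" VLLM ", BACKENDS,
                                      field_name="backend", default="hf")
```

Under these assumptions, a whitespace-only string is not falsy, so it is stripped to an empty string and rejected rather than replaced by the default; the real function may treat that case differently.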

nemo_retriever.model.resolve_embed_model(model_name: str | None) → str[source]#

Resolve a model name/alias to a full HF repo ID.

Returns _DEFAULT_EMBED_MODEL when model_name is None or empty.