nemo_retriever.model package
Subpackages
- nemo_retriever.model.local package
  - Submodules
    - nemo_retriever.model.local.llama_nemotron_embed_1b_v2_embedder module
    - nemo_retriever.model.local.llama_nemotron_embed_1b_v2_hf_embedder module
    - nemo_retriever.model.local.llama_nemotron_embed_vl_1b_v2_embedder module
    - nemo_retriever.model.local.nemotron_graphic_elements_v1 module
    - nemo_retriever.model.local.nemotron_ocr_v1 module
    - nemo_retriever.model.local.nemotron_ocr_v2 module
    - nemo_retriever.model.local.nemotron_page_elements_v3 module
    - nemo_retriever.model.local.nemotron_parse_v1_2 module
    - nemo_retriever.model.local.nemotron_rerank_v2 module
    - nemo_retriever.model.local.nemotron_rerank_vl_v2 module
    - nemo_retriever.model.local.nemotron_rerank_vl_v2_hf module
    - nemo_retriever.model.local.nemotron_table_structure_v1 module
    - nemo_retriever.model.local.nemotron_vlm_captioner module
    - nemo_retriever.model.local.parakeet_ctc_1_1b_asr module
  - Module contents
Submodules
nemo_retriever.model.model module
- class nemo_retriever.model.model.BaseModel
Bases: ABC
Abstract base class for all models.
This class must never be instantiated directly.
- abstract property input: Any
Input schema or object.
- abstract property input_batch_size: int
Maximum or default input batch size.
- abstract property model_name: str
Human-readable model name.
- abstract property model_runmode: Literal['local', 'NIM', 'build-endpoint']
Execution mode: local, NIM, or build-endpoint.
- abstract property model_type: str
Model category/type (e.g. llm, vision, embedding).
- abstract property output: Any
Output schema or object.
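To make the contract above concrete, here is a minimal sketch of what a subclass has to provide. The `BaseModel` below is a stand-in that only mirrors the six documented abstract properties, so the snippet runs without `nemo_retriever` installed; it is not the library's class. `ToyEmbedder` and every value it returns are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Literal


class BaseModel(ABC):
    """Stand-in with the six documented abstract properties (not the real class)."""

    @property
    @abstractmethod
    def input(self) -> Any: ...

    @property
    @abstractmethod
    def output(self) -> Any: ...

    @property
    @abstractmethod
    def input_batch_size(self) -> int: ...

    @property
    @abstractmethod
    def model_name(self) -> str: ...

    @property
    @abstractmethod
    def model_type(self) -> str: ...

    @property
    @abstractmethod
    def model_runmode(self) -> Literal["local", "NIM", "build-endpoint"]: ...


class ToyEmbedder(BaseModel):
    """Hypothetical concrete model; all returned values are illustrative."""

    @property
    def input(self) -> Any:
        return "list of strings"

    @property
    def output(self) -> Any:
        return "list of float vectors"

    @property
    def input_batch_size(self) -> int:
        return 32

    @property
    def model_name(self) -> str:
        return "toy-embedder"

    @property
    def model_type(self) -> str:
        return "embedding"

    @property
    def model_runmode(self) -> Literal["local", "NIM", "build-endpoint"]:
        return "local"


model = ToyEmbedder()  # works: every abstract property is overridden
```

Because the properties are abstract, Python raises `TypeError` on `BaseModel()` itself and on any subclass that leaves one of them unimplemented. In this stand-in, that is how direct instantiation is prevented.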
- class nemo_retriever.model.model.HuggingFaceModel(model_id: str)
Bases: BaseModel
Abstract base class for all HuggingFace models.
- abstract property input_shape: Tuple[int, int]
Input shape.
- abstract property model: nn.Module
Model instance.
- abstract property model_dir: str
Model directory.
Module contents
- nemo_retriever.model.create_local_embedder(
      model_name: str | None = None,
      *,
      backend: str = 'vllm',
      device: str | None = None,
      hf_cache_dir: str | None = None,
      gpu_memory_utilization: float = 0.45,
      enforce_eager: bool = False,
      dimensions: int | None = None,
      normalize: bool = True,
      max_length: int = 8192,
      query_max_length: int = 128,
  )
Create the appropriate local embedding model (VL or non-VL).
backend must be "vllm" or "hf".
For non-VL models:
- backend="vllm" (default): vLLM via LlamaNemotronEmbed1BV2Embedder.
- backend="hf": HuggingFace via LlamaNemotronEmbed1BV2HFEmbedder.
For VL models:
- backend="vllm" (default): vLLM via LlamaNemotronEmbedVL1BV2VLLMEmbedder.
- backend="hf": HuggingFace via LlamaNemotronEmbedVL1BV2Embedder.
device applies only to HuggingFace paths. For vLLM paths, device is forwarded for compatibility but is deprecated and ignored (vLLM placement is process-level); passing it emits a DeprecationWarning.
Note: gpu_memory_utilization, enforce_eager, dimensions, normalize, and max_length apply to vLLM paths only; the HF VL path ignores them.
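The dispatch rules above can be sketched as a lookup table. This is an illustration, not the library's implementation: the class names are taken from the description, while `pick_embedder_class_name`, its `is_vl` flag, and the message wording are hypothetical. It returns names as strings so it runs without `nemo_retriever` or vLLM installed.

```python
import warnings
from typing import Optional

# (is_vl, backend) -> documented class name. The table itself is an assumption
# about how the documented rules could be encoded.
_EMBEDDER_CLASS_NAMES = {
    (False, "vllm"): "LlamaNemotronEmbed1BV2Embedder",
    (False, "hf"): "LlamaNemotronEmbed1BV2HFEmbedder",
    (True, "vllm"): "LlamaNemotronEmbedVL1BV2VLLMEmbedder",
    (True, "hf"): "LlamaNemotronEmbedVL1BV2Embedder",
}


def pick_embedder_class_name(
    is_vl: bool, backend: str = "vllm", device: Optional[str] = None
) -> str:
    """Hypothetical helper: name the class the documented rules would select."""
    if backend not in ("vllm", "hf"):
        raise ValueError(f"backend must be 'vllm' or 'hf', got {backend!r}")
    if backend == "vllm" and device is not None:
        # Documented behavior: device is ignored on vLLM paths and deprecated.
        warnings.warn(
            "device is ignored for vLLM backends", DeprecationWarning, stacklevel=2
        )
    return _EMBEDDER_CLASS_NAMES[(is_vl, backend)]


text_default = pick_embedder_class_name(is_vl=False)
vl_hf = pick_embedder_class_name(is_vl=True, backend="hf", device="cuda:0")
```

In the real package the VL check is presumably derived from `model_name` (see `is_vl_embed_model`); the explicit `is_vl` flag here only keeps the sketch self-contained.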
- nemo_retriever.model.create_local_query_embedder(
      model_name: str | None = None,
      *,
      backend: str = 'hf',
      device: str | None = None,
      hf_cache_dir: str | None = None,
      gpu_memory_utilization: float = 0.45,
      enforce_eager: bool = False,
      dimensions: int | None = None,
      normalize: bool = True,
      max_length: int = 8192,
      query_max_length: int = 128,
  )
Create a local embedder for query vectors in retrieval (Retriever / recall).
backend must be "hf" (default) or "vllm".
- backend="hf": HuggingFace for both VL and non-VL models.
- backend="vllm": vLLM for both VL and non-VL models.
- nemo_retriever.model.create_local_reranker(
      model_name: str | None = None,
      *,
      device: str | None = None,
      hf_cache_dir: str | None = None,
      backend: str = 'vllm',
      gpu_memory_utilization: float = 0.5,
  )
Create the appropriate local reranker model (VL or text-only).
Dispatches to NemotronRerankVLV2VLLM (default) or NemotronRerankVLV2 when model_name matches a VL reranker ID, depending on backend. Otherwise returns the text-only NemotronRerankV2.
- Parameters:
  - backend – "vllm" (default) uses vLLM’s pooling runner for the VL reranker. "hf" uses HuggingFace AutoModelForSequenceClassification. Only affects VL reranker dispatch; the text-only reranker always uses HuggingFace.
  - gpu_memory_utilization – Fraction of GPU memory for the vLLM engine (only used when backend is "vllm").
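The reranker dispatch differs from the embedder dispatch in that backend matters only for VL models. A minimal sketch of that rule follows, with class names taken from the description; `pick_reranker_class_name` and its `is_vl` flag are hypothetical. Whether the real function validates backend for text-only models is not stated, so the sketch assumes it is simply not consulted on that path.

```python
def pick_reranker_class_name(is_vl: bool, backend: str = "vllm") -> str:
    """Hypothetical helper: name the class the documented rules would select."""
    if not is_vl:
        # Documented: the text-only reranker always uses HuggingFace.
        # Assumption: backend is not consulted on this path.
        return "NemotronRerankV2"
    if backend == "vllm":
        return "NemotronRerankVLV2VLLM"
    if backend == "hf":
        return "NemotronRerankVLV2"
    raise ValueError(f"backend must be 'vllm' or 'hf', got {backend!r}")


text_only = pick_reranker_class_name(is_vl=False, backend="vllm")
vl_default = pick_reranker_class_name(is_vl=True)
vl_hf_reranker = pick_reranker_class_name(is_vl=True, backend="hf")
```

In the package itself, the VL check presumably corresponds to `is_vl_rerank_model(model_name)`.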
- nemo_retriever.model.is_vl_embed_model(model_name: str | None) -> bool
Return True if model_name refers to the VL embedding model.
- nemo_retriever.model.is_vl_rerank_model(model_name: str | None) -> bool
Return True if model_name refers to the VL reranker model.
- nemo_retriever.model.normalize_backend(
      value: str | None,
      valid: frozenset[str],
      *,
      field_name: str,
      default: str,
  )
Normalize value (strip + lowercase) and validate against valid.
Raises ValueError referencing field_name on invalid input. A falsy value is replaced by default before validation.
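As a rough illustration of the behavior described, the following sketch reimplements the documented steps under a different name so it is not mistaken for the library's code. The error message wording is an assumption, as is the return type (the normalized string).

```python
from typing import FrozenSet, Optional


def normalize_backend_sketch(
    value: Optional[str],
    valid: FrozenSet[str],
    *,
    field_name: str,
    default: str,
) -> str:
    """Illustrative stand-in for the documented normalize-then-validate steps."""
    # A falsy value (None or "") is replaced by the default before validation.
    candidate = value if value else default
    normalized = candidate.strip().lower()
    if normalized not in valid:
        raise ValueError(
            f"{field_name} must be one of {sorted(valid)}, got {value!r}"
        )
    return normalized


BACKENDS = frozenset({"vllm", "hf"})
from_messy = normalize_backend_sketch(
    "  HF ", BACKENDS, field_name="backend", default="vllm"
)
from_none = normalize_backend_sketch(
    None, BACKENDS, field_name="backend", default="vllm"
)
```

Note that in this sketch a whitespace-only string is truthy, so it is not replaced by the default and fails validation after stripping; the real function may treat that case differently.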