nemo_automodel._transformers.retrieval#

Encoder models for bi-encoder and cross-encoder tasks.

Module Contents#

Classes#

BiEncoderModel

Bi-encoder model that produces embeddings using a bidirectional backbone.

CrossEncoderModel

Cross-encoder model for scoring/classification tasks.

Functions#

_extract_submodel

Extract a nested submodel from a loaded model using a dotted attribute path.

_get_supported_backbone_class

Return the registered retrieval backbone class for a model type and task.

_move_to_extracted_dtype

Move a newly-built model to the dtype used by the extracted model.

_load_from_extracted_state

Load a target backbone from an extracted model’s in-memory state dict.

_build_backbone_from_extracted_submodel

Build a task-specific retrieval backbone from an extracted text submodel.

pool

Pool hidden states using the specified pooling method.

configure_encoder_metadata

Configure HuggingFace consolidated checkpoint metadata on a model.

build_encoder_backbone

Build an encoder backbone from a pretrained checkpoint.

save_encoder_pretrained

Save an encoder model to an output directory.

_init_encoder_common

Shared init for BiEncoderModel and CrossEncoderModel.

Data#

API#

nemo_automodel._transformers.retrieval.logger#

‘get_logger(…)’

nemo_automodel._transformers.retrieval._extract_submodel(
model: torch.nn.Module,
extract_submodel: str,
) transformers.PreTrainedModel#

Extract a nested submodel from a loaded model using a dotted attribute path.
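
As an illustration of what a dotted attribute path means here, the following minimal sketch resolves such a path with plain getattr. The helper name resolve_dotted_path and the stand-in objects are hypothetical; the actual _extract_submodel may validate its input or report errors differently.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_dotted_path(root, path):
    """Follow a dotted attribute path such as "model.language_model" from root."""
    try:
        return reduce(getattr, path.split("."), root)
    except AttributeError as err:
        raise AttributeError(
            f"Cannot resolve {path!r} on {type(root).__name__}: {err}"
        ) from err


# Stand-in for a loaded parent model that nests a text backbone (hypothetical layout).
text_backbone = SimpleNamespace(name="text-backbone")
parent = SimpleNamespace(model=SimpleNamespace(language_model=text_backbone))

extracted = resolve_dotted_path(parent, "model.language_model")
```

Each path segment is looked up on the result of the previous one, so a single-segment path such as "language_model" behaves like an ordinary attribute access.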

nemo_automodel._transformers.retrieval._get_supported_backbone_class(
model_type: str,
task: str,
) type[torch.nn.Module] | None#

Return the registered retrieval backbone class for a model type and task.

nemo_automodel._transformers.retrieval._move_to_extracted_dtype(
model: torch.nn.Module,
extracted_model: torch.nn.Module,
) torch.nn.Module#

Move a newly-built model to the dtype used by the extracted model.

nemo_automodel._transformers.retrieval._load_from_extracted_state(
backbone_class: type[transformers.PreTrainedModel],
config,
extracted_model: transformers.PreTrainedModel,
) transformers.PreTrainedModel#

Load a target backbone from an extracted model’s in-memory state dict.

nemo_automodel._transformers.retrieval._build_backbone_from_extracted_submodel(
extracted_model: transformers.PreTrainedModel,
task: str,
pooling: Optional[str],
num_labels: Optional[int],
temperature: Optional[float],
) transformers.PreTrainedModel#

Build a task-specific retrieval backbone from an extracted text submodel.

nemo_automodel._transformers.retrieval.pool(
last_hidden_states: torch.Tensor,
attention_mask: torch.Tensor,
pool_type: str,
) torch.Tensor#

Pool hidden states using the specified pooling method.

Parameters:
  • last_hidden_states – Hidden states from the model [batch_size, seq_len, hidden_size]

  • attention_mask – Attention mask [batch_size, seq_len]

  • pool_type – Type of pooling to apply

Returns:

Pooled embeddings [batch_size, hidden_size]
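
The shapes above can be made concrete with a small sketch of mask-aware average pooling. This assumes that an 'avg' pool type means the mean over non-padded positions, which is the conventional reading but is not confirmed by this page; the function name masked_mean_pool is hypothetical, and plain nested lists stand in for tensors so the example has no dependencies.

```python
def masked_mean_pool(last_hidden_states, attention_mask):
    """Average token vectors per sequence, skipping positions where the mask is 0.

    last_hidden_states: nested lists shaped [batch_size, seq_len, hidden_size]
    attention_mask:     nested lists shaped [batch_size, seq_len] with 0/1 entries
    """
    pooled = []
    for tokens, mask in zip(last_hidden_states, attention_mask):
        hidden_size = len(tokens[0])
        kept = [vec for vec, m in zip(tokens, mask) if m]
        count = max(len(kept), 1)  # avoid division by zero for fully masked rows
        pooled.append([sum(vec[d] for vec in kept) / count for d in range(hidden_size)])
    return pooled


hidden = [
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],  # last token is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
mask = [[1, 1, 0], [1, 1, 1]]
embeddings = masked_mean_pool(hidden, mask)  # shape [batch_size, hidden_size]
```

The padded token in the first sequence does not contribute to its embedding, which is the reason the attention mask is passed alongside the hidden states.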

nemo_automodel._transformers.retrieval.configure_encoder_metadata(
model: transformers.PreTrainedModel,
config,
) None#

Configure HuggingFace consolidated checkpoint metadata on a model.

Sets config.architectures unconditionally. For custom retrieval architectures registered in ModelRegistry, also writes config.auto_map so that the saved checkpoint can be reloaded via HuggingFace Auto classes. Standard HF models already have their own auto-resolution and do not need auto_map entries.

Parameters:
  • model – The backbone PreTrainedModel instance.

  • config – The model’s config object (typically model.config).
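
The behavior described above might look roughly like the sketch below. Everything in it is a stand-in: CUSTOM_ARCHITECTURES is a hypothetical substitute for a ModelRegistry lookup, the classes are dummies, and the choice of the "AutoModel" key and the value format are assumptions rather than what the library necessarily writes.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the set of custom architectures known to ModelRegistry.
CUSTOM_ARCHITECTURES = {"MyBidirectionalModel"}


def configure_metadata_sketch(model, config):
    """Always record the class name; add auto_map only for custom classes."""
    cls = type(model)
    config.architectures = [cls.__name__]  # set unconditionally
    if cls.__name__ in CUSTOM_ARCHITECTURES:
        # Assumed value shape: "<module file>.<class name>".
        module_file = cls.__module__.split(".")[-1]
        config.auto_map = {"AutoModel": f"{module_file}.{cls.__name__}"}


class MyBidirectionalModel:  # pretend custom retrieval backbone
    pass


class StandardModel:  # pretend standard HF model
    pass


custom_cfg, standard_cfg = SimpleNamespace(), SimpleNamespace()
configure_metadata_sketch(MyBidirectionalModel(), custom_cfg)
configure_metadata_sketch(StandardModel(), standard_cfg)
```

In this sketch only custom_cfg ends up with an auto_map attribute, mirroring the statement that standard HF models do not need one.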

nemo_automodel._transformers.retrieval.build_encoder_backbone(
model_name_or_path: str,
task: str,
trust_remote_code: bool = False,
pooling: Optional[str] = None,
extract_submodel: Optional[str] = None,
num_labels: Optional[int] = None,
temperature: Optional[float] = None,
**hf_kwargs,
) transformers.PreTrainedModel#

Build an encoder backbone from a pretrained checkpoint.

When extract_submodel is set, loads the parent model with HuggingFace Auto classes and extracts the dotted path. For supported extracted text backbones, it then builds the registered retrieval class for the requested task (bidirectional base model for "embedding", sequence-classification wrapper for "score"). For unsupported extracted text backbones, it returns the extracted model for "embedding" and wraps it with AutoModelForSequenceClassification for "score".

Without extract_submodel, model types listed in SUPPORTED_BACKBONES resolve to custom bidirectional classes from ModelRegistry; all other model types fall back to HuggingFace Auto classes.

Parameters:
  • model_name_or_path – Path or HuggingFace Hub identifier.

  • task – The encoder task (e.g. "embedding", "score").

  • trust_remote_code – Whether to allow custom remote code.

  • pooling – Bi-encoder pooling strategy for registry backbones (e.g. Llama bidirectional) that accept it on from_pretrained. Must not be forwarded to standard HF models (e.g. Qwen3) loaded via AutoModel; those only receive **hf_kwargs.

  • extract_submodel – Dotted attribute path to extract from the loaded model (e.g. "language_model" to extract the text backbone from a VLM).

  • num_labels – Number of labels for reranking/classification backbones.

  • temperature – Optional retrieval score temperature for custom retrieval backbones.

  • **hf_kwargs – Extra keyword arguments forwarded to from_pretrained.

Returns:

The constructed PreTrainedModel backbone.

Raises:

ValueError – If the task is unsupported for a known model type, or the architecture class is missing from ModelRegistry.
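
The branching described for build_encoder_backbone can be summarized as a small decision function. This is a sketch of the documented behavior only, not the library's code: the SUPPORTED mapping, its shape (model type to supported tasks), the "toy_decoder" model type, and the function name choose_loading_path are all hypothetical, and the returned strings merely label which branch would run.

```python
# Hypothetical registry shape: model type -> tasks with a registered retrieval class.
SUPPORTED = {"toy_decoder": {"embedding"}}


def choose_loading_path(model_type, task, extract_submodel=None):
    """Return a label for the loading branch that the description implies."""
    if extract_submodel is not None:
        # model_type here refers to the extracted text submodel.
        if task in SUPPORTED.get(model_type, set()):
            return "registry class built from extracted submodel"
        if task == "embedding":
            return "extracted submodel returned as-is"
        if task == "score":
            return "extracted submodel wrapped for sequence classification"
        raise ValueError(f"Unsupported task: {task!r}")
    if model_type in SUPPORTED:
        if task not in SUPPORTED[model_type]:
            raise ValueError(f"Task {task!r} is not supported for {model_type!r}")
        return "registry class"
    return "HuggingFace Auto class"


direct = choose_loading_path("toy_decoder", "embedding")
fallback = choose_loading_path("other", "score")
from_vlm = choose_loading_path("other", "embedding", extract_submodel="language_model")
```

Requesting a task that a known model type does not register, such as "score" for "toy_decoder" in this sketch, raises ValueError instead of silently falling back to an Auto class.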

nemo_automodel._transformers.retrieval.save_encoder_pretrained(
model: torch.nn.Module,
save_directory: str,
**kwargs,
) None#

Save an encoder model to an output directory.

If checkpointer is present in kwargs, delegates to Checkpointer.save_model for distributed/FSDP-safe saving. Otherwise falls back to the inner PreTrainedModel.save_pretrained.

The inner model is expected to be stored as model.model (the backbone wrapped by the encoder).

Parameters:
  • model – The encoder nn.Module (must have a .model attribute that is the PreTrainedModel backbone).

  • save_directory – Filesystem path where the checkpoint is written.

  • **kwargs –

    Optional keys:

    • checkpointer: a Checkpointer instance for distributed saves.

    • peft_config: PEFT configuration (forwarded to checkpointer).

    • tokenizer: tokenizer instance (forwarded to checkpointer).
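
The delegation described above can be sketched with stand-in objects. The RecordingCheckpointer class and the argument names passed to its save_model method are hypothetical, as is the choice to hand it the inner backbone; the real Checkpointer.save_model signature is not shown on this page, so treat this only as an outline of the control flow.

```python
class RecordingCheckpointer:
    """Stand-in for a Checkpointer; this save_model signature is hypothetical."""

    def __init__(self):
        self.calls = []

    def save_model(self, model, path, **extras):
        self.calls.append((model, path, extras))


class FakeBackbone:
    """Stand-in for the inner PreTrainedModel."""

    def __init__(self):
        self.saved_to = None

    def save_pretrained(self, save_directory):
        self.saved_to = save_directory


class FakeEncoder:
    def __init__(self):
        self.model = FakeBackbone()  # the backbone lives at .model


def save_encoder_sketch(encoder, save_directory, **kwargs):
    checkpointer = kwargs.get("checkpointer")
    if checkpointer is not None:
        # Distributed path: forward the optional extras to the checkpointer.
        checkpointer.save_model(
            encoder.model,
            save_directory,
            peft_config=kwargs.get("peft_config"),
            tokenizer=kwargs.get("tokenizer"),
        )
    else:
        # Fallback path: plain save_pretrained on the inner backbone.
        encoder.model.save_pretrained(save_directory)


plain = FakeEncoder()
save_encoder_sketch(plain, "out/plain")

ckpt = RecordingCheckpointer()
distributed = FakeEncoder()
save_encoder_sketch(distributed, "out/dist", checkpointer=ckpt, tokenizer="tok")
```

Only one of the two paths runs per call: when a checkpointer is supplied, the backbone's own save_pretrained is never invoked in this sketch.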

nemo_automodel._transformers.retrieval._LLAMA_TASKS#

None

nemo_automodel._transformers.retrieval._MINISTRAL3_BIDIREC_TASKS#

None

nemo_automodel._transformers.retrieval.SUPPORTED_BACKBONES#

None

nemo_automodel._transformers.retrieval._init_encoder_common(
encoder: torch.nn.Module,
model: transformers.PreTrainedModel,
) None#

Shared init for BiEncoderModel and CrossEncoderModel.

class nemo_automodel._transformers.retrieval.BiEncoderModel(
model: transformers.PreTrainedModel,
pooling: str = 'avg',
l2_normalize: bool = True,
do_distributed_inbatch_negative: bool = False,
)#

Bases: torch.nn.Module

Bi-encoder model that produces embeddings using a bidirectional backbone.

Initialization

_TASK#

‘embedding’

classmethod build(
model_name_or_path: str,
task: str = None,
pooling: str = 'avg',
l2_normalize: bool = True,
do_distributed_inbatch_negative: bool = False,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build bi-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#

encode(input_dict: dict) Optional[torch.Tensor]#

Encode inputs and return pooled embeddings.

Parameters:

input_dict – Tokenized inputs (input_ids, attention_mask, etc.)

Returns:

Embeddings [batch_size, hidden_dim], or None if input_dict is empty.

forward(
input_dict: dict = None,
**kwargs,
) Optional[torch.Tensor]#

Forward pass. Going through __call__ ensures that FSDP2 unshard hooks fire.
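
The reason invoking a module matters, as opposed to calling its forward method directly, can be shown with a toy class. ToyModule below is a hypothetical, dependency-free imitation of how torch.nn.Module runs registered pre-forward hooks from __call__; it is not FSDP2 itself, and the "unshard" hook is only a label.

```python
class ToyModule:
    """Tiny stand-in that runs pre-forward hooks in __call__, like torch.nn.Module."""

    def __init__(self):
        self.pre_hooks = []
        self.log = []

    def forward(self, x):
        self.log.append("forward")
        return x * 2

    def __call__(self, x):
        for hook in self.pre_hooks:
            hook(self)  # hooks run before forward
        return self.forward(x)


module = ToyModule()
module.pre_hooks.append(lambda m: m.log.append("unshard"))

module.forward(1)  # direct call: the hook is skipped
direct_log = list(module.log)

module.log.clear()
module(1)  # via __call__: the hook runs first
call_log = list(module.log)
```

After running this, direct_log contains only the forward entry while call_log records the hook before forward, which is the ordering a sharded module relies on to have its parameters gathered in time.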

class nemo_automodel._transformers.retrieval.CrossEncoderModel(model: transformers.PreTrainedModel)#

Bases: torch.nn.Module

Cross-encoder model for scoring/classification tasks.

Initialization

_TASK#

‘score’

classmethod build(
model_name_or_path: str,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build cross-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#

forward(
input_dict: dict = None,
**kwargs,
) Optional[torch.Tensor]#