nemo_automodel._transformers.retrieval#
Encoder models for bi-encoder and cross-encoder tasks.
Module Contents#
Classes#
BiEncoderModel | Bi-encoder model that produces embeddings using a bidirectional backbone.
CrossEncoderModel | Cross-encoder model for scoring/classification tasks.
Functions#
_extract_submodel | Extract a nested submodel from a loaded model using a dotted attribute path.
_get_supported_backbone_class | Return the registered retrieval backbone class for a model type and task.
_move_to_extracted_dtype | Move a newly-built model to the dtype used by the extracted model.
_load_from_extracted_state | Load a target backbone from an extracted model’s in-memory state dict.
_build_backbone_from_extracted_submodel | Build a task-specific retrieval backbone from an extracted text submodel.
pool | Pool hidden states using the specified pooling method.
configure_encoder_metadata | Configure HuggingFace consolidated checkpoint metadata on a model.
build_encoder_backbone | Build an encoder backbone from a pretrained checkpoint.
save_encoder_pretrained | Save an encoder model to an output directory.
_init_encoder_common | Shared init for BiEncoderModel and CrossEncoderModel.
Data#
API#
- nemo_automodel._transformers.retrieval.logger#
‘get_logger(…)’
- nemo_automodel._transformers.retrieval._extract_submodel(
    model: torch.nn.Module,
    extract_submodel: str,
  )
Extract a nested submodel from a loaded model using a dotted attribute path.
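A dotted attribute path can be resolved by walking one attribute at a time. The following is a minimal sketch of that idea, not the library's implementation: extract_by_dotted_path is a hypothetical helper, and SimpleNamespace objects stand in for real modules.

```python
from types import SimpleNamespace


def extract_by_dotted_path(root, dotted_path):
    """Follow a dotted attribute path such as "model.language_model" from root."""
    current = root
    for name in dotted_path.split("."):
        if not hasattr(current, name):
            raise AttributeError(
                f"{type(current).__name__} has no attribute {name!r} "
                f"(while resolving {dotted_path!r})"
            )
        current = getattr(current, name)
    return current


# Stand-in for a vision-language model whose text backbone is nested two levels deep.
text_backbone = SimpleNamespace(kind="text")
vlm = SimpleNamespace(model=SimpleNamespace(language_model=text_backbone))

extracted = extract_by_dotted_path(vlm, "model.language_model")
print(extracted is text_backbone)  # True
```

Failing early with the full path in the message makes a mistyped path easier to diagnose than a bare AttributeError on the last hop.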
- nemo_automodel._transformers.retrieval._get_supported_backbone_class(
    model_type: str,
    task: str,
  )
Return the registered retrieval backbone class for a model type and task.
- nemo_automodel._transformers.retrieval._move_to_extracted_dtype(
    model: torch.nn.Module,
    extracted_model: torch.nn.Module,
  )
Move a newly-built model to the dtype used by the extracted model.
- nemo_automodel._transformers.retrieval._load_from_extracted_state(
    backbone_class: type[transformers.PreTrainedModel],
    config,
    extracted_model: transformers.PreTrainedModel,
  )
Load a target backbone from an extracted model’s in-memory state dict.
- nemo_automodel._transformers.retrieval._build_backbone_from_extracted_submodel(
    extracted_model: transformers.PreTrainedModel,
    task: str,
    pooling: Optional[str],
    num_labels: Optional[int],
    temperature: Optional[float],
  )
Build a task-specific retrieval backbone from an extracted text submodel.
- nemo_automodel._transformers.retrieval.pool(
    last_hidden_states: torch.Tensor,
    attention_mask: torch.Tensor,
    pool_type: str,
  )
Pool hidden states using the specified pooling method.
- Parameters:
last_hidden_states – Hidden states from the model [batch_size, seq_len, hidden_size]
attention_mask – Attention mask [batch_size, seq_len]
pool_type – Type of pooling to apply
- Returns:
Pooled embeddings [batch_size, hidden_size]
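The accepted pool_type values are not listed here; BiEncoderModel defaults to 'avg'. As an illustration only, the sketch below assumes 'avg' means a mean over non-padding tokens. It uses nested Python lists in place of tensors so the arithmetic is visible, and masked_average_pool is a hypothetical name, not part of this module.

```python
def masked_average_pool(last_hidden_states, attention_mask):
    """Average token vectors per sequence, ignoring positions where the mask is 0.

    last_hidden_states: nested lists shaped [batch_size][seq_len][hidden_size]
    attention_mask:     nested lists shaped [batch_size][seq_len] with 0/1 entries
    Returns nested lists shaped [batch_size][hidden_size].
    """
    pooled = []
    for token_vectors, mask in zip(last_hidden_states, attention_mask):
        hidden_size = len(token_vectors[0])
        totals = [0.0] * hidden_size
        for vector, keep in zip(token_vectors, mask):
            if keep:
                for i, value in enumerate(vector):
                    totals[i] += value
        # Guard against an all-padding row to avoid dividing by zero.
        count = max(sum(mask), 1)
        pooled.append([total / count for total in totals])
    return pooled


hidden = [
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # third token is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
mask = [[1, 1, 0], [1, 1, 1]]
print(masked_average_pool(hidden, mask))  # [[2.0, 3.0], [4.0, 2.0]]
```

The padded position in the first sequence carries large values on purpose: they do not affect the result, which is the point of applying the attention mask before averaging.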
- nemo_automodel._transformers.retrieval.configure_encoder_metadata(
    model: transformers.PreTrainedModel,
    config,
  )
Configure HuggingFace consolidated checkpoint metadata on a model.
Sets config.architectures unconditionally. For custom retrieval architectures registered in ModelRegistry, also writes config.auto_map so that the saved checkpoint can be reloaded via HuggingFace Auto classes. Standard HF models already have their own auto-resolution and do not need auto_map entries.
- Parameters:
model – The backbone PreTrainedModel instance.
config – The model’s config object (typically model.config).
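The rule described above (always record the architecture, add auto_map only for custom classes) can be sketched as follows. This is an assumption-laden illustration: CUSTOM_ARCHITECTURES is a hypothetical stand-in for a ModelRegistry lookup, the class names are invented, and the choice of the "AutoModel" key is a guess at which Auto class the real function targets.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the set of custom architectures known to ModelRegistry.
CUSTOM_ARCHITECTURES = {"MyBidirectionalModel"}


def configure_metadata_sketch(model, config):
    """Record the class name always; add auto_map only for custom classes."""
    cls = type(model)
    config.architectures = [cls.__name__]
    if cls.__name__ in CUSTOM_ARCHITECTURES:
        # HF auto_map entries point an Auto class at "<module file>.<class name>".
        module_file = cls.__module__.rsplit(".", 1)[-1]
        config.auto_map = {"AutoModel": f"{module_file}.{cls.__name__}"}
    return config


class MyBidirectionalModel:  # invented custom architecture
    pass


class StandardModel:  # invented stand-in for a stock HF model class
    pass


custom_cfg = configure_metadata_sketch(MyBidirectionalModel(), SimpleNamespace())
standard_cfg = configure_metadata_sketch(StandardModel(), SimpleNamespace())
print(custom_cfg.architectures, hasattr(custom_cfg, "auto_map"))
print(standard_cfg.architectures, hasattr(standard_cfg, "auto_map"))
```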
- nemo_automodel._transformers.retrieval.build_encoder_backbone(
    model_name_or_path: str,
    task: str,
    trust_remote_code: bool = False,
    pooling: Optional[str] = None,
    extract_submodel: Optional[str] = None,
    num_labels: Optional[int] = None,
    temperature: Optional[float] = None,
    **hf_kwargs,
  )
Build an encoder backbone from a pretrained checkpoint.
When extract_submodel is set, loads the parent model with HuggingFace Auto classes and extracts the dotted path. For supported extracted text backbones, it then builds the registered retrieval class for the requested task (bidirectional base model for "embedding", sequence-classification wrapper for "score"). For unsupported extracted text backbones, it returns the extracted model for "embedding" and wraps it with AutoModelForSequenceClassification for "score".
Without extract_submodel, model types listed in SUPPORTED_BACKBONES resolve to custom bidirectional classes from ModelRegistry; all other model types fall back to HuggingFace Auto classes.
- Parameters:
model_name_or_path – Path or HuggingFace Hub identifier.
task – The encoder task (e.g. "embedding", "score").
trust_remote_code – Whether to allow custom remote code.
pooling – Bi-encoder pooling strategy for registry backbones (e.g. Llama bidirectional) that accept it on from_pretrained. Must not be forwarded to standard HF models (e.g. Qwen3) loaded via AutoModel; those only receive **hf_kwargs.
extract_submodel – Dotted attribute path to extract from the loaded model (e.g. "language_model" to extract the text backbone from a VLM).
num_labels – Number of labels for reranking/classification backbones.
temperature – Optional retrieval score temperature for custom retrieval backbones.
**hf_kwargs – Extra keyword arguments forwarded to from_pretrained.
- Returns:
The constructed PreTrainedModel backbone.
- Raises:
ValueError – If the task is unsupported for a known model type, or the architecture class is missing from ModelRegistry.
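The branching described for build_encoder_backbone can be summarized as a small decision function. The sketch below only returns a label for the path taken; it loads nothing. SUPPORTED_BACKBONES_SKETCH, its class names, and the up-front task check are assumptions made for illustration, since the real contents of SUPPORTED_BACKBONES are not shown on this page.

```python
# Hypothetical registry: model type -> {task -> custom class name}.
SUPPORTED_BACKBONES_SKETCH = {
    "llama": {
        "embedding": "HypotheticalLlamaBidirectionalModel",
        "score": "HypotheticalLlamaBidirectionalForSequenceClassification",
    },
}

DOCUMENTED_TASKS = ("embedding", "score")


def resolve_backbone(model_type, task, extract_submodel=None):
    """Return a short label describing which loading path the description implies."""
    if task not in DOCUMENTED_TASKS:
        # Simplification of this sketch: only the two documented tasks are modeled.
        raise ValueError(f"Unknown task {task!r}")
    tasks = SUPPORTED_BACKBONES_SKETCH.get(model_type)
    if tasks is not None:
        if task not in tasks:
            raise ValueError(f"Task {task!r} is unsupported for model type {model_type!r}")
        origin = "extracted state" if extract_submodel else "checkpoint"
        return f"registry:{tasks[task]} (weights from {origin})"
    if extract_submodel:
        if task == "embedding":
            return "extracted submodel returned as-is"
        return "extracted submodel wrapped with AutoModelForSequenceClassification"
    return "HuggingFace Auto class"


print(resolve_backbone("llama", "embedding"))
print(resolve_backbone("llama", "score", extract_submodel="language_model"))
print(resolve_backbone("qwen3", "embedding"))
print(resolve_backbone("some_vlm_text", "score", extract_submodel="language_model"))
```

Under the same reading, pooling and temperature would be passed only on the registry branches, while the Auto-class branch receives just **hf_kwargs.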
- nemo_automodel._transformers.retrieval.save_encoder_pretrained(
    model: torch.nn.Module,
    save_directory: str,
    **kwargs,
  )
Save an encoder model to an output directory.
If checkpointer is present in kwargs, delegates to Checkpointer.save_model for distributed/FSDP-safe saving. Otherwise falls back to the inner PreTrainedModel.save_pretrained.
The inner model is expected to be stored as model.model (the backbone wrapped by the encoder).
- Parameters:
model – The encoder nn.Module (must have a .model attribute that is the PreTrainedModel backbone).
save_directory – Filesystem path where the checkpoint is written.
**kwargs – Optional keys:
    checkpointer: a Checkpointer instance for distributed saves.
    peft_config: PEFT configuration (forwarded to checkpointer).
    tokenizer: tokenizer instance (forwarded to checkpointer).
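The delegation rule for saving can be sketched with stand-in objects. Everything named Fake* below is invented for the example, and the argument layout passed to save_model is an assumption: this page names Checkpointer.save_model but does not show its signature.

```python
class FakeBackbone:
    """Stand-in for the inner PreTrainedModel."""

    def __init__(self):
        self.saved_to = None

    def save_pretrained(self, save_directory, **kwargs):
        self.saved_to = save_directory


class FakeEncoder:
    """Stand-in for an encoder wrapper that stores its backbone as .model."""

    def __init__(self):
        self.model = FakeBackbone()


class FakeCheckpointer:
    """Stand-in for a Checkpointer; records calls instead of writing files."""

    def __init__(self):
        self.calls = []

    def save_model(self, model, save_directory, **kwargs):
        self.calls.append((model, save_directory, kwargs))


def save_encoder_sketch(model, save_directory, **kwargs):
    checkpointer = kwargs.pop("checkpointer", None)
    if checkpointer is not None:
        # Assumed call shape: backbone, path, then the remaining optional keys.
        checkpointer.save_model(model.model, save_directory, **kwargs)
        return
    # No checkpointer supplied: fall back to the backbone's own save_pretrained.
    model.model.save_pretrained(save_directory)


encoder = FakeEncoder()
checkpointer = FakeCheckpointer()
save_encoder_sketch(encoder, "out/distributed", checkpointer=checkpointer, tokenizer="tok")
save_encoder_sketch(encoder, "out/local")
print(len(checkpointer.calls), encoder.model.saved_to)  # 1 out/local
```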
- nemo_automodel._transformers.retrieval._LLAMA_TASKS#
None
- nemo_automodel._transformers.retrieval._MINISTRAL3_BIDIREC_TASKS#
None
- nemo_automodel._transformers.retrieval.SUPPORTED_BACKBONES#
None
- nemo_automodel._transformers.retrieval._init_encoder_common(
    encoder: torch.nn.Module,
    model: transformers.PreTrainedModel,
  )
Shared init for BiEncoderModel and CrossEncoderModel.
- class nemo_automodel._transformers.retrieval.BiEncoderModel(
    model: transformers.PreTrainedModel,
    pooling: str = 'avg',
    l2_normalize: bool = True,
    do_distributed_inbatch_negative: bool = False,
  )
Bases: torch.nn.Module
Bi-encoder model that produces embeddings using a bidirectional backbone.
Initialization
- _TASK#
‘embedding’
- classmethod build(
    model_name_or_path: str,
    task: str = None,
    pooling: str = 'avg',
    l2_normalize: bool = True,
    do_distributed_inbatch_negative: bool = False,
    trust_remote_code: bool = False,
    **hf_kwargs,
  )
Build bi-encoder model from a pretrained backbone.
- save_pretrained(save_directory: str, **kwargs)#
- encode(input_dict: dict) Optional[torch.Tensor]#
Encode inputs and return pooled embeddings.
- Parameters:
input_dict – Tokenized inputs (input_ids, attention_mask, etc.)
- Returns:
Embeddings [batch_size, hidden_dim], or None if input_dict is empty.
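The documented contract of encode (None for empty input, otherwise one embedding per sequence) together with the l2_normalize constructor flag can be sketched as below. This is a hedged illustration, not the class's code: encode_sketch and fake_embed are hypothetical, and fake_embed stands in for the backbone forward pass plus pooling.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def encode_sketch(input_dict, embed_fn, normalize=True):
    """Return pooled (optionally unit-length) embeddings, or None for empty input."""
    if not input_dict:
        return None
    pooled = embed_fn(input_dict)
    if normalize:
        pooled = [l2_normalize(row) for row in pooled]
    return pooled


def fake_embed(input_dict):
    # Stand-in for backbone + pooling: one fixed 2-d vector per input sequence.
    return [[3.0, 4.0] for _ in input_dict["input_ids"]]


print(encode_sketch({"input_ids": [[101, 102]]}, fake_embed))  # [[0.6, 0.8]]
print(encode_sketch({}, fake_embed))  # None
```

With unit-length embeddings, a dot product between a query and a passage vector equals their cosine similarity, which is a common reason to normalize in bi-encoder retrieval.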
- forward(
    input_dict: dict = None,
    **kwargs,
  )
Forward pass – going through __call__ ensures FSDP2 unshard hooks fire.
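The note on forward concerns the difference between calling a module, as in module(x), and calling module.forward(x) directly: in torch.nn.Module, registered forward hooks run only on the former path. The sketch below reproduces that behavior with a tiny stand-in class so it runs without PyTorch; TinyModule is not the real torch.nn.Module, and the hook here is a placeholder for the parameter-unshard step FSDP2 attaches.

```python
class TinyModule:
    """Minimal stand-in for the call path of torch.nn.Module (not the real class)."""

    def __init__(self):
        self._pre_hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def __call__(self, *args, **kwargs):
        # Hooks run here, before forward, and only on this path.
        for hook in self._pre_hooks:
            hook(self, args)
        return self.forward(*args, **kwargs)

    def forward(self, x):
        return x * 2


events = []
module = TinyModule()
module.register_forward_pre_hook(lambda mod, args: events.append("pre-forward hook"))

module.forward(3)   # bypasses __call__, so the hook does not run
print(events)       # []
module(3)           # goes through __call__, so the hook runs first
print(events)       # ['pre-forward hook']
```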
- class nemo_automodel._transformers.retrieval.CrossEncoderModel(model: transformers.PreTrainedModel)#
Bases: torch.nn.Module
Cross-encoder model for scoring/classification tasks.
Initialization
- _TASK#
‘score’
- classmethod build(
    model_name_or_path: str,
    trust_remote_code: bool = False,
    **hf_kwargs,
  )
Build cross-encoder model from a pretrained backbone.
- save_pretrained(save_directory: str, **kwargs)#
- forward(
    input_dict: dict = None,
    **kwargs,
  )