nemo_automodel._transformers.retrieval#

Encoder models for bi-encoder and cross-encoder tasks.

Module Contents#

Classes#

BiEncoderModel

Bi-encoder model that produces embeddings using a bidirectional backbone.

CrossEncoderModel

Cross-encoder model for scoring/classification tasks.

Functions#

pool

Pool hidden states using the specified pooling method.

configure_encoder_metadata

Configure HuggingFace consolidated checkpoint metadata on a model.

build_encoder_backbone

Build an encoder backbone from a pretrained checkpoint.

save_encoder_pretrained

Save an encoder model to an output directory.

_init_encoder_common

Shared init for BiEncoderModel and CrossEncoderModel.

Data#

API#

nemo_automodel._transformers.retrieval.logger#

'get_logger(…)'

nemo_automodel._transformers.retrieval.pool(
last_hidden_states: torch.Tensor,
attention_mask: torch.Tensor,
pool_type: str,
) -> torch.Tensor#

Pool hidden states using the specified pooling method.

Parameters:
  • last_hidden_states – Hidden states from the model [batch_size, seq_len, hidden_size]

  • attention_mask – Attention mask [batch_size, seq_len]

  • pool_type – Type of pooling to apply

Returns:

Pooled embeddings [batch_size, hidden_size]
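As a rough illustration of what the "avg" variant of pool computes, the same arithmetic can be written over plain Python lists (the real function operates on torch tensors; the helper name below is hypothetical):

```python
# Hypothetical pure-Python reference for "avg" pooling: the mean of the
# hidden states at positions where attention_mask == 1. The real pool()
# does this with torch ops on [batch_size, seq_len, hidden_size] tensors.
def avg_pool_reference(last_hidden_states, attention_mask):
    pooled = []
    for states, mask in zip(last_hidden_states, attention_mask):
        # Keep only the hidden states of non-padding tokens.
        valid = [h for h, m in zip(states, mask) if m == 1]
        hidden_size = len(states[0])
        # Average each hidden dimension over the valid tokens.
        pooled.append(
            [sum(v[d] for v in valid) / len(valid) for d in range(hidden_size)]
        )
    return pooled

# One sequence of length 3 (last token is padding), hidden size 2:
hidden = [[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]]
mask = [[1, 1, 0]]
print(avg_pool_reference(hidden, mask))  # [[2.0, 3.0]]
```

The padded position contributes nothing to the average, which is why the attention mask is a required argument.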

nemo_automodel._transformers.retrieval.configure_encoder_metadata(
model: transformers.PreTrainedModel,
config,
) -> None#

Configure HuggingFace consolidated checkpoint metadata on a model.

Sets config.architectures unconditionally. For custom retrieval architectures registered in ModelRegistry, also writes config.auto_map so that the saved checkpoint can be reloaded via HuggingFace Auto classes. Standard HF models already have their own auto-resolution and do not need auto_map entries.

Parameters:
  • model – The backbone PreTrainedModel instance.

  • config – The model’s config object (typically model.config).
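The conditional behavior described above can be sketched with stand-in objects (the registry contents, class names, and helper below are illustrative, not the library's actual API):

```python
# Hedged sketch: architectures is always written, auto_map only for
# architectures in the custom registry. CUSTOM_ARCHS stands in for a
# ModelRegistry lookup; StubConfig stands in for a transformers config.
CUSTOM_ARCHS = {"LlamaBidirectionalModel"}  # hypothetical registry contents

class StubConfig:
    pass

def configure_metadata_sketch(config, arch_name, module_path):
    # config.architectures is set unconditionally...
    config.architectures = [arch_name]
    # ...but auto_map only for custom architectures, so the saved
    # checkpoint can be resolved by the HuggingFace Auto classes.
    if arch_name in CUSTOM_ARCHS:
        config.auto_map = {"AutoModel": f"{module_path}.{arch_name}"}

cfg = StubConfig()
configure_metadata_sketch(cfg, "LlamaBidirectionalModel", "modeling_bidir")
print(cfg.architectures, cfg.auto_map)

std = StubConfig()
configure_metadata_sketch(std, "BertModel", "ignored")
print(hasattr(std, "auto_map"))  # False: standard HF models need no auto_map
```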

nemo_automodel._transformers.retrieval.build_encoder_backbone(
model_name_or_path: str,
task: str,
trust_remote_code: bool = False,
**hf_kwargs,
) -> transformers.PreTrainedModel#

Build an encoder backbone from a pretrained checkpoint.

For model types listed in SUPPORTED_BACKBONES, resolves the custom bidirectional architecture class from ModelRegistry. For all other model types, falls back to AutoModel.from_pretrained (or AutoModelForSequenceClassification for the "score" task).

Parameters:
  • model_name_or_path – Path or HuggingFace Hub identifier.

  • task – The encoder task (e.g. "embedding", "score").

  • trust_remote_code – Whether to allow custom remote code.

  • **hf_kwargs – Extra keyword arguments forwarded to from_pretrained.

Returns:

The constructed PreTrainedModel backbone.

Raises:

ValueError – If the task is unsupported for a known model type, or the architecture class is missing from ModelRegistry.
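The dispatch described above (registry hit vs. Auto-class fallback, with a ValueError for bad task combinations) can be sketched in isolation; the set contents and return strings below are stand-ins, not the real loader objects:

```python
# Hedged sketch of build_encoder_backbone's control flow. SUPPORTED
# stands in for SUPPORTED_BACKBONES; strings stand in for the classes
# that the real function instantiates via from_pretrained.
SUPPORTED = {"llama"}  # stand-in for SUPPORTED_BACKBONES

def pick_loader(model_type, task):
    if model_type in SUPPORTED:
        if task not in ("embedding", "score"):
            # Known model type but unsupported task -> ValueError,
            # mirroring the Raises section above.
            raise ValueError(f"unsupported task {task!r} for {model_type}")
        return "ModelRegistry"  # custom bidirectional architecture class
    if task == "score":
        return "AutoModelForSequenceClassification"
    return "AutoModel"  # default fallback for unknown model types

print(pick_loader("bert", "embedding"))  # AutoModel
print(pick_loader("bert", "score"))      # AutoModelForSequenceClassification
print(pick_loader("llama", "embedding")) # ModelRegistry
```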

nemo_automodel._transformers.retrieval.save_encoder_pretrained(
model: torch.nn.Module,
save_directory: str,
**kwargs,
) -> None#

Save an encoder model to an output directory.

If checkpointer is present in kwargs, delegates to Checkpointer.save_model for distributed/FSDP-safe saving. Otherwise falls back to the inner PreTrainedModel.save_pretrained.

The inner model is expected to be stored as model.model (the backbone wrapped by the encoder).

Parameters:
  • model – The encoder nn.Module (must have a .model attribute that is the PreTrainedModel backbone).

  • save_directory – Filesystem path where the checkpoint is written.

  • **kwargs –

    Optional keys:

    • checkpointer: a Checkpointer instance for distributed saves.

    • peft_config: PEFT configuration (forwarded to checkpointer).

    • tokenizer: tokenizer instance (forwarded to checkpointer).
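The checkpointer-vs-fallback logic above can be sketched with stub objects (the stub classes and return values are illustrative; the real function saves to disk rather than returning a string):

```python
# Hedged sketch of the save dispatch, with stubs standing in for
# Checkpointer and the inner PreTrainedModel backbone at encoder.model.
class StubBackbone:
    def save_pretrained(self, save_directory):
        return f"hf:{save_directory}"

class StubCheckpointer:
    def save_model(self, model, save_directory, **kwargs):
        return f"dist:{save_directory}"

class StubEncoder:
    def __init__(self):
        self.model = StubBackbone()  # backbone wrapped by the encoder

def save_sketch(encoder, save_directory, **kwargs):
    checkpointer = kwargs.pop("checkpointer", None)
    if checkpointer is not None:
        # Distributed/FSDP-safe path: delegate to the checkpointer,
        # forwarding remaining kwargs (e.g. peft_config, tokenizer).
        return checkpointer.save_model(encoder, save_directory, **kwargs)
    # Fallback: plain HuggingFace save on the inner backbone.
    return encoder.model.save_pretrained(save_directory)

enc = StubEncoder()
print(save_sketch(enc, "/tmp/ckpt"))                                   # hf:/tmp/ckpt
print(save_sketch(enc, "/tmp/ckpt", checkpointer=StubCheckpointer()))  # dist:/tmp/ckpt
```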

nemo_automodel._transformers.retrieval._LLAMA_TASKS#

None

nemo_automodel._transformers.retrieval.SUPPORTED_BACKBONES#

None

nemo_automodel._transformers.retrieval._init_encoder_common(
encoder: torch.nn.Module,
model: transformers.PreTrainedModel,
) -> None#

Shared init for BiEncoderModel and CrossEncoderModel.

class nemo_automodel._transformers.retrieval.BiEncoderModel(
model: transformers.PreTrainedModel,
pooling: str = 'avg',
l2_normalize: bool = True,
)#

Bases: torch.nn.Module

Bi-encoder model that produces embeddings using a bidirectional backbone.

Initialization

_TASK#

‘embedding’

classmethod build(
model_name_or_path: str,
task: str = None,
pooling: str = 'avg',
l2_normalize: bool = True,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build bi-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#
encode(input_dict: dict) -> Optional[torch.Tensor]#

Encode inputs and return pooled embeddings.

Parameters:

input_dict – Tokenized inputs (input_ids, attention_mask, etc.)

Returns:

Embeddings [batch_size, hidden_dim], or None if input_dict is empty.

forward(
input_dict: dict = None,
**kwargs,
) -> Optional[torch.Tensor]#

Forward pass – routing through __call__ ensures FSDP2 unshard hooks fire.
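Because l2_normalize defaults to True, the embeddings returned by encode are unit-length, so retrieval similarity reduces to a dot product. A torch-free sketch of that normalization (the helper names are illustrative):

```python
import math

# With l2_normalize=True each embedding has unit L2 norm, so a plain
# dot product between two embeddings equals their cosine similarity.
def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

query = l2_normalize([3.0, 4.0])  # -> [0.6, 0.8]
doc = l2_normalize([6.0, 8.0])    # same direction, so similarity ~1.0
print(dot(query, doc))
```

In the real model this happens on batched torch tensors after pooling, but the geometry is the same.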

class nemo_automodel._transformers.retrieval.CrossEncoderModel(model: transformers.PreTrainedModel)#

Bases: torch.nn.Module

Cross-encoder model for scoring/classification tasks.

Initialization

_TASK#

‘score’

classmethod build(
model_name_or_path: str,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build cross-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#
forward(
input_dict: dict = None,
**kwargs,
) -> Optional[torch.Tensor]#