nemo_automodel._transformers.retrieval#

Encoder models for bi-encoder and cross-encoder tasks.

Module Contents#

Classes#

BiEncoderModel

Bi-encoder model that produces embeddings using a bidirectional backbone.

CrossEncoderModel

Cross-encoder model for scoring/classification tasks.

Functions#

pool

Pool hidden states using the specified pooling method.

configure_encoder_metadata

Configure HuggingFace consolidated checkpoint metadata on a model.

build_encoder_backbone

Build an encoder backbone from a pretrained checkpoint.

save_encoder_pretrained

Save an encoder model to an output directory.

_init_encoder_common

Shared init for BiEncoderModel and CrossEncoderModel.

Data#

API#

nemo_automodel._transformers.retrieval.logger#

'get_logger(…)'

nemo_automodel._transformers.retrieval.pool(
last_hidden_states: torch.Tensor,
attention_mask: torch.Tensor,
pool_type: str,
) -> torch.Tensor#

Pool hidden states using the specified pooling method.

Parameters:
  • last_hidden_states – Hidden states from the model [batch_size, seq_len, hidden_size]

  • attention_mask – Attention mask [batch_size, seq_len]

  • pool_type – Type of pooling to apply

Returns:

Pooled embeddings [batch_size, hidden_size]
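As a rough illustration of what the "avg" variant of pool computes, the same arithmetic can be written over plain Python lists (the real function operates on torch tensors; the helper name below is hypothetical):

```python
# Hypothetical pure-Python reference for "avg" pooling: the mean of the
# hidden states at positions where attention_mask == 1. The real pool()
# does this with torch ops on [batch_size, seq_len, hidden_size] tensors.
def avg_pool_reference(last_hidden_states, attention_mask):
    pooled = []
    for states, mask in zip(last_hidden_states, attention_mask):
        # Keep only the hidden states of non-padding tokens.
        valid = [h for h, m in zip(states, mask) if m == 1]
        hidden_size = len(states[0])
        # Average each hidden dimension over the valid tokens.
        pooled.append(
            [sum(v[d] for v in valid) / len(valid) for d in range(hidden_size)]
        )
    return pooled

# One sequence of length 3 (last token is padding), hidden size 2:
hidden = [[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]]
mask = [[1, 1, 0]]
print(avg_pool_reference(hidden, mask))  # [[2.0, 3.0]]
```

The padded position contributes nothing to the average, which is why the attention mask is a required argument.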

nemo_automodel._transformers.retrieval.configure_encoder_metadata(
model: transformers.PreTrainedModel,
config,
) -> None#

Configure HuggingFace consolidated checkpoint metadata on a model.

Sets config.architectures unconditionally. For custom retrieval architectures registered in ModelRegistry, also writes config.auto_map so that the saved checkpoint can be reloaded via HuggingFace Auto classes. Standard HF models already have their own auto-resolution and do not need auto_map entries.

Parameters:
  • model – The backbone PreTrainedModel instance.

  • config – The model’s config object (typically model.config).
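The conditional behavior described above can be sketched with stand-in objects (the registry contents, class names, and helper below are illustrative, not the library's actual API):

```python
# Hedged sketch: architectures is always written, auto_map only for
# architectures in the custom registry. CUSTOM_ARCHS stands in for a
# ModelRegistry lookup; StubConfig stands in for a transformers config.
CUSTOM_ARCHS = {"LlamaBidirectionalModel"}  # hypothetical registry contents

class StubConfig:
    pass

def configure_metadata_sketch(config, arch_name, module_path):
    # config.architectures is set unconditionally...
    config.architectures = [arch_name]
    # ...but auto_map only for custom architectures, so the saved
    # checkpoint can be resolved by the HuggingFace Auto classes.
    if arch_name in CUSTOM_ARCHS:
        config.auto_map = {"AutoModel": f"{module_path}.{arch_name}"}

cfg = StubConfig()
configure_metadata_sketch(cfg, "LlamaBidirectionalModel", "modeling_bidir")
print(cfg.architectures, cfg.auto_map)

std = StubConfig()
configure_metadata_sketch(std, "BertModel", "ignored")
print(hasattr(std, "auto_map"))  # False: standard HF models need no auto_map
```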

nemo_automodel._transformers.retrieval.build_encoder_backbone(
model_name_or_path: str,
task: str,
trust_remote_code: bool = False,
**hf_kwargs,
) -> transformers.PreTrainedModel#

Build an encoder backbone from a pretrained checkpoint.

For model types listed in SUPPORTED_BACKBONES, resolves the custom bidirectional architecture class from ModelRegistry. For all other model types, falls back to AutoModel.from_pretrained (or AutoModelForSequenceClassification for the "score" task).

Parameters:
  • model_name_or_path – Path or HuggingFace Hub identifier.

  • task – The encoder task (e.g. "embedding", "score").

  • trust_remote_code – Whether to allow custom remote code.

  • **hf_kwargs – Extra keyword arguments forwarded to from_pretrained.

Returns:

The constructed PreTrainedModel backbone.

Raises:

ValueError – If the task is unsupported for a known model type, or the architecture class is missing from ModelRegistry.
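The dispatch described above (registry hit vs. Auto-class fallback, with a ValueError for bad task combinations) can be sketched in isolation; the set contents and return strings below are stand-ins, not the real loader objects:

```python
# Hedged sketch of build_encoder_backbone's control flow. SUPPORTED
# stands in for SUPPORTED_BACKBONES; strings stand in for the classes
# that the real function instantiates via from_pretrained.
SUPPORTED = {"llama"}  # stand-in for SUPPORTED_BACKBONES

def pick_loader(model_type, task):
    if model_type in SUPPORTED:
        if task not in ("embedding", "score"):
            # Known model type but unsupported task -> ValueError,
            # mirroring the Raises section above.
            raise ValueError(f"unsupported task {task!r} for {model_type}")
        return "ModelRegistry"  # custom bidirectional architecture class
    if task == "score":
        return "AutoModelForSequenceClassification"
    return "AutoModel"  # default fallback for unknown model types

print(pick_loader("bert", "embedding"))  # AutoModel
print(pick_loader("bert", "score"))      # AutoModelForSequenceClassification
print(pick_loader("llama", "embedding")) # ModelRegistry
```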

nemo_automodel._transformers.retrieval.save_encoder_pretrained(
model: torch.nn.Module,
save_directory: str,
**kwargs,
) -> None#

Save an encoder model to an output directory.

If checkpointer is present in kwargs, delegates to Checkpointer.save_model for distributed/FSDP-safe saving. Otherwise falls back to the inner PreTrainedModel.save_pretrained.

The inner model is expected to be stored as model.model (the backbone wrapped by the encoder).

Parameters:
  • model – The encoder nn.Module (must have a .model attribute that is the PreTrainedModel backbone).

  • save_directory – Filesystem path where the checkpoint is written.

  • **kwargs –

    Optional keys:

    • checkpointer: a Checkpointer instance for distributed saves.

    • peft_config: PEFT configuration (forwarded to checkpointer).

    • tokenizer: tokenizer instance (forwarded to checkpointer).
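The checkpointer-vs-fallback logic above can be sketched with stub objects (the stub classes and return values are illustrative; the real function saves to disk rather than returning a string):

```python
# Hedged sketch of the save dispatch, with stubs standing in for
# Checkpointer and the inner PreTrainedModel backbone at encoder.model.
class StubBackbone:
    def save_pretrained(self, save_directory):
        return f"hf:{save_directory}"

class StubCheckpointer:
    def save_model(self, model, save_directory, **kwargs):
        return f"dist:{save_directory}"

class StubEncoder:
    def __init__(self):
        self.model = StubBackbone()  # backbone wrapped by the encoder

def save_sketch(encoder, save_directory, **kwargs):
    checkpointer = kwargs.pop("checkpointer", None)
    if checkpointer is not None:
        # Distributed/FSDP-safe path: delegate to the checkpointer,
        # forwarding remaining kwargs (e.g. peft_config, tokenizer).
        return checkpointer.save_model(encoder, save_directory, **kwargs)
    # Fallback: plain HuggingFace save on the inner backbone.
    return encoder.model.save_pretrained(save_directory)

enc = StubEncoder()
print(save_sketch(enc, "/tmp/ckpt"))                                   # hf:/tmp/ckpt
print(save_sketch(enc, "/tmp/ckpt", checkpointer=StubCheckpointer()))  # dist:/tmp/ckpt
```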

nemo_automodel._transformers.retrieval._LLAMA_TASKS#

None

nemo_automodel._transformers.retrieval.SUPPORTED_BACKBONES#

None

nemo_automodel._transformers.retrieval._init_encoder_common(
encoder: torch.nn.Module,
model: transformers.PreTrainedModel,
) -> None#

Shared init for BiEncoderModel and CrossEncoderModel.

class nemo_automodel._transformers.retrieval.BiEncoderModel(
model: transformers.PreTrainedModel,
pooling: str = 'avg',
l2_normalize: bool = True,
)#

Bases: torch.nn.Module

Bi-encoder model that produces embeddings using a bidirectional backbone.

Initialization

_TASK#

‘embedding’

classmethod build(
model_name_or_path: str,
task: str = None,
pooling: str = 'avg',
l2_normalize: bool = True,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build bi-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#
encode(input_dict: dict) -> Optional[torch.Tensor]#

Encode inputs and return pooled embeddings.

Parameters:

input_dict – Tokenized inputs (input_ids, attention_mask, etc.)

Returns:

Embeddings [batch_size, hidden_dim], or None if input_dict is empty.

forward(
input_dict: dict = None,
**kwargs,
) -> Optional[torch.Tensor]#

Forward pass – routing through __call__ ensures FSDP2 unshard hooks fire.
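Because l2_normalize defaults to True, the embeddings returned by encode are unit-length, so retrieval similarity reduces to a dot product. A torch-free sketch of that normalization (the helper names are illustrative):

```python
import math

# With l2_normalize=True each embedding has unit L2 norm, so a plain
# dot product between two embeddings equals their cosine similarity.
def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

query = l2_normalize([3.0, 4.0])  # -> [0.6, 0.8]
doc = l2_normalize([6.0, 8.0])    # same direction, so similarity ~1.0
print(dot(query, doc))
```

In the real model this happens on batched torch tensors after pooling, but the geometry is the same.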

class nemo_automodel._transformers.retrieval.CrossEncoderModel(model: transformers.PreTrainedModel)#

Bases: torch.nn.Module

Cross-encoder model for scoring/classification tasks.

Initialization

_TASK#

‘score’

classmethod build(
model_name_or_path: str,
trust_remote_code: bool = False,
**hf_kwargs,
)#

Build cross-encoder model from a pretrained backbone.

save_pretrained(save_directory: str, **kwargs)#
forward(
input_dict: dict = None,
**kwargs,
) -> Optional[torch.Tensor]#