nemo_automodel.components.models.llama_bidirectional.model

Llama Bidirectional model for embedding and retrieval tasks.

This module provides a bidirectional variant of Llama that is auto-discovered by the ModelRegistry via the ModelClass export.

To add support for other backbones (e.g., Qwen2, Mistral), create a similar module in a new directory (e.g., qwen2_bidirectional/) with its own ModelClass export.
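
The export convention described above can be sketched roughly as follows. This is an illustration only: the real ModelRegistry internals are not shown on this page, so `discover_model_classes` and `Qwen2BidirectionalModel` are hypothetical names, and a stand-in module object replaces an actual `qwen2_bidirectional/model.py` file.

```python
import types


def discover_model_classes(module):
    """Return a {class_name: class} map built from a module's ModelClass export.

    Hypothetical helper, not the actual ModelRegistry code. It accepts either
    a single class or a list/tuple of classes, and returns an empty dict when
    the module has no ModelClass attribute.
    """
    exported = getattr(module, "ModelClass", None)
    if exported is None:
        return {}
    if not isinstance(exported, (list, tuple)):
        exported = [exported]
    return {cls.__name__: cls for cls in exported}


# Stand-in for a new backbone module such as qwen2_bidirectional/model.py.
qwen2_module = types.ModuleType("qwen2_bidirectional.model")


class Qwen2BidirectionalModel:  # hypothetical placeholder class
    pass


# The new module exposes its classes through a ModelClass export,
# mirroring the ModelClass list documented for this module.
qwen2_module.ModelClass = [Qwen2BidirectionalModel]

registry = discover_model_classes(qwen2_module)
print(sorted(registry))
```

The point of the convention is that a new backbone only needs to ship its own module with a `ModelClass` export; no central list has to be edited by hand.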

Module Contents

Classes

| Name | Description |
| --- | --- |
| LlamaBidirectionalConfig | Configuration class for LlamaBidirectionalModel. |
| LlamaBidirectionalForSequenceClassification | Llama Bidirectional Model with a sequence classification/regression head. |
| LlamaBidirectionalModel | Llama Model with bidirectional attention. |

Functions

| Name | Description |
| --- | --- |
| _pool | Pool hidden states using the specified pooling method. |
| _register_with_hf_auto_classes | Register bidirectional models with HuggingFace Auto classes. |
| check_model_inputs | - |

Data

ModelClass

__all__

API

class nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalConfig(
pooling: str = 'avg',
temperature: float = 1.0,
kwargs = {}
)

Bases: LlamaConfig

Configuration class for LlamaBidirectionalModel.

Extends LlamaConfig with additional parameters for bidirectional attention and pooling configurations.

model_type = 'llama_bidirec'

class nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalForSequenceClassification(
config
)

Bases: LlamaPreTrainedModel

Llama Bidirectional Model with a sequence classification/regression head.

This model adds a classification head on top of the bidirectional Llama model and includes configurable pooling strategies.

model = LlamaBidirectionalModel(config)
num_labels = config.num_labels
score

nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalForSequenceClassification._init_weights(
module
)
nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalForSequenceClassification.forward(
input_ids: typing.Optional[torch.LongTensor] = None,
attention_mask: typing.Optional[torch.Tensor] = None,
position_ids: typing.Optional[torch.LongTensor] = None,
past_key_values: typing.Optional[typing.Union[transformers.cache_utils.Cache, typing.List[torch.FloatTensor]]] = None,
inputs_embeds: typing.Optional[torch.FloatTensor] = None,
labels: typing.Optional[torch.LongTensor] = None,
use_cache: typing.Optional[bool] = None,
output_attentions: typing.Optional[bool] = None,
output_hidden_states: typing.Optional[bool] = None,
return_dict: typing.Optional[bool] = None,
kwargs = {}
) -> typing.Union[typing.Tuple, transformers.modeling_outputs.SequenceClassifierOutputWithPast]
class nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalModel(
config: transformers.models.llama.configuration_llama.LlamaConfig
)

Bases: LlamaModel

Llama Model with bidirectional attention.

This model removes causal masking from all attention layers, allowing tokens to attend to all other tokens in the sequence. This is useful for embedding and retrieval tasks where bidirectional context is beneficial.

The model is auto-discovered by ModelRegistry via the ModelClass export, enabling it to be loaded via NeMoAutoModelBiEncoder.from_pretrained().
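
To make the difference between causal and bidirectional attention concrete, here is a minimal sketch that builds both masks as nested Python lists. It is written without PyTorch to stay dependency-free and does not reproduce the model's actual masking code. The helper names are hypothetical, and the handling of padding (every query may attend to every non-padding key) is an assumption about typical bidirectional encoders, not something this page confirms.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when query token i may attend to key token j.

    Under causal masking a token only sees itself and earlier positions.
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def bidirectional_mask(padding_mask):
    """mask[i][j] is True when key token j is a real (non-padding) token.

    padding_mask holds 1 for real tokens and 0 for padding. Every query
    position sees every real token, regardless of order.
    """
    n = len(padding_mask)
    return [[bool(padding_mask[j]) for j in range(n)] for _ in range(n)]


padding = [1, 1, 1, 0]  # three real tokens followed by one padding token
causal = causal_mask(len(padding))
bidir = bidirectional_mask(padding)

for name, mask in (("causal", causal), ("bidirectional", bidir)):
    print(name)
    for row in mask:
        print("  " + " ".join("x" if allowed else "." for allowed in row))
```

In the causal mask the first token sees only itself, while in the bidirectional mask it sees all three real tokens. That full-sequence context is what makes the resulting hidden states suitable for embedding and retrieval.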

nemo_automodel.components.models.llama_bidirectional.model.LlamaBidirectionalModel.forward(
input_ids: typing.Optional[torch.LongTensor] = None,
attention_mask: typing.Optional[torch.Tensor] = None,
position_ids: typing.Optional[torch.LongTensor] = None,
past_key_values: typing.Optional[transformers.cache_utils.Cache] = None,
inputs_embeds: typing.Optional[torch.FloatTensor] = None,
cache_position: typing.Optional[torch.LongTensor] = None,
use_cache: typing.Optional[bool] = None,
kwargs: transformers.processing_utils.Unpack[transformers.utils.TransformersKwargs] = {}
) -> transformers.modeling_outputs.BaseModelOutputWithPast
nemo_automodel.components.models.llama_bidirectional.model._pool(
last_hidden_states: torch.Tensor,
attention_mask: torch.Tensor,
pool_type: str
) -> torch.Tensor

Pool hidden states using the specified pooling method.
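
As a rough illustration of what masked pooling does, the sketch below reduces per-token hidden states to one vector per sequence while ignoring padding. It is not the library's `_pool` implementation: it uses nested lists instead of tensors, the function name `pool_sketch` is hypothetical, and apart from `'avg'` (the documented default of `LlamaBidirectionalConfig.pooling`) the pooling names `'cls'` and `'last'` are illustrative assumptions.

```python
def pool_sketch(last_hidden_states, attention_mask, pool_type):
    """Pool token vectors into one vector per sequence, skipping padding.

    last_hidden_states: [batch][seq][hidden] nested lists of floats
    attention_mask:     [batch][seq] with 1 for real tokens, 0 for padding
    pool_type:          'avg', or the illustrative names 'cls' and 'last'
    """
    pooled = []
    for states, mask in zip(last_hidden_states, attention_mask):
        # Keep only the vectors of real (non-padding) tokens.
        real = [vec for vec, keep in zip(states, mask) if keep]
        if not real:
            raise ValueError("sequence contains no non-padding tokens")
        if pool_type == "avg":
            hidden = len(real[0])
            pooled.append(
                [sum(vec[d] for vec in real) / len(real) for d in range(hidden)]
            )
        elif pool_type == "cls":
            pooled.append(list(real[0]))  # first non-padding token
        elif pool_type == "last":
            pooled.append(list(real[-1]))  # last non-padding token
        else:
            raise ValueError(f"unsupported pool_type: {pool_type!r}")
    return pooled


# One sequence: two real tokens and one padding token with junk values.
hidden = [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]]
mask = [[1, 1, 0]]

print(pool_sketch(hidden, mask, "avg"))
```

The padding vector does not influence the average because the attention mask excludes it before the mean is taken.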

nemo_automodel.components.models.llama_bidirectional.model._register_with_hf_auto_classes()

Register bidirectional models with HuggingFace Auto classes.

This is needed so that AutoModel.from_config(LlamaBidirectionalConfig) works inside LlamaForSequenceClassification.__init__.

nemo_automodel.components.models.llama_bidirectional.model.check_model_inputs(
func
)
nemo_automodel.components.models.llama_bidirectional.model.ModelClass = [LlamaBidirectionalModel, LlamaBidirectionalForSequenceClassification]
nemo_automodel.components.models.llama_bidirectional.model.__all__ = ['LlamaBidirectionalModel', 'LlamaBidirectionalConfig', 'LlamaBidirectionalForSe...