nemo_automodel.components.models.baichuan.model
Native Baichuan2 model implementation for NeMo Automodel.
Adapted from the Baichuan2 remote-code model on HuggingFace with the following changes:
- Removed xformers / quantization / chat / streaming dependencies.
- Added ``**kwargs`` to forward signatures so that extra batch keys (``padding_mask``, ``loss_mask``, …) pass through without error.
- Uses ``HFCheckpointingMixin`` for unified checkpointing.
- Uses ``torch.nn.functional.scaled_dot_product_attention`` only.
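The ``**kwargs`` change can be illustrated with a minimal, framework-free sketch. The two functions below are hypothetical stand-ins, not the module's actual forward methods; they only show why a signature without ``**kwargs`` rejects a batch that carries extra keys, while one with ``**kwargs`` accepts it.

```python
from typing import Any, Optional, Sequence


def forward_strict(
    input_ids: Sequence[int],
    attention_mask: Optional[Sequence[int]] = None,
) -> int:
    """Hypothetical forward without **kwargs: unknown batch keys raise TypeError."""
    return len(input_ids)


def forward_tolerant(
    input_ids: Sequence[int],
    attention_mask: Optional[Sequence[int]] = None,
    **kwargs: Any,
) -> int:
    """Hypothetical forward with **kwargs: extra batch keys are accepted and ignored."""
    return len(input_ids)


# A batch as a data loader might produce it, including keys the model does not use.
batch = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "padding_mask": [0, 0, 0],
    "loss_mask": [1, 1, 0],
}

try:
    forward_strict(**batch)
    strict_ok = True
except TypeError:
    # padding_mask / loss_mask are unexpected keyword arguments here.
    strict_ok = False

tolerant_result = forward_tolerant(**batch)
```

With this pattern a training loop can call the model as ``model(**batch)`` without first filtering the batch dictionary.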
Example (YAML)::

    model:
      _target_: nemo_automodel.NeMoAutoModelForCausalLM.from_pretrained
      pretrained_model_name_or_path: baichuan-inc/Baichuan2-7B-Chat
Module Contents
Classes
Functions
Data
API
Bases: Module
W_pack
head_dim
hidden_size
max_position_embeddings
num_heads
o_proj
rotary_emb
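The attributes listed above (``W_pack``, ``o_proj``, ``num_heads``, ``head_dim``) are consistent with a fused query/key/value projection followed by scaled dot-product attention. The sketch below is an illustration under that assumption, not the module's actual code: the class name ``PackedAttentionSketch`` is hypothetical, the ``3 * hidden_size`` output width of ``W_pack`` is assumed, and the rotary embedding step is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PackedAttentionSketch(nn.Module):
    """Hypothetical attention block with a fused QKV projection."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.hidden_size = hidden_size
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        # Assumption: one linear layer produces Q, K and V side by side.
        self.W_pack = nn.Linear(hidden_size, 3 * hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, hidden_states: torch.Tensor, **kwargs) -> torch.Tensor:
        bsz, seq_len, _ = hidden_states.shape
        # Split the fused projection into three equal parts along the last dim.
        q, k, v = self.W_pack(hidden_states).chunk(3, dim=-1)

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            # (bsz, seq, hidden) -> (bsz, heads, seq, head_dim)
            return t.view(bsz, seq_len, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)
        # Rotary position embedding would be applied to q and k here (omitted).
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        # (bsz, heads, seq, head_dim) -> (bsz, seq, hidden)
        out = out.transpose(1, 2).reshape(bsz, seq_len, self.hidden_size)
        return self.o_proj(out)


torch.manual_seed(0)
attn = PackedAttentionSketch(hidden_size=64, num_heads=4)
x = torch.randn(2, 5, 64)
y = attn(x, padding_mask=None)  # extra keyword arguments are ignored
```

Fusing the three projections into one matrix multiply keeps the checkpoint layout compatible with a single ``W_pack`` weight, which is why the split happens after the projection rather than before it.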
Bases: HFCheckpointingMixin, BaichuanPreTrainedModel, GenerationMixin
_tied_weights_keys
lm_head
model
staticmethod
Bases: PreTrainedModel
_no_split_modules
base_model_prefix
Bases: Module
hidden_size
input_layernorm
mlp
post_attention_layernorm
self_attn
Bases: Module
act_fn
down_proj
gate_proj
up_proj
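The four attributes above (``act_fn``, ``gate_proj``, ``up_proj``, ``down_proj``) match the gated feed-forward layout common to LLaMA-style decoders. A minimal sketch follows, assuming bias-free linear layers and SiLU as the activation; both are assumptions, and the class name ``GatedMLPSketch`` is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedMLPSketch(nn.Module):
    """Hypothetical gated MLP: down_proj(act_fn(gate_proj(x)) * up_proj(x))."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        # Assumption: SiLU activation; the real model reads this from its config.
        self.act_fn = F.silu

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The activated gate scales the up-projection elementwise.
        gated = self.act_fn(self.gate_proj(x)) * self.up_proj(x)
        return self.down_proj(gated)


torch.manual_seed(0)
mlp = GatedMLPSketch(hidden_size=16, intermediate_size=44)
mlp_out = mlp(torch.randn(2, 3, 16))
zero_out = mlp(torch.zeros(1, 1, 16))
```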
Bases: Module
weight
Bases: Module
weight
Bases: Module
cos_cached
inv_freq
sin_cached
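The ``inv_freq``, ``cos_cached`` and ``sin_cached`` attributes point to a precomputed rotary position embedding table. The sketch below shows one common way to build and apply such a cache. The function names, the base of 10000, and the ``(seq_len, head_dim)`` cache shape are assumptions for illustration; the actual module may store its buffers with different shapes.

```python
import torch


def build_rotary_cache(head_dim: int, max_position_embeddings: int, base: float = 10000.0):
    """Return (inv_freq, cos_cached, sin_cached) for a hypothetical rotary cache."""
    # One inverse frequency per pair of channels.
    exponents = torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    inv_freq = 1.0 / (base ** exponents)
    positions = torch.arange(max_position_embeddings, dtype=torch.float32)
    freqs = torch.outer(positions, inv_freq)  # (seq, head_dim / 2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (seq, head_dim)
    return inv_freq, emb.cos(), emb.sin()


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Swap the two halves of the last dimension and negate the second half."""
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate x of shape (..., seq, head_dim) using the cached tables."""
    seq_len = x.shape[-2]
    return x * cos[:seq_len] + rotate_half(x) * sin[:seq_len]


torch.manual_seed(0)
inv_freq, cos_cached, sin_cached = build_rotary_cache(head_dim=8, max_position_embeddings=16)
q = torch.randn(1, 2, 4, 8)  # (batch, heads, seq, head_dim)
q_rot = apply_rotary(q, cos_cached, sin_cached)
```

Caching the cosine and sine tables up to ``max_position_embeddings`` avoids recomputing them on every forward pass; only a slice of the first ``seq_len`` rows is needed per call.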