nemo_automodel.components.models.ling_v2.layers
Attention layer for BailingMoeV2 (Ling 2.0).
Grouped-query attention (GQA) with per-head QK-RMSNorm and partial RoPE.
Equivalent to Qwen3-MoE attention with an additional partial_rotary_factor
parameter: only the first head_dim * partial_rotary_factor channels of each
head are rotated, and the remaining channels pass through unchanged
(GPT-J / GPT-NeoX half-RoPE).
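The partial rotation described above can be illustrated with a minimal, standard-library-only sketch. This is not the module's implementation: the function name partial_rope, the base of 10000, and the rotate-half channel pairing (channel i paired with channel i + rot_dim / 2) are assumptions made for illustration. It operates on a single head vector at a single position.

```python
import math


def partial_rope(x, position, partial_rotary_factor=0.5, base=10000.0):
    """Rotate the first rot_dim channels of one head vector; pass the rest through.

    Hypothetical helper for illustration only. `base` and the channel pairing
    are assumptions, not values taken from the library.
    """
    head_dim = len(x)
    rot_dim = int(head_dim * partial_rotary_factor)
    rot_dim -= rot_dim % 2  # rotations act on channel pairs, so keep rot_dim even
    half = rot_dim // 2

    out = list(x)  # channels at index >= rot_dim are copied unchanged
    for i in range(half):
        # Assumed rotate-half layout: channel i is paired with channel i + half.
        theta = position * base ** (-2.0 * i / rot_dim)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = b * c + a * s
    return out


# With head_dim = 8 and partial_rotary_factor = 0.5, channels 0..3 are rotated
# and channels 4..7 are returned as-is.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = partial_rope(x, position=5)
```

Because each pair is rotated by a plain 2D rotation, the norm of the rotated slice is preserved, and position 0 leaves the vector unchanged.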
Module Contents
Classes
API
Bases: Module
Bailing MoE V2 attention block.
head_dim
k_norm
k_proj
num_heads
num_kv_heads
o_proj
q_norm
q_proj
use_qk_norm
v_proj
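As a rough guide to how num_heads, num_kv_heads, q_norm, and k_norm typically interact in a GQA block with per-head QK-RMSNorm, the sketch below uses plain Python lists. The helper names rms_norm and kv_head_for, the eps value, and the contiguous grouping of query heads are assumptions for illustration; they are not taken from this module's code.

```python
import math


def rms_norm(vec, weight, eps=1e-6):
    """RMSNorm over one head's channels: vec / rms(vec) * weight.

    Hypothetical helper; eps=1e-6 is an assumed default.
    """
    mean_sq = sum(v * v for v in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(vec, weight)]


def kv_head_for(query_head, num_heads, num_kv_heads):
    """Index of the key/value head shared by a query head under GQA.

    Assumes consecutive query heads share one key/value head.
    """
    assert num_heads % num_kv_heads == 0, "num_heads must be a multiple of num_kv_heads"
    group_size = num_heads // num_kv_heads
    return query_head // group_size


# Per-head normalization: each head's head_dim channels are normalized on their own.
q_head = [3.0, 4.0]
q_normed = rms_norm(q_head, weight=[1.0, 1.0])

# With 8 query heads and 2 key/value heads, query heads 0..3 read KV head 0
# and query heads 4..7 read KV head 1.
mapping = [kv_head_for(h, num_heads=8, num_kv_heads=2) for h in range(8)]
```

In a layout like this, the normalization would run on the outputs of q_proj and k_proj after they are split into heads, and use_qk_norm would decide whether that step runs at all; whether the module orders its steps the same way should be checked against its source.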