# nemo_automodel.components.models.ling_v2.config

Configuration for BailingMoeV2 (Ling 2.0 family: Ling-mini, Ling-flash, Ling-1T).

Mirrors the `BailingMoeV2Config` defined in `configuration_bailing_moe_v2.py`,
which ships with the official Hugging Face checkpoints. The class is registered
with `AutoConfig` so that `AutoConfig.from_pretrained(...)` resolves without
`trust_remote_code`.

## Module Contents

### Classes

| Name                                                                                        | Description                                                |
| ------------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
| [`BailingMoeV2Config`](#nemo_automodel-components-models-ling_v2-config-BailingMoeV2Config) | Configuration class for the BailingMoeV2 model (Ling 2.0). |

### API

```python
class nemo_automodel.components.models.ling_v2.config.BailingMoeV2Config(
    vocab_size: int = 157184,
    hidden_size: int = 2048,
    intermediate_size: int = 5120,
    num_hidden_layers: int = 20,
    num_attention_heads: int = 16,
    num_key_value_heads: int = 4,
    hidden_act: str = 'silu',
    use_qkv_bias: bool = False,
    use_bias: bool = False,
    rms_norm_eps: float = 1e-06,
    tie_word_embeddings: bool = False,
    embedding_dropout: float = 0.0,
    attention_dropout: float = 0.0,
    output_dropout: float = 0.0,
    initializer_range: float = 0.02,
    max_position_embeddings: int = 32768,
    rope_theta: float = 600000.0,
    use_cache: bool = True,
    max_window_layers: int = 20,
    rope_scaling: dict | None = None,
    pad_token_id: int = 156892,
    eos_token_id: int = 156892,
    num_experts: int = 256,
    num_shared_experts: int = 1,
    num_experts_per_tok: int = 8,
    n_group: int = 8,
    topk_group: int = 4,
    moe_intermediate_size: int = 512,
    first_k_dense_replace: int = 1,
    head_dim: int = 128,
    output_router_logits: bool = False,
    use_qk_norm: bool = True,
    partial_rotary_factor: float = 1.0,
    num_nextn_predict_layers: int = 0,
    mtp_loss_scaling_factor: float = 0,
    moe_router_enable_expert_bias: bool = True,
    routed_scaling_factor: float = 1.0,
    norm_topk_prob: bool = True,
    score_function: str = 'sigmoid',
    rotary_dim: int | None = None,
    **kwargs
)
```

**Bases:** `PretrainedConfig`

Configuration class for the BailingMoeV2 model (Ling 2.0).
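As a rough illustration of what the default values in the signature imply, the sketch below copies a few of them into a plain dataclass and derives some sizes. `DefaultSizes` and `derived_sizes` are hypothetical helpers written for this example and are not part of `nemo_automodel`. The arithmetic assumes the conventional GQA layout (projection width equals head count times `head_dim`) and group-limited expert selection, neither of which is spelled out on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultSizes:
    """Hypothetical stand-in holding a few defaults copied from the signature."""

    hidden_size: int = 2048
    num_hidden_layers: int = 20
    num_attention_heads: int = 16
    num_key_value_heads: int = 4
    head_dim: int = 128
    num_experts: int = 256
    num_experts_per_tok: int = 8
    n_group: int = 8
    topk_group: int = 4
    first_k_dense_replace: int = 1


def derived_sizes(cfg: DefaultSizes) -> dict:
    """Compute sizes implied by the fields (conventional GQA/MoE layout assumed)."""
    experts_per_group = cfg.num_experts // cfg.n_group
    return {
        # Query projection width: one head_dim slice per attention head.
        "q_width": cfg.num_attention_heads * cfg.head_dim,
        # Key/value projection width: GQA uses fewer KV heads than query heads.
        "kv_width": cfg.num_key_value_heads * cfg.head_dim,
        "queries_per_kv_head": cfg.num_attention_heads // cfg.num_key_value_heads,
        "experts_per_group": experts_per_group,
        # Assumption: only experts in the topk_group best groups stay eligible.
        "candidate_experts": experts_per_group * cfg.topk_group,
        "dense_layers": cfg.first_k_dense_replace,
        "moe_layers": cfg.num_hidden_layers - cfg.first_k_dense_replace,
    }


sizes = derived_sizes(DefaultSizes())
for name, value in sizes.items():
    print(f"{name}: {value}")
```

Under these assumptions, each KV head serves 4 query heads, the 256 routed experts split into 8 groups of 32, and 19 of the 20 layers carry an MoE block.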

The defaults reflect the `Ling-mini-2.0` (16B-A1.4B) variant. Larger
variants (`Ling-flash-2.0`, 100B-A6B, and `Ling-1T`, 1T-A50B) override the
size-related fields but share the same architecture: GQA attention with per-head
QK-RMSNorm, partial RoPE (`partial_rotary_factor`), sigmoid-routed grouped MoE
with shared experts, and a dense MLP in place of the MoE block in the first
`first_k_dense_replace` layers.
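To make the sigmoid-routed grouped MoE idea concrete, here is a minimal pure-Python sketch of routing a single token. It is not the NeMo AutoModel implementation. The function `route_token` is hypothetical, the rule for ranking groups (sum of each group's two best scores) is an assumption borrowed from other group-limited routers, and the expert bias enabled by `moe_router_enable_expert_bias` is left out for brevity. The parameter names `n_group`, `topk_group`, `norm_topk_prob`, and `routed_scaling_factor` follow the config fields. `top_k` plays the role of `num_experts_per_tok`.

```python
import math


def route_token(logits, n_group, topk_group, top_k,
                norm_topk_prob=True, routed_scaling_factor=1.0):
    """Pick experts for one token and return {expert_index: weight}."""
    # Sigmoid scoring: each expert is scored independently of the others.
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    group_size = len(scores) // n_group

    # Assumption: rank each group by the sum of its two best expert scores.
    def group_score(g):
        members = sorted(scores[g * group_size:(g + 1) * group_size], reverse=True)
        return sum(members[:2])

    kept_groups = sorted(range(n_group), key=group_score, reverse=True)[:topk_group]

    # Only experts inside the kept groups remain candidates.
    candidates = [e for g in kept_groups
                  for e in range(g * group_size, (g + 1) * group_size)]
    chosen = sorted(candidates, key=lambda e: scores[e], reverse=True)[:top_k]

    weights = {e: scores[e] for e in chosen}
    if norm_topk_prob:
        # Renormalize so the selected experts' weights sum to 1.
        total = sum(weights.values())
        weights = {e: w / total for e, w in weights.items()}
    return {e: w * routed_scaling_factor for e, w in weights.items()}


# Toy example: 8 experts in 4 groups of 2, keep 2 groups, pick 2 experts.
toy_logits = [2.0, 1.0, -1.0, -2.0, 0.5, 0.0, 3.0, -3.0]
routing = route_token(toy_logits, n_group=4, topk_group=2, top_k=2)
print(routing)
```

In the toy input, expert 6 has the highest individual score, yet it is not selected because its group ranks below two others. This shows how group filtering can override a purely per-expert top-k choice.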