# nemo_automodel.components.models.mimo_v2_flash.config

## Module Contents

### Classes

| Name                                                                                            | Description                                 |
| ----------------------------------------------------------------------------------------------- | ------------------------------------------- |
| [`MiMoV2FlashConfig`](#nemo_automodel-components-models-mimo_v2_flash-config-MiMoV2FlashConfig) | Configuration for XiaomiMiMo/MiMo-V2-Flash. |

### API

```python
class nemo_automodel.components.models.mimo_v2_flash.config.MiMoV2FlashConfig(
    vocab_size: int = 152576,
    hidden_size: int = 4096,
    intermediate_size: int = 16384,
    moe_intermediate_size: int = 2048,
    num_hidden_layers: int = 48,
    num_attention_heads: int = 64,
    num_key_value_heads: int = 4,
    head_dim: int = 192,
    v_head_dim: int = 128,
    swa_num_attention_heads: int = 64,
    swa_num_key_value_heads: int = 8,
    swa_head_dim: int = 192,
    swa_v_head_dim: int = 128,
    hidden_act: str = 'silu',
    max_position_embeddings: int = 262144,
    initializer_range: float = 0.02,
    layernorm_epsilon: float = 1e-05,
    rms_norm_eps: float | None = None,
    use_cache: bool = True,
    tie_word_embeddings: bool = False,
    rope_theta: float = 5000000.0,
    swa_rope_theta: float = 10000.0,
    rope_scaling: dict | None = None,
    attention_bias: bool = False,
    attention_dropout: float = 0.0,
    attention_value_scale: float | None = 0.707,
    add_full_attention_sink_bias: bool = False,
    add_swa_attention_sink_bias: bool = True,
    hybrid_block_size: int | None = None,
    hybrid_layer_pattern: list[int] | None = None,
    partial_rotary_factor: float = 0.334,
    sliding_window: int | None = 128,
    sliding_window_size: int | None = 128,
    attention_chunk_size: int | None = 128,
    n_routed_experts: int | None = 256,
    n_shared_experts: int | None = None,
    num_experts_per_tok: int = 8,
    scoring_func: str = 'sigmoid',
    topk_method: str = 'noaux_tc',
    n_group: int = 1,
    topk_group: int = 1,
    norm_topk_prob: bool = True,
    routed_scaling_factor: float | None = 1.0,
    moe_layer_freq: list[int] | None = None,
    torch_dtype: str = 'bfloat16',
    **kwargs
)
```

**Bases:** `PretrainedConfig`

Configuration for XiaomiMiMo/MiMo-V2-Flash.
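As a rough illustration of what the default values in the signature imply, the sketch below derives per-layer attention widths for the full-attention and sliding-window (SWA) settings. It assumes standard grouped-query attention arithmetic (width = heads × head dimension) and assumes the rotary dimension is `head_dim * partial_rotary_factor` truncated to an integer; neither is confirmed by this page. `AttentionShape` is a hypothetical helper, not part of `nemo_automodel`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionShape:
    """Hypothetical helper holding the head counts and sizes of one attention type."""

    num_heads: int
    num_kv_heads: int
    head_dim: int
    v_head_dim: int

    @property
    def q_width(self) -> int:
        # Assumed query projection width: one head_dim slice per query head.
        return self.num_heads * self.head_dim

    @property
    def k_width(self) -> int:
        # Assumed key projection width: one head_dim slice per key/value head.
        return self.num_kv_heads * self.head_dim

    @property
    def v_width(self) -> int:
        # Values use their own head size (v_head_dim), which differs from head_dim.
        return self.num_kv_heads * self.v_head_dim

    @property
    def group_size(self) -> int:
        # Number of query heads that share one key/value head.
        return self.num_heads // self.num_kv_heads


# Defaults copied from the MiMoV2FlashConfig signature.
full = AttentionShape(num_heads=64, num_kv_heads=4, head_dim=192, v_head_dim=128)
swa = AttentionShape(num_heads=64, num_kv_heads=8, head_dim=192, v_head_dim=128)

# Assumption: the rotary dimension is the truncated product of these two defaults.
rotary_dim = int(192 * 0.334)

print(full.q_width, full.k_width, full.v_width, full.group_size)
print(swa.q_width, swa.k_width, swa.v_width, swa.group_size)
print(rotary_dim)
```

Under these assumptions, the two attention types differ only in their key/value head count, so the SWA layers would carry twice the key and value width of the full-attention layers.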

The Hugging Face remote config class currently leaves `model_type` empty.
Automodel registers this local config under the `model_type` value from the
hub's JSON config, so configs can resolve without executing remote code.
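The registration idea can be sketched with a plain dictionary that maps a `model_type` string to a local class. This is a minimal illustration of the pattern only, not Automodel's or `transformers`' actual mechanism: `CONFIG_REGISTRY`, `register_config`, `resolve_config`, `LocalConfig`, and the `"example_model_type"` string are all hypothetical names chosen for the example.

```python
# Hypothetical stand-in registry; not the Automodel or transformers implementation.
CONFIG_REGISTRY: dict[str, type] = {}


def register_config(model_type: str, config_cls: type) -> None:
    """Map a model_type string to a locally defined config class."""
    existing = CONFIG_REGISTRY.get(model_type)
    if existing is not None and existing is not config_cls:
        raise ValueError(f"model_type {model_type!r} is already registered")
    CONFIG_REGISTRY[model_type] = config_cls


class LocalConfig:
    """Hypothetical local config class that accepts extra fields via **kwargs."""

    def __init__(self, hidden_size: int = 4096, **kwargs):
        self.hidden_size = hidden_size
        self.extra = kwargs  # unrecognized JSON fields are kept, not rejected


def resolve_config(config_json: dict):
    """Build a config from parsed JSON using only locally registered classes."""
    fields = dict(config_json)  # copy so the caller's dict is not mutated
    model_type = fields.pop("model_type")
    config_cls = CONFIG_REGISTRY[model_type]  # KeyError if nothing is registered
    return config_cls(**fields)


# Placeholder model_type; the real string comes from the hub's JSON config.
register_config("example_model_type", LocalConfig)

cfg = resolve_config(
    {"model_type": "example_model_type", "hidden_size": 4096, "use_cache": True}
)
print(type(cfg).__name__, cfg.hidden_size, cfg.extra)
```

Because the lookup is keyed on the string stored in the JSON, no code from the model repository has to be downloaded or executed to construct the config object.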