# nemo_automodel.components.models.qwen3_vl_moe.state_dict_adapter

## Module Contents

### Classes

| Name                                                                                                                         | Description                                                                     |
| ---------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| [`Qwen3VLMoeStateDictAdapter`](#nemo_automodel-components-models-qwen3_vl_moe-state_dict_adapter-Qwen3VLMoeStateDictAdapter) | Converts between HF Qwen3-VL-MoE checkpoints and grouped-experts native format. |

### API

```python
class nemo_automodel.components.models.qwen3_vl_moe.state_dict_adapter.Qwen3VLMoeStateDictAdapter(
    config: typing.Any,
    moe_config: nemo_automodel.components.moe.config.MoEConfig,
    backend: nemo_automodel.components.models.common.BackendConfig,
    dtype: torch.dtype = torch.float32
)
```

**Bases:** [StateDictAdapter](/nemo-automodel/nemo_automodel/components/checkpoint/state_dict_adapter#nemo_automodel-components-checkpoint-state_dict_adapter-StateDictAdapter)

Converts between HF Qwen3-VL-MoE checkpoints and grouped-experts native format.

HF checkpoint keys (experts already stacked, no `.weight` suffix):

- `model.language_model.layers.{L}.mlp.experts.gate_up_proj`, shape `[n_experts, dim, 2*inter]`
- `model.language_model.layers.{L}.mlp.experts.down_proj`, shape `[n_experts, inter, dim]`

Native format (identical shapes, different key names):

- `model.language_model.layers.{L}.mlp.experts.gate_and_up_projs`
- `model.language_model.layers.{L}.mlp.experts.down_projs`
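The key mapping above can be illustrated with a small, standalone sketch. It does not use the adapter itself: the rename table and the helper names (`rename_key`, `hf_to_native_key`, `native_to_hf_key`) are hypothetical and inferred only from the key names listed above, so the real implementation may differ.

```python
import re

# Hypothetical rename table inferred from the documented key names;
# not taken from the adapter's source.
_HF_TO_NATIVE = {
    "gate_up_proj": "gate_and_up_projs",
    "down_proj": "down_projs",
}
_NATIVE_TO_HF = {native: hf for hf, native in _HF_TO_NATIVE.items()}

# Matches "<anything>.mlp.experts.<leaf>" and captures the prefix and the leaf name.
_EXPERT_KEY = re.compile(r"^(?P<prefix>.*\.mlp\.experts\.)(?P<name>[A-Za-z_]+)$")


def rename_key(key: str, table: dict[str, str]) -> str:
    """Rename the leaf of an expert key using `table`; leave other keys untouched."""
    match = _EXPERT_KEY.match(key)
    if match is None or match.group("name") not in table:
        return key
    return match.group("prefix") + table[match.group("name")]


def hf_to_native_key(key: str) -> str:
    return rename_key(key, _HF_TO_NATIVE)


def native_to_hf_key(key: str) -> str:
    return rename_key(key, _NATIVE_TO_HF)
```

Because the shapes are identical in both formats, a rename in one direction followed by a rename in the other returns the original key, and keys outside `mlp.experts` are returned unchanged.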

```python
nemo_automodel.components.models.qwen3_vl_moe.state_dict_adapter.Qwen3VLMoeStateDictAdapter.convert_single_tensor_to_hf(
    fqn: str,
    tensor: typing.Any,
    kwargs = {}
) -> list[tuple[str, typing.Any]]
```

Renames a single native key to its HF name and returns the result as a list of `(key, tensor)` pairs. The tensor is passed through unchanged.

```python
nemo_automodel.components.models.qwen3_vl_moe.state_dict_adapter.Qwen3VLMoeStateDictAdapter.from_hf(
    hf_state_dict: dict[str, typing.Any],
    device_mesh: typing.Optional[torch.distributed.device_mesh.DeviceMesh] = None,
    kwargs = {}
) -> dict[str, typing.Any]
```

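Renames HF keys to native keys. How each tensor is handled depends on its type:

- DTensors (DCP path): keys are renamed only; no tensor operations are performed.
- Plain tensors (init path): each tensor is sliced to the local EP (expert-parallel) shard and wrapped in a DTensor.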
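The "slice to local EP shard" step on the plain-tensor path can be sketched as follows. This is a minimal illustration under stated assumptions, not the adapter's code: it assumes experts are split contiguously and evenly across EP ranks along the first (expert) dimension, and it omits the DTensor wrapping. The helper names `local_expert_range` and `slice_local_experts` are hypothetical.

```python
def local_expert_range(n_experts: int, ep_size: int, ep_rank: int) -> tuple[int, int]:
    """Return the [start, end) expert indices owned by `ep_rank`.

    Assumption: experts are partitioned contiguously and evenly across EP ranks.
    """
    if ep_size <= 0 or not 0 <= ep_rank < ep_size:
        raise ValueError("ep_rank must satisfy 0 <= ep_rank < ep_size")
    if n_experts % ep_size != 0:
        raise ValueError("n_experts must be divisible by ep_size in this sketch")
    per_rank = n_experts // ep_size
    start = ep_rank * per_rank
    return start, start + per_rank


def slice_local_experts(stacked, ep_size: int, ep_rank: int):
    """Slice a stacked expert container along its first (expert) dimension.

    Works with any object that supports len() and slicing on dim 0,
    such as a list or a tensor shaped [n_experts, ...].
    """
    start, end = local_expert_range(len(stacked), ep_size, ep_rank)
    return stacked[start:end]
```

With 8 experts and 4 EP ranks, for example, rank 1 would hold experts 2 and 3 under this partitioning.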

```python
nemo_automodel.components.models.qwen3_vl_moe.state_dict_adapter.Qwen3VLMoeStateDictAdapter.to_hf(
    state_dict: dict[str, typing.Any],
    exclude_key_regex: typing.Optional[str] = None,
    quantization: bool = False,
    kwargs = {}
) -> dict[str, typing.Any]
```

Renames native keys to HF keys. Tensors are passed through unchanged, so no communication is required.
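A rename-only export of this kind can be sketched in plain Python. The function below, `to_hf_sketch`, is a hypothetical stand-in and not the adapter's method. In particular, the handling of `exclude_key_regex` is an assumption: here, any key whose native name matches the pattern is dropped from the output. The source does not state the exact matching semantics, and the `quantization` flag is not modeled.

```python
import re
from typing import Any, Optional

# Hypothetical native -> HF leaf names, inferred from the documented key names.
_NATIVE_TO_HF = {
    "gate_and_up_projs": "gate_up_proj",
    "down_projs": "down_proj",
}


def to_hf_sketch(
    state_dict: dict[str, Any],
    exclude_key_regex: Optional[str] = None,
) -> dict[str, Any]:
    """Rename expert keys to HF names; values are passed through untouched."""
    pattern = re.compile(exclude_key_regex) if exclude_key_regex else None
    out: dict[str, Any] = {}
    for key, value in state_dict.items():
        # Assumption: keys matching the regex are skipped entirely.
        if pattern is not None and pattern.match(key):
            continue
        prefix, _, leaf = key.rpartition(".")
        if prefix.endswith(".mlp.experts") and leaf in _NATIVE_TO_HF:
            key = f"{prefix}.{_NATIVE_TO_HF[leaf]}"
        out[key] = value  # same object, no copy and no tensor ops
    return out
```

Since values are never copied or transformed, the objects in the returned dictionary are the same objects that were passed in.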