# nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter

## Module Contents

### Classes

| Name                                                                                                                      | Description                                   |
| ------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------- |
| [`KimiK25VLStateDictAdapter`](#nemo_automodel-components-models-kimi_k25_vl-state_dict_adapter-KimiK25VLStateDictAdapter) | State dict adapter for KimiK25VL checkpoints. |

### Functions

| Name                                                                                                    | Description                                              |
| ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------- |
| [`dequantize_int4`](#nemo_automodel-components-models-kimi_k25_vl-state_dict_adapter-dequantize_int4)   | Dequantize INT4 packed weights to bfloat16.              |
| [`quantize_to_int4`](#nemo_automodel-components-models-kimi_k25_vl-state_dict_adapter-quantize_to_int4) | Quantize bfloat16/float16 weights to INT4 packed format. |

### Data

[`LOGGER`](#nemo_automodel-components-models-kimi_k25_vl-state_dict_adapter-LOGGER)

### API

```python
class nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter(
    config,
    moe_config: nemo_automodel.components.moe.config.MoEConfig,
    backend: nemo_automodel.components.models.common.BackendConfig,
    dtype: torch.dtype = torch.float32
)
```

**Bases:** [MoESplitExpertsStateDictMixin](/nemo-automodel/nemo_automodel/components/moe/state_dict_mixin#nemo_automodel-components-moe-state_dict_mixin-MoESplitExpertsStateDictMixin), [StateDictAdapter](/nemo-automodel/nemo_automodel/components/checkpoint/state_dict_adapter#nemo_automodel-components-checkpoint-state_dict_adapter-StateDictAdapter)

State dict adapter for KimiK25VL checkpoints.

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter._expand_quantized_keys(
    state_dict: dict
) -> dict
```

Expand expert `weight` keys to INT4 triplets: `*_packed`, `*_scale`, and `*_shape`.

MoE expert weights are known to be INT4 quantized in the HF checkpoint.
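The key expansion described above can be sketched as follows. This is an illustration only, not the adapter's actual code: the helper name `expand_quantized_key`, the `.experts.` substring test, and the example key are assumptions made for this sketch.

```python
def expand_quantized_key(key: str) -> list[str]:
    """Hypothetical helper: map an expert 'weight' key to its INT4 triplet.

    Assumes expert weights can be recognized by '.experts.' in the key,
    which is an assumption for this sketch, not the adapter's real rule.
    """
    if ".experts." in key and key.endswith(".weight"):
        # 'x.weight' becomes 'x.weight_packed', 'x.weight_scale', 'x.weight_shape'
        return [f"{key}_{part}" for part in ("packed", "scale", "shape")]
    # Keys that are not quantized expert weights pass through unchanged
    return [key]


# Hypothetical key, used only to show the shape of the result
for name in expand_quantized_key("model.layers.1.mlp.experts.0.up_proj.weight"):
    print(name)
```

Keys that do not match the expert pattern are returned unchanged, so the sketch can be applied to every key in a state dict.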

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter._is_quantized_expert_key(
    key: str
) -> bool
```

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter.convert_single_tensor_to_hf(
    fqn: str,
    tensor: typing.Any,
    kwargs = {}
) -> list[tuple[str, typing.Any]]
```

Convert a single tensor from native format to HuggingFace format.

**Parameters:**

`fqn`: Fully qualified name of the tensor in native format.

`tensor`: The tensor to convert.

`kwargs`: Additional arguments for conversion.

**Returns:** `list[tuple[str, Any]]`

List of (fqn, tensor) tuples in HuggingFace format

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter.from_hf(
    state_dict: dict,
    kwargs = {}
) -> dict
```

Convert HF checkpoint state dict to model format.

This handles INT4 dequantization: `*_packed`, `*_scale`, and `*_shape` are combined into a single `weight`.

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.KimiK25VLStateDictAdapter.to_hf(
    state_dict: dict[str, typing.Any],
    exclude_key_regex: typing.Optional[str] = None,
    quantization: bool = False,
    kwargs = {}
) -> dict[str, typing.Any]
```

Convert from native model state dict to HuggingFace format.

If quantization=True, expert weights are quantized to INT4.

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.dequantize_int4(
    weight_packed: torch.Tensor,
    weight_scale: torch.Tensor,
    weight_shape: torch.Tensor,
    group_size: int = 32,
    device: str = 'cuda'
) -> torch.Tensor
```

Dequantize INT4 packed weights to bfloat16.

Extracts local tensors from DTensors before unpacking (bitwise ops don't work on DTensor).
Both `weight_packed` and `weight_scale` should have matching sharding so that `.to_local()` gives
corresponding slices automatically.

**Parameters:**

`weight_packed`: INT4 packed weights of shape `[out_features, in_features // 8]`; may be a DTensor.

`weight_scale`: Per-group scales of shape `[out_features, num_groups]`; should be a DTensor with the same sharding.

`weight_shape`: Original shape, a tensor of size `[2]` that stores the global dimensions.

`group_size`: Elements per scale group (default 32).

`device`: Target device for computation.
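To make the packed layout concrete, here is a minimal pure-Python sketch of unpacking and group-wise scaling for a single row. It is not the library's implementation, which operates on `torch` tensors and DTensors. The nibble order (lowest 4 bits first) and the offset encoding (stored nibble minus 8) are assumptions made for this sketch, and `unpack_int4` and `dequantize_row` are hypothetical names.

```python
def unpack_int4(packed_row: list[int], in_features: int) -> list[int]:
    """Unpack 8 signed 4-bit values from each 32-bit word.

    Assumptions (not confirmed by the source): the lowest nibble comes
    first, and values are stored with an offset of 8.
    """
    values = []
    for word in packed_row:
        word &= 0xFFFFFFFF  # view a possibly negative int32 as unsigned
        for i in range(8):
            nibble = (word >> (4 * i)) & 0xF
            values.append(nibble - 8)  # map 0..15 to -8..7
    return values[:in_features]  # drop any padding at the end of the row


def dequantize_row(packed_row, scales, in_features, group_size=32):
    """Multiply each unpacked value by the scale of its group."""
    q = unpack_int4(packed_row, in_features)
    return [v * scales[j // group_size] for j, v in enumerate(q)]


# One word holding the nibbles 0, 1, ..., 7 (lowest nibble first)
word = sum(i << (4 * i) for i in range(8))
print(dequantize_row([word], scales=[0.5], in_features=8))
```

With `in_features // 8` words per row and one scale per `group_size` elements, this mirrors the `[out_features, in_features // 8]` and `[out_features, num_groups]` shapes listed above.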

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.quantize_to_int4(
    weight: torch.Tensor,
    group_size: int = 32
) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]
```

Quantize bfloat16/float16 weights to INT4 packed format.

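**Returns:** `tuple[torch.Tensor, torch.Tensor, torch.Tensor]`

The first element holds the INT4 values packed into int32 (8 values per int32). The remaining two elements are not described here; by analogy with the inputs of `dequantize_int4`, they are presumably the per-group scales and the original shape.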

```python
nemo_automodel.components.models.kimi_k25_vl.state_dict_adapter.LOGGER = logging.getLogger(__name__)
```