nemo_automodel.components.models.deepseek_v32.state_dict_adapter

State dict adapter for DeepSeek V3.2.

Extends DeepSeekV3StateDictAdapter with mappings for the new Indexer weights.

Module Contents

Classes

Name | Description
DeepSeekV32StateDictAdapter | State dict adapter for DeepSeek V3.2.

API

class nemo_automodel.components.models.deepseek_v32.state_dict_adapter.DeepSeekV32StateDictAdapter()

Bases: DeepSeekV3StateDictAdapter

State dict adapter for DeepSeek V3.2.

_base_non_quantized_keys
_indexer_non_quantized_keys
_non_quantized_keys: list[str]

Get the full list of non-quantized keys including indexer keys.
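
As a rough illustration of how the three attributes above could relate, here is a minimal sketch. It assumes `_non_quantized_keys` is a property that concatenates the two lists; the class names and the key fragments are hypothetical placeholders, not the adapter's actual values.

```python
class BaseAdapterSketch:
    # Hypothetical key fragments; the real list is defined by the adapter.
    _base_non_quantized_keys = ["embed_tokens.weight", "norm.weight"]


class V32AdapterSketch(BaseAdapterSketch):
    # Hypothetical indexer entries added for V3.2.
    _indexer_non_quantized_keys = ["indexer.k_norm.weight", "indexer.k_norm.bias"]

    @property
    def _non_quantized_keys(self) -> list[str]:
        # Assumed behavior: base keys first, then the indexer keys.
        return self._base_non_quantized_keys + self._indexer_non_quantized_keys


keys = V32AdapterSketch()._non_quantized_keys
```

Building the combined list on access keeps the base list untouched, so the V3 behavior is unchanged when only the indexer list is edited.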

nemo_automodel.components.models.deepseek_v32.state_dict_adapter.DeepSeekV32StateDictAdapter._add_quantization_scale_inv_tensors(
state_dict: dict[str, typing.Any]
) -> dict[str, typing.Any]

Add quantization scale tensors, handling indexer-specific keys.
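
The idea behind adding scale tensors can be sketched without any tensor library by working on shapes only. The sketch below assumes block-wise scaling with 128x128 blocks and a `weight_scale_inv` naming suffix, as used in published DeepSeek V3 FP8 checkpoints; the function name, the key fragments, and the shape-only state dict are hypothetical and do not reflect the adapter's actual implementation.

```python
import math

BLOCK = 128  # assumed block size for block-wise FP8 scaling


def add_scale_inv_shapes(shapes, non_quantized_keys, block=BLOCK):
    """Return a copy of `shapes` with a `<name>_scale_inv` shape per quantized 2-D weight."""
    out = {}
    for name, shape in shapes.items():
        out[name] = shape
        is_2d_weight = name.endswith(".weight") and len(shape) == 2
        excluded = any(key in name for key in non_quantized_keys)
        if is_2d_weight and not excluded:
            rows, cols = shape
            # One scale per (block x block) tile, rounding up at the edges.
            out[name + "_scale_inv"] = (math.ceil(rows / block), math.ceil(cols / block))
    return out


shapes = {
    "model.embed_tokens.weight": (1000, 256),                      # excluded by key
    "model.layers.0.self_attn.q_proj.weight": (256, 512),          # quantized
    "model.layers.0.self_attn.indexer.k_norm.weight": (128,),      # excluded by key
}
result = add_scale_inv_shapes(shapes, ["embed_tokens", "indexer.k_norm"])
```

Here only `q_proj.weight` receives a companion entry; the embedding and the indexer norm weight are skipped because they match an excluded key fragment.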

nemo_automodel.components.models.deepseek_v32.state_dict_adapter.DeepSeekV32StateDictAdapter.convert_single_tensor_to_hf(
fqn: str,
tensor: typing.Any,
kwargs = {}
) -> list[tuple[str, typing.Any]]

Convert a single tensor from native format to HuggingFace format.

Handles both standard V3 tensors and V3.2 indexer tensors, ensuring indexer LayerNorm weights are not quantized.
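
The return type, a list of `(name, tensor)` pairs, allows one native tensor to map to several HuggingFace entries. The following is a minimal sketch of that shape under stated assumptions: the `model.` prefix rename, the `quantize` flag, the key fragments, and the `None` placeholder for the scale are all hypothetical, and no real quantization is performed.

```python
from typing import Any

# Hypothetical key fragments for weights that stay unquantized.
NON_QUANTIZED = ["embed_tokens", "indexer.k_norm"]


def convert_single_tensor_sketch(
    fqn: str, tensor: Any, quantize: bool = True
) -> list[tuple[str, Any]]:
    # Hypothetical rename step: ensure the HF-style "model." prefix.
    hf_name = fqn if fqn.startswith("model.") else "model." + fqn
    pairs = [(hf_name, tensor)]

    excluded = any(key in hf_name for key in NON_QUANTIZED)
    if quantize and hf_name.endswith(".weight") and not excluded:
        # A quantized weight yields a second entry for its scale;
        # None stands in for the actual scale tensor.
        pairs.append((hf_name + "_scale_inv", None))
    return pairs


norm_pairs = convert_single_tensor_sketch("layers.0.self_attn.indexer.k_norm.weight", [1.0])
proj_pairs = convert_single_tensor_sketch("layers.0.self_attn.q_proj.weight", [[1.0]])
```

In this sketch the indexer norm weight produces a single pair, while the projection weight produces two: the weight itself and its scale entry.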