nemo_automodel.components.models.deepseek_v3.state_dict_adapter
Module Contents
Classes
Functions
Data
API
Bases: MoESplitExpertsStateDictMixin, StateDictAdapter
Convert a single tensor from native format to HuggingFace format.
Parameters:
- Fully qualified name of the tensor in native format
- The tensor to convert
- Additional arguments for conversion
Returns: list[tuple[str, Any]]
List of (fqn, tensor) tuples in HuggingFace format
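As a rough illustration of why a single native tensor can map to several HuggingFace entries, the sketch below expands one grouped-expert entry into per-expert `(fqn, tensor)` pairs. The function name `split_grouped_experts` and the key layout it assumes are hypothetical and are not taken from the adapter; a plain list stands in for a tensor whose leading dimension indexes experts.

```python
from typing import Any


def split_grouped_experts(fqn: str, grouped: list[Any]) -> list[tuple[str, Any]]:
    """Expand one grouped-expert entry into per-expert (fqn, tensor) pairs.

    Assumption: the native key looks like "<prefix>.experts.<proj>" and the
    HF keys look like "<prefix>.experts.<i>.<proj>.weight". Both layouts are
    illustrative only.
    """
    prefix, sep, proj = fqn.rpartition(".experts.")
    if not sep:
        # Not an expert tensor: pass it through unchanged as a single pair.
        return [(fqn, grouped)]
    # One output entry per expert, indexed along the leading dimension.
    return [
        (f"{prefix}.experts.{i}.{proj}.weight", expert)
        for i, expert in enumerate(grouped)
    ]
```

A non-expert tensor yields a one-element list, so callers can treat every result uniformly as a list of pairs.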
Convert a HuggingFace (HF) checkpoint to native format.
- Dequantize FP8 tensors if scale_inv buffers are provided
- Aggregate per-expert weights into grouped tensors
- If device_mesh is provided, only load experts needed for the current rank
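The aggregation and rank-filtering steps above can be sketched as a key-grouping pass. This is a minimal illustration, not the adapter's code: the function name `group_expert_keys`, the grouped key name it produces, and the assumption that experts are split into equal contiguous ranges per rank are all hypothetical. It works on key names only, so no tensors are loaded.

```python
import re
from collections import defaultdict

# Matches HF-style per-expert keys such as
# "model.layers.3.mlp.experts.7.gate_proj.weight".
_EXPERT_KEY = re.compile(
    r"^(?P<prefix>.+\.mlp\.experts)\.(?P<idx>\d+)\.(?P<proj>\w+)\.weight$"
)


def group_expert_keys(hf_keys, n_experts, ep_rank=0, ep_size=1):
    """Group per-expert keys by projection, keeping only this rank's experts.

    Assumption: each rank owns a contiguous, equally sized range of experts.
    Returns {grouped_name: [per-expert keys ordered by expert index]}.
    """
    per_rank = n_experts // ep_size
    start, end = ep_rank * per_rank, (ep_rank + 1) * per_rank

    groups = defaultdict(list)
    for key in hf_keys:
        match = _EXPERT_KEY.match(key)
        if match is None:
            continue  # not an expert weight; handled elsewhere
        idx = int(match["idx"])
        if start <= idx < end:
            # Hypothetical grouped name: drop the expert index from the key.
            groups[f"{match['prefix']}.{match['proj']}"].append((idx, key))

    # Sort by expert index so stacking the tensors gives a stable order.
    return {name: [k for _, k in sorted(members)] for name, members in groups.items()}
```

Once the keys are grouped, the corresponding tensors could be loaded and stacked along a new leading dimension to form one grouped tensor per projection.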
Convert a native model state dict to HuggingFace format. The output format is detected automatically from the backend.dispatcher configuration.
Slice scale_inv tensor to match a DTensor weight’s local portion.
When weight is sharded via DTensor but scale_inv is a regular tensor, we need to extract only the scale blocks that correspond to the local portion of the weight.
Parameters:
- The full (global) scale_inv tensor
- The DTensor weight (has device_mesh and placements)
- The local portion of the weight
- The FP8 quantization block size (default 128)
Returns: torch.Tensor
The sliced scale_inv tensor matching the local weight’s blocks
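The index arithmetic behind this slicing can be sketched in a few lines. The example below is an assumption-laden illustration: the function name `slice_scale_blocks` is hypothetical, a nested list stands in for the 2-D scale_inv tensor, and the shard's starting offset is passed in directly, whereas the real function would derive it from the DTensor's device_mesh and placements.

```python
def slice_scale_blocks(scale_inv, offset, local_size, dim=0, block_size=128):
    """Return the scale blocks covering [offset, offset + local_size) along dim.

    scale_inv is a list of rows, one entry per block_size x block_size block
    of the global weight. offset and local_size are measured in weight
    elements, not in blocks.
    """
    first = offset // block_size
    # Ceiling division gives the exclusive end block, so a shard that ends
    # partway through a block still receives that block's scale.
    last = (offset + local_size + block_size - 1) // block_size
    if dim == 0:
        return scale_inv[first:last]
    return [row[first:last] for row in scale_inv]
```

For a weight with 512 rows sharded in half along dimension 0, the second shard starts at row 256 and therefore uses scale rows 2 and 3.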
Create a scale_inv tensor for a weight.
Note: scale_inv is always created as a regular tensor (not DTensor) because the scale_inv shape (based on 128x128 blocks) doesn’t align with DTensor sharding boundaries. During dequantization, _slice_scale_for_dtensor handles extracting the correct scale blocks for DTensor weights.
Parameters:
- The weight tensor (may be a DTensor)
- The FP8 quantization block size
Returns: torch.Tensor
scale_inv tensor with shape based on GLOBAL weight shape
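To make the shape relationship concrete, the sketch below computes the number of scale blocks for a 2-D weight from its global shape. The helper name `scale_inv_shape` is hypothetical, and the sketch covers only the shape; it says nothing about the values or dtype the adapter assigns.

```python
def scale_inv_shape(global_shape, block_size=128):
    """Number of block_size x block_size blocks needed to cover a 2-D weight.

    Uses ceiling division so a trailing partial block still gets a scale.
    """
    rows, cols = global_shape
    return (-(-rows // block_size), -(-cols // block_size))
```

With the default block size, a weight of global shape (7168, 2048) needs a (56, 16) scale_inv, regardless of how the weight itself is sharded across ranks.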
Check if a key should be quantized based on its name.