nemo_automodel.components.checkpoint.conversion_mapping#
Checkpoint conversion mappings for loading HuggingFace checkpoints.
This module provides conversion mappings for transforming checkpoint keys and tensors when loading models. It primarily uses the transformers library’s conversion_mapping module which handles both key renaming and tensor operations (merging/splitting).
For MoE models, the conversion handles:
- Key renaming from checkpoint format (e.g., block_sparse_moe.experts.X.w1) to model format (e.g., mlp.experts.gate_up_proj)
- Tensor merging for grouped expert formats (individual experts -> single 3D tensor)
The primary entry points are:
- get_checkpoint_conversion_mapping(model_type): Get conversion rules for a model type
- get_model_conversion_mapping(model, ...): Get all conversion rules for a model instance
- requires_tensor_merging(model_type): Check if model needs tensor operations
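A minimal usage sketch of these entry points, assuming the model type string comes from config.model_type; the "mixtral" value and the printed message are illustrative only:

```python
from nemo_automodel.components.checkpoint.conversion_mapping import (
    get_checkpoint_conversion_mapping,
    requires_tensor_merging,
)

model_type = "mixtral"  # normally read from config.model_type

# Returns None when no conversion mapping is defined for this model type.
conversions = get_checkpoint_conversion_mapping(model_type)

if requires_tensor_merging(model_type):
    # e.g. block_sparse_moe.experts.X.w1 entries are renamed and the individual
    # expert weights are merged into a single grouped 3D tensor.
    print(f"{model_type} needs key renaming plus tensor merging")
```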
Module Contents#
Functions#
| Function | Description |
|---|---|
| requires_tensor_merging | Check if a model type requires tensor merging during checkpoint loading. |
| get_checkpoint_conversion_mapping | Get the checkpoint conversion mapping for a given model type. |
| get_model_conversion_mapping | Get all weight conversion mappings for a model instance. |
| get_combined_key_mapping | Get combined key mapping for simple regex-based key renaming. |
| is_transformers_conversion_available | Check if transformers conversion mapping is available. |
Data#
API#
- nemo_automodel.components.checkpoint.conversion_mapping._TRANSFORMERS_AVAILABLE#
False
- nemo_automodel.components.checkpoint.conversion_mapping.MODELS_REQUIRING_TENSOR_MERGING#
None
- nemo_automodel.components.checkpoint.conversion_mapping.requires_tensor_merging(model_type: str) -> bool#
Check if a model type requires tensor merging during checkpoint loading.
Some MoE models store expert weights in grouped format (single 3D tensor for all experts) but checkpoints store individual expert weights. These models require tensor merging that cannot be done via simple key renaming.
- Parameters:
model_type – The model type string from config.model_type
- Returns:
True if the model type requires tensor merging during loading.
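A hedged sketch of acting on the result, choosing between the two conversion helpers documented below; the helper name pick_conversions is hypothetical, and note the two branches return different types (a list of converter objects vs. a key-mapping dict):

```python
from nemo_automodel.components.checkpoint.conversion_mapping import (
    get_combined_key_mapping,
    get_model_conversion_mapping,
    requires_tensor_merging,
)


def pick_conversions(model, model_type):
    # Hypothetical helper: grouped-expert MoE checkpoints need WeightConverter
    # objects (renaming + tensor merging); other models can use the simpler
    # regex-based key mapping.
    if requires_tensor_merging(model_type):
        return get_model_conversion_mapping(model)
    return get_combined_key_mapping(model_type)
```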
- nemo_automodel.components.checkpoint.conversion_mapping.get_checkpoint_conversion_mapping(
- model_type: str,
)#
Get the checkpoint conversion mapping for a given model type.
This returns a list of WeightConverter and/or WeightRenaming objects from transformers that define how to convert checkpoint keys and tensors to model state dict format.
- Parameters:
model_type – The model type string (e.g., “mixtral”, “qwen2_moe”, “phimoe”)
- Returns:
A list of WeightConverter/WeightRenaming objects defining the conversion, or None if no conversion mapping is defined for this model type.
Example:

```python
mapping = get_checkpoint_conversion_mapping("mixtral")
# Returns list with WeightRenaming for gate and WeightConverter
# for merging individual expert weights into grouped format
```
- nemo_automodel.components.checkpoint.conversion_mapping.get_model_conversion_mapping(
- model: torch.nn.Module,
- key_mapping: Optional[dict[str, str]] = None,
- hf_quantizer: Optional[object] = None,
- add_legacy: bool = True,
)#
Get all weight conversion mappings for a model instance.
This is the main entry point for getting conversion rules. It combines:
- Custom key_mapping if provided
- Model’s _checkpoint_conversion_mapping attribute (for VLMs)
- Model-type specific conversions (MoE merging, etc.)
- Legacy conversions (LayerNorm.gamma -> LayerNorm.weight, etc.)
- Quantizer-specific conversions if provided
- Parameters:
model – The model instance to get conversions for
key_mapping – Optional custom key mapping (source -> target patterns)
hf_quantizer – Optional HuggingFace quantizer with additional conversions
add_legacy – Whether to include legacy LayerNorm conversions (default True)
- Returns:
List of WeightConverter/WeightRenaming objects defining all conversions. Returns empty list if transformers is not available.
Example:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B")
conversions = get_model_conversion_mapping(model)
# Use conversions to transform checkpoint state dict
```
- nemo_automodel.components.checkpoint.conversion_mapping.get_combined_key_mapping(
- model_type: str,
- model_key_mapping: Optional[dict[str, str]] = None,
)#
Get combined key mapping for simple regex-based key renaming.
This is a simpler alternative to get_model_conversion_mapping that only handles key renaming (not tensor operations). Useful when you just need to rename keys without merging tensors.
Note: For MoE models that require tensor merging, use get_model_conversion_mapping instead, which returns WeightConverter objects that handle both renaming and merging.
- Parameters:
model_type – The model type string from config.model_type
model_key_mapping – Optional key mapping from the model’s _checkpoint_conversion_mapping attribute
- Returns:
Combined key mapping dictionary (regex pattern -> replacement), or None if no mappings are defined.
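A hedged sketch of applying the returned mapping with Python's re module; the extra mapping and the checkpoint key shown are hypothetical:

```python
import re

from nemo_automodel.components.checkpoint.conversion_mapping import get_combined_key_mapping

# Hypothetical extra mapping, e.g. taken from a model's _checkpoint_conversion_mapping.
extra = {r"^language_model\.": "model."}

key_mapping = get_combined_key_mapping("llama", model_key_mapping=extra)
if key_mapping is not None:
    old_key = "language_model.layers.0.self_attn.q_proj.weight"  # illustrative key
    new_key = old_key
    for pattern, replacement in key_mapping.items():
        new_key = re.sub(pattern, replacement, new_key)
    # -> "model.layers.0.self_attn.q_proj.weight" (assuming no other rules match)
```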
- nemo_automodel.components.checkpoint.conversion_mapping.is_transformers_conversion_available() -> bool#
Check if transformers conversion mapping is available.
- Returns:
True if transformers library with conversion_mapping module is available.
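A small sketch of the documented fallback behavior; the Linear module is a stand-in for a real HuggingFace model, and since get_model_conversion_mapping already returns an empty list when transformers is unavailable, the explicit guard is for illustration only:

```python
import torch

from nemo_automodel.components.checkpoint.conversion_mapping import (
    get_model_conversion_mapping,
    is_transformers_conversion_available,
)

model = torch.nn.Linear(8, 8)  # stand-in; a real HF model would be used here

if is_transformers_conversion_available():
    conversions = get_model_conversion_mapping(model)
else:
    conversions = []  # mirrors the documented behavior when transformers is missing
```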