nemo_automodel.components.checkpoint.conversion_mapping#

Checkpoint conversion mappings for loading HuggingFace checkpoints.

This module provides conversion mappings for transforming checkpoint keys and tensors when loading models. It primarily uses the transformers library’s conversion_mapping module, which handles both key renaming and tensor operations (merging/splitting).

For MoE models, the conversion handles:

  • Key renaming from checkpoint format (e.g., block_sparse_moe.experts.X.w1) to model format (e.g., mlp.experts.gate_up_proj)

  • Tensor merging for grouped expert formats (individual experts -> single 3D tensor), as sketched below
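
A minimal sketch of what this merge amounts to, assuming made-up shapes and names; the real stacking order, concatenation axis, and any transposition are defined by the transformers WeightConverter objects:

    import torch

    num_experts, hidden, intermediate = 4, 16, 32  # illustrative sizes only
    # Per-expert 2D tensors as stored in the checkpoint (e.g. ...experts.X.w1 / w3).
    w1 = [torch.randn(intermediate, hidden) for _ in range(num_experts)]  # gate projections
    w3 = [torch.randn(intermediate, hidden) for _ in range(num_experts)]  # up projections

    # Stack experts along a new leading dimension and concatenate the gate/up halves,
    # yielding one grouped 3D tensor in the spirit of mlp.experts.gate_up_proj.
    gate_up_proj = torch.cat([torch.stack(w1), torch.stack(w3)], dim=1)
    print(gate_up_proj.shape)  # torch.Size([4, 64, 16])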

The primary entry points are (see the usage sketch after this list):

  • get_checkpoint_conversion_mapping(model_type): Get conversion rules for a model type

  • get_model_conversion_mapping(model, ...): Get all conversion rules for a model instance

  • requires_tensor_merging(model_type): Check if model needs tensor operations
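
A minimal usage sketch showing how these entry points might be combined before loading a checkpoint; the model type string and the fallback behaviour are assumptions for illustration:

    from nemo_automodel.components.checkpoint.conversion_mapping import (
        get_checkpoint_conversion_mapping,
        is_transformers_conversion_available,
        requires_tensor_merging,
    )

    model_type = "mixtral"  # would normally come from config.model_type
    if is_transformers_conversion_available() and requires_tensor_merging(model_type):
        # Expert weights need merging, so fetch the per-model-type converter list.
        rules = get_checkpoint_conversion_mapping(model_type)
    else:
        rules = None  # no model-type specific conversion registered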

Module Contents#

Functions#

  • requires_tensor_merging: Check if a model type requires tensor merging during checkpoint loading.

  • get_checkpoint_conversion_mapping: Get the checkpoint conversion mapping for a given model type.

  • get_model_conversion_mapping: Get all weight conversion mappings for a model instance.

  • get_combined_key_mapping: Get combined key mapping for simple regex-based key renaming.

  • is_transformers_conversion_available: Check if transformers conversion mapping is available.

Data#

API#

nemo_automodel.components.checkpoint.conversion_mapping._TRANSFORMERS_AVAILABLE#

False

nemo_automodel.components.checkpoint.conversion_mapping.MODELS_REQUIRING_TENSOR_MERGING#

None

nemo_automodel.components.checkpoint.conversion_mapping.requires_tensor_merging(model_type: str) -> bool#

Check if a model type requires tensor merging during checkpoint loading.

Some MoE models store expert weights in grouped format (single 3D tensor for all experts) but checkpoints store individual expert weights. These models require tensor merging that cannot be done via simple key renaming.

Parameters:

model_type – The model type string from config.model_type

Returns:

True if the model type requires tensor merging during loading.
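
For example, a caller deciding between the rename-only path and the merging path might check the model type like this (a small sketch; "mixtral" stands in for a value read from config.model_type):

    from nemo_automodel.components.checkpoint.conversion_mapping import requires_tensor_merging

    if requires_tensor_merging("mixtral"):
        # Per-expert checkpoint tensors must be merged into grouped 3D tensors,
        # so simple key renaming alone is not enough for this model type.
        print("use get_model_conversion_mapping() for this model")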

nemo_automodel.components.checkpoint.conversion_mapping.get_checkpoint_conversion_mapping(
model_type: str,
) -> Optional[list]#

Get the checkpoint conversion mapping for a given model type.

This returns a list of WeightConverter and/or WeightRenaming objects from transformers that define how to convert checkpoint keys and tensors to model state dict format.

Parameters:

model_type – The model type string (e.g., “mixtral”, “qwen2_moe”, “phimoe”)

Returns:

A list of WeightConverter/WeightRenaming objects defining the conversion, or None if no conversion mapping is defined for this model type.

Example:

    mapping = get_checkpoint_conversion_mapping("mixtral")
    # Returns a list with a WeightRenaming for the gate and a WeightConverter
    # for merging individual expert weights into the grouped format

nemo_automodel.components.checkpoint.conversion_mapping.get_model_conversion_mapping(
model: torch.nn.Module,
key_mapping: Optional[dict[str, str]] = None,
hf_quantizer: Optional[object] = None,
add_legacy: bool = True,
) -> list#

Get all weight conversion mappings for a model instance.

This is the main entry point for getting conversion rules. It combines:

  1. Custom key_mapping if provided

  2. Model’s _checkpoint_conversion_mapping attribute (for VLMs)

  3. Model-type specific conversions (MoE merging, etc.)

  4. Legacy conversions (LayerNorm.gamma -> LayerNorm.weight, etc.)

  5. Quantizer-specific conversions if provided

Parameters:
  • model – The model instance to get conversions for

  • key_mapping – Optional custom key mapping (source -> target patterns)

  • hf_quantizer – Optional HuggingFace quantizer with additional conversions

  • add_legacy – Whether to include legacy LayerNorm conversions (default True)

Returns:

A list of WeightConverter/WeightRenaming objects defining all conversions, or an empty list if transformers is not available.

Example:

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B")
    conversions = get_model_conversion_mapping(model)
    # Use the conversions to transform the checkpoint state dict

nemo_automodel.components.checkpoint.conversion_mapping.get_combined_key_mapping(
model_type: str,
model_key_mapping: Optional[dict[str, str]] = None,
) -> Optional[dict[str, str]]#

Get combined key mapping for simple regex-based key renaming.

This is a simpler alternative to get_model_conversion_mapping that only handles key renaming (not tensor operations). Useful when you just need to rename keys without merging tensors.

Note: For MoE models that require tensor merging, use get_model_conversion_mapping instead, which returns WeightConverter objects that handle both renaming and merging.

Parameters:
  • model_type – The model type string from config.model_type

  • model_key_mapping – Optional key mapping from the model’s _checkpoint_conversion_mapping attribute

Returns:

Combined key mapping dictionary (regex pattern -> replacement), or None if no mappings are defined.
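
A short sketch of applying such a rename-only mapping with re.sub; the model type and the example key are assumptions for illustration:

    import re

    from nemo_automodel.components.checkpoint.conversion_mapping import get_combined_key_mapping

    key_mapping = get_combined_key_mapping("qwen2_moe") or {}  # may be None if nothing is defined
    checkpoint_key = "model.layers.0.mlp.gate.weight"  # illustrative key only
    for pattern, replacement in key_mapping.items():
        checkpoint_key = re.sub(pattern, replacement, checkpoint_key)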

nemo_automodel.components.checkpoint.conversion_mapping.is_transformers_conversion_available() -> bool#

Check if transformers conversion mapping is available.

Returns:

True if the transformers library’s conversion_mapping module is available.
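
A minimal sketch of guarding the conversion path on this check; the plain torch.nn.Linear below stands in for a real model instance:

    import torch

    from nemo_automodel.components.checkpoint.conversion_mapping import (
        get_model_conversion_mapping,
        is_transformers_conversion_available,
    )

    model = torch.nn.Linear(4, 4)  # stand-in for a real HF model instance

    # Fall back to an empty rule list when the transformers converters are unavailable.
    conversions = get_model_conversion_mapping(model) if is_transformers_conversion_available() else []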