nemo_automodel.components.checkpoint.conversion_mapping

Checkpoint conversion mappings for loading HuggingFace checkpoints.

This module provides conversion mappings that transform checkpoint keys and tensors when loading models. It relies primarily on the conversion_mapping module of the transformers library, which handles both key renaming and tensor operations (merging and splitting).

For MoE models, the conversion handles:

  • Key renaming from checkpoint format (e.g., block_sparse_moe.experts.X.w1) to model format (e.g., mlp.experts.gate_up_proj)
  • Tensor merging for grouped expert formats (individual experts -> single 3D tensor)
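
The two steps above can be illustrated with a small pure-Python sketch. It is not the conversion this module or transformers performs: the layer prefix, the helper name merge_expert_weights, and the use of nested lists in place of tensors are assumptions made for illustration, and the real rules may combine further projections into the merged tensor. It only shows per-expert keys being collected and stacked, in expert order, under one renamed key.

```python
import re
from collections import defaultdict

# Matches per-expert keys in the checkpoint layout named above.
# The captured prefix (e.g. "model.layers.0.") is an illustrative assumption.
EXPERT_KEY = re.compile(r"^(?P<prefix>.*)block_sparse_moe\.experts\.(?P<idx>\d+)\.w1$")


def merge_expert_weights(state_dict):
    """Stack per-expert 2D weights into one 3D nested list per layer (hypothetical helper)."""
    grouped = defaultdict(dict)
    merged = {}
    for key, value in state_dict.items():
        match = EXPERT_KEY.match(key)
        if match is None:
            merged[key] = value  # non-expert keys pass through unchanged
            continue
        grouped[match["prefix"]][int(match["idx"])] = value
    for prefix, experts in grouped.items():
        # Sort by expert index so position i in the stack holds expert i.
        stacked = [experts[i] for i in sorted(experts)]
        merged[prefix + "mlp.experts.gate_up_proj"] = stacked
    return merged


checkpoint = {
    "model.layers.0.block_sparse_moe.experts.1.w1": [[5, 6], [7, 8]],
    "model.layers.0.block_sparse_moe.experts.0.w1": [[1, 2], [3, 4]],
    "model.layers.0.input_layernorm.weight": [1, 1],
}
merged = merge_expert_weights(checkpoint)
```

Simple key renaming cannot express this step because several source keys collapse into a single target entry.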

The primary entry points are:

  • get_checkpoint_conversion_mapping(model_type): Get conversion rules for a model type
  • get_model_conversion_mapping(model, ...): Get all conversion rules for a model instance
  • requires_tensor_merging(model_type): Check if model needs tensor operations

Module Contents

Functions

Name                                  Description
get_checkpoint_conversion_mapping     Get the checkpoint conversion mapping for a given model type.
get_combined_key_mapping              Get combined key mapping for simple regex-based key renaming.
get_model_conversion_mapping          Get all weight conversion mappings for a model instance.
is_transformers_conversion_available  Check if transformers conversion mapping is available.
requires_tensor_merging               Check if a model type requires tensor merging during checkpoint loading.

Data

MODELS_REQUIRING_TENSOR_MERGING

_TRANSFORMERS_AVAILABLE

_VLM_KEY_MAPPINGS

API

nemo_automodel.components.checkpoint.conversion_mapping.get_checkpoint_conversion_mapping(
model_type: str
) -> typing.Optional[list]

Get the checkpoint conversion mapping for a given model type.

This returns a list of WeightConverter and/or WeightRenaming objects from transformers that define how to convert checkpoint keys and tensors to model state dict format.

Parameters:

model_type
str

The model type string (e.g., “mixtral”, “qwen2_moe”, “phimoe”)

Returns: Optional[list]

A list of WeightConverter/WeightRenaming objects defining the conversion, or None when no conversion mapping is available.

nemo_automodel.components.checkpoint.conversion_mapping.get_combined_key_mapping(
model_type: str,
model_key_mapping: typing.Optional[dict[str, str]] = None
) -> typing.Optional[dict[str, str]]

Get combined key mapping for simple regex-based key renaming.

This is a simpler alternative to get_model_conversion_mapping that only handles key renaming (not tensor operations). Useful when you just need to rename keys without merging tensors.

Note: For MoE models that require tensor merging, use get_model_conversion_mapping instead, which returns WeightConverter objects that handle both renaming and merging.

Parameters:

model_type
str

The model type string from config.model_type

model_key_mapping
Optional[dict[str, str]], defaults to None

Optional key mapping from the model’s _checkpoint_conversion_mapping attribute

Returns: Optional[dict[str, str]]

Combined key mapping dictionary (regex pattern -> replacement), or None when no key mapping is available.
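
As an illustration of how a regex pattern -> replacement dictionary of this kind can be applied, here is a minimal sketch. The single pattern is copied from the visible part of the gemma3 entry in _VLM_KEY_MAPPINGS on this page; the helper rename_keys and the first-match-wins rule are assumptions, not the module's documented behavior.

```python
import re

# One pattern from the gemma3 entry shown on this page; the remaining
# entries are truncated in the listing and are not reproduced here.
key_mapping = {r"^language_model\.model\.": "model.language_model."}


def rename_keys(state_dict, key_mapping):
    """Rename keys using the first matching regex pattern (hypothetical helper)."""
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in key_mapping.items():
            new_key, count = re.subn(pattern, replacement, key)
            if count:
                break  # assumption: stop after the first pattern that matches
        renamed[new_key] = value
    return renamed


checkpoint = {
    "language_model.model.layers.0.self_attn.q_proj.weight": "tensor-a",
    "lm_head.weight": "tensor-b",
}
renamed = rename_keys(checkpoint, key_mapping)
```

Keys that match no pattern keep their original name, and tensor values are never touched, which is why this approach is not sufficient for models that need merging.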

nemo_automodel.components.checkpoint.conversion_mapping.get_model_conversion_mapping(
model: torch.nn.Module,
key_mapping: typing.Optional[dict[str, str]] = None,
hf_quantizer: typing.Optional[object] = None,
add_legacy: bool = True
) -> list

Get all weight conversion mappings for a model instance.

This is the main entry point for getting conversion rules. It combines:

  1. Custom key_mapping if provided
  2. Model’s _checkpoint_conversion_mapping attribute (for VLMs)
  3. Model-type specific conversions (MoE merging, etc.)
  4. Legacy conversions (LayerNorm.gamma -> LayerNorm.weight, etc.)
  5. Quantizer-specific conversions if provided
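
A rough sketch of how rename rules from several of these sources might be gathered into one ordered list is given below. It covers only regex renames (sources 1, 2, and 4); model-type and quantizer conversions are left out. The helper names, the tuple representation, and the application order are assumptions for illustration, and the actual function returns WeightConverter/WeightRenaming objects instead.

```python
import re

# The legacy rename named in the list above, written as (pattern, replacement).
LEGACY_RENAMES = [(r"LayerNorm\.gamma$", "LayerNorm.weight")]


def collect_renames(key_mapping=None, model_mapping=None, add_legacy=True):
    """Gather rename rules in a fixed order (hypothetical helper)."""
    rules = []
    rules.extend((key_mapping or {}).items())    # custom mapping
    rules.extend((model_mapping or {}).items())  # mapping carried by the model
    if add_legacy:
        rules.extend(LEGACY_RENAMES)             # legacy LayerNorm renames
    return rules


def apply_renames(key, rules):
    """Apply every rule in order to a single key."""
    for pattern, replacement in rules:
        key = re.sub(pattern, replacement, key)
    return key


rules = collect_renames(key_mapping={r"^encoder\.": "model.encoder."})
converted = apply_renames("encoder.LayerNorm.gamma", rules)
untouched = apply_renames("encoder.LayerNorm.gamma", collect_renames(add_legacy=False))
```

Passing add_legacy=False in this sketch leaves the LayerNorm.gamma key unchanged, mirroring the purpose of the add_legacy parameter.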

Parameters:

model
nn.Module

The model instance to get conversions for

key_mapping
Optional[dict[str, str]], defaults to None

Optional custom key mapping (source -> target patterns)

hf_quantizer
Optional[object], defaults to None

Optional HuggingFace quantizer with additional conversions

add_legacy
bool, defaults to True

Whether to include legacy LayerNorm conversions (default True)

Returns: list

List of WeightConverter/WeightRenaming objects defining all conversions.

nemo_automodel.components.checkpoint.conversion_mapping.is_transformers_conversion_available() -> bool

Check if transformers conversion mapping is available.

Returns: bool

True if transformers library with conversion_mapping module is available.
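
An availability check of this kind is commonly written as a guarded import. The sketch below shows that general pattern only; the helper module_available is hypothetical, and how this module actually sets _TRANSFORMERS_AVAILABLE is not stated on this page.

```python
import importlib


def module_available(module_name: str) -> bool:
    """Return True if the named module can be imported (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return False
    return True


stdlib_ok = module_available("json")
missing_ok = module_available("no_such_module_for_this_sketch")
```

Under the assumption that the dependency lives at transformers.conversion_mapping, calling module_available("transformers.conversion_mapping") would be the corresponding check.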

nemo_automodel.components.checkpoint.conversion_mapping.requires_tensor_merging(
model_type: str
) -> bool

Check if a model type requires tensor merging during checkpoint loading.

Some MoE model implementations hold expert weights in a grouped format (a single 3D tensor covering all experts), while their checkpoints store each expert's weights separately. Loading such a model requires merging tensors, which simple key renaming cannot do.

Parameters:

model_type
str

The model type string from config.model_type

Returns: bool

True if the model type requires tensor merging during loading.
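
One way to use such a check is to decide between plain key renaming and the full converter path. The sketch below assumes the check is a set membership test, which this page does not confirm. The set contains only the names visible in the truncated MODELS_REQUIRING_TENSOR_MERGING listing, and needs_merging, choose_strategy, and "some_dense_model" are hypothetical.

```python
# Only the entries visible in the truncated listing on this page;
# the real set contains further model types.
VISIBLE_MERGING_MODEL_TYPES = {
    "mixtral", "minimax", "phimoe", "qwen2_moe", "qwen3_moe", "deepseek_v2",
}


def needs_merging(model_type: str) -> bool:
    """Assumed behavior: a plain membership test on the model type string."""
    return model_type in VISIBLE_MERGING_MODEL_TYPES


def choose_strategy(model_type: str) -> str:
    """Pick a loading path based on whether tensors must be merged."""
    if needs_merging(model_type):
        return "weight_converters"  # renaming plus tensor merging
    return "key_renaming"           # regex renaming is enough


moe_strategy = choose_strategy("mixtral")
dense_strategy = choose_strategy("some_dense_model")
```

This mirrors the note under get_combined_key_mapping: model types that need merging should go through get_model_conversion_mapping, while the others can use the simpler key mapping.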

nemo_automodel.components.checkpoint.conversion_mapping.MODELS_REQUIRING_TENSOR_MERGING = {'mixtral', 'minimax', 'phimoe', 'qwen2_moe', 'qwen3_moe', 'deepseek_v2', 'deeps...
nemo_automodel.components.checkpoint.conversion_mapping._TRANSFORMERS_AVAILABLE = True
nemo_automodel.components.checkpoint.conversion_mapping._VLM_KEY_MAPPINGS: dict[str, dict[str, str]] = {'gemma3': {'^language_model\\.model\\.': 'model.language_model.', '^vision_towe...