bridge.models.conversion.utils#

Module Contents#

Functions#

weights_verification_table

Returns a table comparing weights between a Hugging Face model and a Megatron-LM model.

get_module_and_param_from_name

Get a parameter from a specific VP stage, ensuring that parameter attributes are preserved. Supports both absolute and relative parameter names.

remove_non_pickleables

Remove non-pickleable objects from a configuration object recursively.

extract_sort_key

Extract a sorting key based on layer and expert numbers.

API#

bridge.models.conversion.utils.weights_verification_table(bridge, megatron_model) rich.table.Table#

Returns a table comparing weights between a Hugging Face model and a Megatron-LM model.

Parameters:
  • bridge (AutoBridge) – The bridge object containing model information.

  • megatron_model – The Megatron-LM model instance.

Returns:

A rich Table object with the comparison.

Return type:

Table
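
A minimal usage sketch follows. It assumes a `bridge` (an AutoBridge) and a loaded `megatron_model` already exist; how they are constructed is out of scope here, and only the call to the documented function plus rendering with rich is shown.

```python
from rich.console import Console

from bridge.models.conversion.utils import weights_verification_table

# `bridge` (an AutoBridge) and `megatron_model` are assumed to have been
# constructed elsewhere; this sketch only shows the comparison step.
table = weights_verification_table(bridge, megatron_model)

# Render the per-weight comparison table in the terminal.
Console().print(table)
```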

bridge.models.conversion.utils.get_module_and_param_from_name(
models: megatron.core.transformer.module.MegatronModule | List[megatron.core.transformer.module.MegatronModule],
param_name: str,
vp_stage: Optional[int] = None,
) Tuple[torch.nn.Module, torch.Tensor] | Tuple[torch.nn.Module, torch.Tensor, Tuple]#

Get a parameter from a specific VP stage, ensuring that parameter attributes are preserved. Supports both absolute and relative parameter names.

Parameters:
  • models – List of Megatron model instances or a submodule

  • param_name – Dot-separated parameter name (can be absolute or relative to models)

  • vp_stage – Virtual pipeline stage index (None for single stage)

Returns:

Tuple of (module, parameter) where module owns the parameter

Raises:

ValueError – If vp_stage is out of range or parameter doesn’t exist

Examples

Basic usage with full model:

>>> module, param = get_module_and_param_from_name(
...     models=full_model,
...     param_name="transformer.layers.0.attention.query.weight"
... )

Usage with model list and VP stage:

>>> module, param = get_module_and_param_from_name(
...     models=[model1, model2, model3],
...     param_name="layers.0.mlp.dense.bias",
...     vp_stage=1
... )

Usage with submodule and relative path:

>>> linear_module = model.transformer.layers[0].mlp.dense
>>> module, param = get_module_and_param_from_name(
...     models=linear_module,
...     param_name="weight"
... )

Usage with submodule and absolute path (automatic suffix matching):

>>> linear_module = model.transformer.layers[0].mlp.dense
>>> module, param = get_module_and_param_from_name(
...     models=linear_module,
...     param_name="transformer.layers.0.mlp.dense.weight"
... )

Automatically matches the "weight" suffix and returns the parameter.

Edge case with partial path matching:

>>> attention_module = model.transformer.layers[0].attention
>>> module, param = get_module_and_param_from_name(
...     models=attention_module,
...     param_name="layers.0.attention.query.weight"
... )

Matches the "query.weight" suffix within the attention module.

bridge.models.conversion.utils.remove_non_pickleables(
obj,
max_depth: int = 2,
current_depth: int = 0,
)#

Remove non-pickleable objects from a configuration object recursively.

This utility function identifies and removes objects that cannot be pickled for inter-process communication, including functions, bound methods, partial functions, and other problematic callables.

Parameters:
  • obj – The object to clean

  • max_depth – Maximum recursion depth (default: 2)

  • current_depth – Current recursion depth (internal use)

Returns:

The cleaned object with non-pickleables removed
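
The sketch below illustrates the intended use under stated assumptions: `ExampleConfig` and its `activation_func` field are hypothetical stand-ins for a real configuration object, and the final pickling step reflects the documented intent rather than a guaranteed outcome.

```python
import pickle
from dataclasses import dataclass
from typing import Callable, Optional

from bridge.models.conversion.utils import remove_non_pickleables


@dataclass
class ExampleConfig:
    # Hypothetical configuration object; not part of the library.
    hidden_size: int = 1024
    activation_func: Optional[Callable] = None


# A lambda cannot be pickled by the standard library's pickle module.
cfg = ExampleConfig(activation_func=lambda x: x * 2)

# Recursively strip callables and other non-pickleable attributes (up to max_depth).
cleaned = remove_non_pickleables(cfg, max_depth=2)

# The cleaned object is intended to survive pickling for inter-process transfer.
payload = pickle.dumps(cleaned)
```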

bridge.models.conversion.utils.extract_sort_key(param_name: str)#

Extract a sorting key based on layer and expert numbers.
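
A short sketch of the expected use as a key function for sorted(); the parameter names below are hypothetical and only need to contain layer and expert indices.

```python
from bridge.models.conversion.utils import extract_sort_key

# Hypothetical parameter names containing layer and expert indices.
param_names = [
    "decoder.layers.10.mlp.experts.2.weight",
    "decoder.layers.2.mlp.experts.10.weight",
    "decoder.layers.2.mlp.experts.0.weight",
]

# Order numerically by (layer, expert) rather than lexicographically,
# so layer 2 sorts before layer 10.
ordered = sorted(param_names, key=extract_sort_key)
```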