bridge.models.glm.glm_moe_mappings#
GLM MoE mapping helpers for fused expert weights in transformers 5.0+.
Module Contents#
Classes#
| Class | Description |
|---|---|
| `GLMExpertGateUpProjMapping` | Mapping for fused expert gate+up projection weights. |
| `GLMExpertDownProjMapping` | Mapping for fused expert down projection weights. |
Functions#

| Function | Description |
|---|---|
| `_select_expert_weight` | Select a single expert's weight from a fused HF weight tensor. |
| `_align_weight_to_shape` | Align a weight tensor to a target shape. |
API#
bridge.models.glm.glm_moe_mappings._select_expert_weight(hf_weights: torch.Tensor, expert_idx: int)#
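The idea behind expert selection can be sketched as follows. This is a hypothetical stand-in, not the library's implementation; the `[num_experts, out, in]` stacked layout for fused HF expert weights is an assumption:

```python
import torch

def select_expert_weight(hf_weights: torch.Tensor, expert_idx: int) -> torch.Tensor:
    # Assumed fused layout: all experts stacked along a leading
    # num_experts dimension, e.g. [num_experts, out_features, in_features].
    return hf_weights[expert_idx]

fused = torch.randn(8, 4, 16)          # 8 experts, each a [4, 16] weight
w3 = select_expert_weight(fused, 3)
print(w3.shape)                        # torch.Size([4, 16])
```

Indexing the leading dimension returns a view, so no copy is made until the caller needs a contiguous tensor.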
bridge.models.glm.glm_moe_mappings._align_weight_to_shape(weight: torch.Tensor, target_shape: torch.Size, name: str)#
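A minimal sketch of what such an alignment helper might do, assuming it only needs to handle exact matches and transposed 2-D weights (the actual behavior of `_align_weight_to_shape` may differ):

```python
import torch

def align_weight_to_shape(weight: torch.Tensor, target_shape: torch.Size,
                          name: str) -> torch.Tensor:
    # Return the weight unchanged if it already matches; try a transpose
    # for swapped 2-D dims; otherwise fail with the parameter name.
    if weight.shape == target_shape:
        return weight
    if weight.ndim == 2 and weight.T.shape == target_shape:
        return weight.T.contiguous()
    raise ValueError(
        f"{name}: cannot align {tuple(weight.shape)} to {tuple(target_shape)}"
    )

w = torch.randn(16, 4)
aligned = align_weight_to_shape(w, torch.Size([4, 16]), "linear_fc2.weight")
print(aligned.shape)                   # torch.Size([4, 16])
```

Passing `name` through to the error message makes shape mismatches traceable to a specific parameter during conversion.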
class bridge.models.glm.glm_moe_mappings._LooseGatedMLPMapping#

Bases: megatron.bridge.models.conversion.param_mapping.GatedMLPMapping

- _validate_patterns(*args, **kwargs)#
class bridge.models.glm.glm_moe_mappings.GLMExpertGateUpProjMapping(megatron_param: str, hf_param: str, permute_dims=None)#

Bases: megatron.bridge.models.conversion.param_mapping.AutoMapping

Mapping for fused expert gate+up projection weights.

Initialization

- hf_to_megatron(hf_weights: torch.Tensor, megatron_module: torch.nn.Module)#
- megatron_to_hf(megatron_weights: torch.Tensor, megatron_module: torch.nn.Module)#
- _validate_patterns(*args, **kwargs)#
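The gate+up round trip can be illustrated with a simplified sketch. Concatenating gate and up along the output dimension is a common Megatron GatedMLP convention, but the exact interleaving used by this mapping is an assumption here, and the function names are hypothetical:

```python
import torch

def hf_to_megatron_gate_up(gate: torch.Tensor, up: torch.Tensor) -> torch.Tensor:
    # Assumed Megatron convention: linear_fc1 stores gate and up
    # concatenated along the output dimension (dim 0).
    return torch.cat([gate, up], dim=0)

def megatron_to_hf_gate_up(fused: torch.Tensor):
    # Inverse direction: split the fused fc1 weight back into halves.
    half = fused.shape[0] // 2
    return fused[:half], fused[half:]

gate = torch.randn(32, 16)
up = torch.randn(32, 16)
fused = hf_to_megatron_gate_up(gate, up)    # shape [64, 16]
g2, u2 = megatron_to_hf_gate_up(fused)
assert torch.equal(g2, gate) and torch.equal(u2, up)
```

Because the two directions are exact inverses, converting HF → Megatron → HF reproduces the original weights bit-for-bit.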
class bridge.models.glm.glm_moe_mappings.GLMExpertDownProjMapping(megatron_param: str, hf_param: str, permute_dims=None)#

Bases: megatron.bridge.models.conversion.param_mapping.AutoMapping

Mapping for fused expert down projection weights.

Initialization

- hf_to_megatron(hf_weights: torch.Tensor, megatron_module: torch.nn.Module)#
- _validate_patterns(*args, **kwargs)#
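For the down projection, the reverse of per-expert selection is re-stacking. A minimal sketch, again assuming the fused HF tensor carries a leading `num_experts` dimension (the per-expert `[out, in]` shape is also an assumption):

```python
import torch

# Per-expert Megatron down-projection weights (assumed [out, in] each).
experts = [torch.full((4, 16), float(i)) for i in range(8)]

# Stack them back into one fused HF tensor with a leading expert dim.
fused = torch.stack(experts, dim=0)    # [8, 4, 16]

# Round trip: slicing expert i recovers its original weight.
assert torch.equal(fused[5], experts[5])
```

`torch.stack` introduces the new leading dimension, so it is the natural inverse of indexing a single expert out of the fused tensor.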