bridge.models.gpt_oss.gpt_oss_bridge#
Module Contents#
Classes#
| Class | Description |
| --- | --- |
| GPTOSSBridge | Megatron Hub Bridge for GPT-OSS models. |
| GPTOSSMLPDownProjMapping | MLPDownProj mapping for expert weights in GPT-OSS models. |
| GPTOSSMLPGateUpProjMapping | MLPGateUpProj mapping for expert weights in GPT-OSS models. |
Functions#
| Function | Description |
| --- | --- |
| _dequantize_mxfp4 | Dequantize MXFP4-packed blocks and scales into a dense tensor. |
API#
- class bridge.models.gpt_oss.gpt_oss_bridge.GPTOSSBridge#
Bases: megatron.bridge.models.conversion.model_bridge.MegatronModelBridge
Megatron Hub Bridge for GPT-OSS models.
As a user, you would not use this bridge directly, but through AutoBridge.
Example:

    from megatron.bridge import AutoBridge

    bridge = AutoBridge.from_hf_pretrained("openai/gpt-oss-model")
    provider = bridge.to_megatron_provider()
Initialization
- provider_bridge(
- hf_pretrained: megatron.bridge.models.hf_pretrained.causal_lm.PreTrainedCausalLM | transformers.GptOssConfig,
- )#
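In normal use this hook is invoked for you by AutoBridge.to_megatron_provider() (see the example above). A direct call might look like the sketch below; it assumes both GPTOSSBridge and transformers.GptOssConfig can be constructed with default arguments, which is not confirmed by this page.

    from transformers import GptOssConfig

    from megatron.bridge.models.gpt_oss.gpt_oss_bridge import GPTOSSBridge

    # Hedged sketch of a direct call; the usual entry point is AutoBridge (see above).
    bridge = GPTOSSBridge()                             # assumes a no-argument constructor
    provider = bridge.provider_bridge(GptOssConfig())   # config-only path allowed by the signature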
- maybe_modify_loaded_hf_weight(
- hf_param: str | dict[str, str],
- hf_state_dict: Mapping[str, torch.Tensor],
- )#
Load weights from the HuggingFace state dict and dequantize them if necessary.
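The "dequantize if necessary" path can be pictured with a short sketch. This is an illustrative rewrite, not the method body: it handles only the `str` form of `hf_param`, and the `*_blocks` / `*_scales` key suffixes for MXFP4 checkpoints are an assumption. It reuses `_dequantize_mxfp4`, which is documented at the end of this page.

    from collections.abc import Mapping

    import torch

    from megatron.bridge.models.gpt_oss.gpt_oss_bridge import _dequantize_mxfp4

    def load_expert_weight_sketch(hf_param: str, hf_state_dict: Mapping[str, torch.Tensor]) -> torch.Tensor:
        """Hypothetical illustration of loading a possibly MXFP4-quantized HF weight."""
        blocks_key, scales_key = f"{hf_param}_blocks", f"{hf_param}_scales"
        if blocks_key in hf_state_dict and scales_key in hf_state_dict:
            # Quantized checkpoint: rebuild a dense bf16 tensor from MXFP4 blocks/scales.
            return _dequantize_mxfp4(hf_state_dict[blocks_key], hf_state_dict[scales_key])
        # Unquantized checkpoint: the tensor is stored directly under hf_param.
        return hf_state_dict[hf_param]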
- maybe_modify_converted_hf_weight(
- task: megatron.bridge.models.conversion.model_bridge.WeightConversionTask,
- converted_weights_dict: Dict[str, torch.Tensor],
- )#
- mapping_registry() -> megatron.bridge.models.conversion.mapping_registry.MegatronMappingRegistry#
Return a MegatronMappingRegistry containing parameter mappings from HF to Megatron format, based on the GPT-OSS importer code.
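For orientation, a mapping registry pairs Megatron parameter name patterns with their HF counterparts. The sketch below is illustrative only: the pattern strings are placeholders rather than the real GPT-OSS mappings, and it assumes MegatronMappingRegistry accepts mapping objects positionally.

    from megatron.bridge.models.conversion.mapping_registry import MegatronMappingRegistry
    from megatron.bridge.models.conversion.param_mapping import AutoMapping
    from megatron.bridge.models.gpt_oss.gpt_oss_bridge import GPTOSSMLPDownProjMapping

    # Illustrative only: placeholder patterns, not the actual GPT-OSS mapping table.
    registry = MegatronMappingRegistry(
        AutoMapping(
            megatron_param="embedding.word_embeddings.weight",  # placeholder Megatron name
            hf_param="model.embed_tokens.weight",               # placeholder HF name
        ),
        GPTOSSMLPDownProjMapping(                                # documented below on this page
            megatron_param="decoder.layers.*.mlp.experts.linear_fc2.weight",  # placeholder pattern
            hf_param="model.layers.*.mlp.experts.down_proj",                  # placeholder pattern
        ),
    )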
- class bridge.models.gpt_oss.gpt_oss_bridge.GPTOSSMLPDownProjMapping(
- megatron_param: str,
- hf_param: str,
- permute_dims: Optional[Tuple[int, ...]] = None,
- )#
Bases: megatron.bridge.models.conversion.param_mapping.AutoMapping
MLPDownProj mapping for expert weights in GPT-OSS models.
Initialization
- hf_to_megatron(
- hf_weights: torch.Tensor,
- megatron_module: torch.nn.Module,
- )#
- megatron_to_hf(
- megatron_weights: torch.Tensor,
- megatron_module: torch.nn.Module,
- )#
- _validate_patterns(*args, **kwargs)#
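The permute_dims argument hints at what this mapping does: the per-expert down-projection is stored with a different axis order on the HF side than Megatron expects, so converting a weight is essentially a permutation of its dimensions. The sketch below illustrates only that layout idea, with made-up shapes and an assumed (0, 2, 1) permutation; it is not the class implementation.

    import torch

    # Made-up shapes: num_experts=4, ffn_hidden=256, hidden=128.
    hf_down_proj = torch.randn(4, 256, 128)

    # The kind of value permute_dims might carry; (0, 2, 1) swaps the last two axes
    # so each expert's (ffn_hidden, hidden) slice becomes (hidden, ffn_hidden).
    permute_dims = (0, 2, 1)
    megatron_layout = hf_down_proj.permute(*permute_dims).contiguous()  # (4, 128, 256)

    # The megatron_to_hf direction applies the inverse permutation; (0, 2, 1) is its own inverse.
    roundtrip = megatron_layout.permute(*permute_dims).contiguous()
    assert torch.equal(roundtrip, hf_down_proj)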
- class bridge.models.gpt_oss.gpt_oss_bridge.GPTOSSMLPGateUpProjMapping(
- megatron_param: str,
- hf_param: str,
- permute_dims: Optional[Tuple[int, ...]] = None,
- )#
Bases: megatron.bridge.models.conversion.param_mapping.AutoMapping
MLPGateUpProj mapping for expert weights in GPT-OSS models.
Initialization
- static _interleave(gate_up_proj)#
- _uninterleave(elem)#
- hf_to_megatron(
- hf_weights: Union[torch.Tensor, Dict],
- megatron_module: torch.nn.Module,
- )#
- megatron_to_hf(
- megatron_weights: torch.Tensor,
- megatron_module: torch.nn.Module,
- )#
- _validate_patterns(*args, **kwargs)#
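The _interleave / _uninterleave helpers exist because the fused gate/up projection is laid out differently on the two sides. The sketch below illustrates the layout difference under one plausible assumption: the HF checkpoint interleaves gate and up along the last dimension (gate at even columns, up at odd), while the Megatron side keeps them concatenated. The helper signatures and exact shapes in the real class may differ.

    import torch

    # Made-up shapes: hidden=8, ffn_hidden=4, so the fused projection has 2*ffn_hidden=8 columns.
    gate = torch.randn(8, 4)
    up = torch.randn(8, 4)

    # Assumed HF layout: gate/up interleaved along the last dim (g0, u0, g1, u1, ...).
    interleaved = torch.stack((gate, up), dim=-1).reshape(8, 8)

    # Assumed Megatron layout: gate then up, concatenated.
    concatenated = torch.cat((gate, up), dim=-1)

    # "Uninterleaving" splits even/odd columns and recovers the concatenated layout.
    recovered_gate, recovered_up = interleaved[..., ::2], interleaved[..., 1::2]
    assert torch.equal(torch.cat((recovered_gate, recovered_up), dim=-1), concatenated)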
- bridge.models.gpt_oss.gpt_oss_bridge._dequantize_mxfp4(
- blocks: torch.Tensor,
- scales: torch.Tensor,
- *,
- dtype: torch.dtype = torch.bfloat16,
- rows_per_chunk: int = 32768 * 1024,
- )#
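In the MXFP4 format, blocks packs two 4-bit e2m1 values per byte and scales carries one shared power-of-two (E8M0) exponent per block; rows_per_chunk bounds how many rows are expanded at a time to limit peak memory. The sketch below is a simplified, non-chunked illustration of that decoding, not the function body: the e2m1 value table, the E8M0 bias of 127, and the low-nibble-first packing order are standard for the format but should be read as assumptions here.

    import torch

    # The 16 values representable by FP4 e2m1 (sign bit, 2-bit exponent, 1-bit mantissa).
    _FP4_VALUES = torch.tensor(
        [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
         -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0]
    )

    def dequantize_mxfp4_sketch(
        blocks: torch.Tensor,   # uint8, last dim packs two FP4 codes per byte
        scales: torch.Tensor,   # uint8 E8M0 exponents, one per block
        dtype: torch.dtype = torch.bfloat16,
    ) -> torch.Tensor:
        """Simplified sketch: expand packed FP4 pairs and apply per-block E8M0 scales."""
        lo = blocks & 0x0F          # assumed low nibble first
        hi = blocks >> 4
        codes = torch.stack((lo, hi), dim=-1).reshape(*blocks.shape[:-1], -1)
        values = _FP4_VALUES.to(dtype)[codes.long()]
        # E8M0 scales are biased exponents: multiply each block by 2 ** (scale - 127).
        scale = torch.exp2(scales.to(torch.float32) - 127).to(dtype)
        return values * scale.unsqueeze(-1)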