bridge.models.stepfun.step37_bridge#
Step3.7 multimodal bridge.
Registers a MegatronModelBridge for the upstream
Step3p7ForConditionalGeneration HF architecture (model_type step37).
The bridge:
- Re-uses Step35Bridge's text-decoder logic for provider_bridge by delegating to a synthetic Step-3.5 HF wrapper (the Step3.7 HF config exposes its Step-3.5 text fields under hf_config.text_config).
- Adds vision-tower configuration (vision_config, image_token_id, projector_bias, understand_projector_stride) on top of the resulting provider.
- Defines an HF↔Megatron parameter mapping registry that prefixes every Step-3.5 text mapping with the language_model. prefix (since Step37Model wraps the text GPTModel) and adds direct vision_model.* AutoMappings for the PE-G/14 trunk + downsamplers, plus a top-level vit_large_projector.weight mapping.
Module Contents#
Classes#
Step37Bridge | Megatron Bridge for Step3.7 (Step-3.5 text + Perception-Encoder G/14 vision).
Functions#
_lm | Prefix a Step-3.5 megatron_param with the language_model. namespace.
Data#
API#
- bridge.models.stepfun.step37_bridge.logger#
‘getLogger(…)’
- bridge.models.stepfun.step37_bridge._LM_PREFIX#
‘language_model.’
- bridge.models.stepfun.step37_bridge._lm(megatron_param: str) → str#
Prefix a Step-3.5 megatron_param with the language_model. namespace.
- class bridge.models.stepfun.step37_bridge.Step37Bridge#
Bases: megatron.bridge.models.conversion.model_bridge.MegatronModelBridge
Megatron Bridge for Step3.7 (Step-3.5 text + Perception-Encoder G/14 vision).
Example:

    from megatron.bridge import AutoBridge

    bridge = AutoBridge.from_hf_pretrained(
        "/path/to/step3p7_flash_bf16", trust_remote_code=True
    )
    provider = bridge.to_megatron_provider()
- CONFIG_MAPPING#
None
- provider_bridge(
- hf_pretrained: megatron.bridge.models.hf_pretrained.causal_lm.PreTrainedCausalLM,
Convert a HuggingFace Step3.7 config into a Step37ModelProvider.
Mirrors the qwen3-vl bridge pattern:
1. Pull the nested text_config directly out of the top-level Step3.7 Step37Config and run the framework helper self.hf_config_to_provider_kwargs(text_config) to populate the common architecture fields (num_layers / hidden_size / num_attention_heads / ffn_hidden_size / vocab_size / rotary_base / etc.) via CONFIG_MAPPING. That helper uses hasattr + getattr(..., None) internally, so fields that are absent on the Step3.7 text config (e.g. anything Step-3.5 carried at the top level of its config.json) are skipped cleanly.
2. Construct Step37ModelProvider directly from the filtered kwargs, instead of delegating to Step35Bridge.provider_bridge via a wrapper. That path was fragile because Step35Bridge does a number of bare hf_config.X reads that crash on missing fields like zero_centered or use_qk_norm.
3. Apply Step-3.5 text-decoder overrides with explicit getattr(text_config, name, default) for every field that may or may not be present in the released Step3.7 text_config.
4. Finally, attach Step3.7 vision / multimodal fields from the top-level hf_config.
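The defensive attribute handling described above can be sketched in plain Python. This is an illustration under stated assumptions, not the bridge's code: the mapping table, the collect_provider_kwargs helper, the default values, and the stand-in config objects are all hypothetical, and the real helper and provider class live in Megatron Bridge.

```python
from types import SimpleNamespace

# Hypothetical subset of a config mapping: (HF field name, provider field name).
EXAMPLE_CONFIG_MAPPING = [
    ("num_hidden_layers", "num_layers"),
    ("hidden_size", "hidden_size"),
    ("num_attention_heads", "num_attention_heads"),
    ("rope_theta", "rotary_base"),
]


def collect_provider_kwargs(config, mapping):
    """Copy only the fields that exist on config; absent ones are skipped."""
    kwargs = {}
    for hf_name, provider_name in mapping:
        value = getattr(config, hf_name, None)  # never raises on a missing field
        if value is not None:
            kwargs[provider_name] = value
    return kwargs


# Stand-in for a Step3.7 HF config; rope_theta is deliberately absent.
hf_config = SimpleNamespace(
    text_config=SimpleNamespace(
        num_hidden_layers=4, hidden_size=64, num_attention_heads=8
    ),
    image_token_id=7,
)

# Step 1: common architecture fields from the nested text config.
kwargs = collect_provider_kwargs(hf_config.text_config, EXAMPLE_CONFIG_MAPPING)

# Step 3: optional text-decoder overrides with explicit (assumed) defaults.
for name, default in (("use_qk_norm", False), ("zero_centered", False)):
    kwargs[name] = getattr(hf_config.text_config, name, default)

# Step 4: multimodal fields come from the top-level config.
kwargs["image_token_id"] = getattr(hf_config, "image_token_id", None)

print(kwargs)
```

In the real bridge, the resulting kwargs would be passed to the provider constructor (step 2); the sketch stops at the dictionary so that it runs without Megatron Bridge installed.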
- mapping_registry() → megatron.bridge.models.conversion.mapping_registry.MegatronMappingRegistry#
Return the full text + vision parameter mapping registry.
Text mappings replicate Step35Bridge.mapping_registry with a language_model. prefix on every Megatron-side path (since Step37Model wraps the Step-3.5 GPTModel under self.language_model). Vision mappings are direct AutoMappings; the Megatron module structure mirrors the HF safetensors layout.
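The composition of the two mapping groups can be illustrated with plain tuples. This is a sketch only: the (megatron_param, hf_param) pairs, the HF-side names, and the build_mapping_table helper are hypothetical, and the real registry is built from Megatron Bridge mapping objects rather than tuples.

```python
_LM_PREFIX = "language_model."

# Hypothetical (megatron_param, hf_param) pairs standing in for Step-3.5 text mappings.
TEXT_MAPPINGS = [
    ("embedding.word_embeddings.weight", "model.embed_tokens.weight"),
    ("decoder.final_layernorm.weight", "model.norm.weight"),
]

# Hypothetical vision pairs: identical paths on both sides, since the
# Megatron module structure is said to mirror the HF safetensors layout.
VISION_MAPPINGS = [
    ("vision_model.*", "vision_model.*"),
    ("vit_large_projector.weight", "vit_large_projector.weight"),
]


def build_mapping_table(text_mappings, vision_mappings):
    """Prefix Megatron-side text paths, then append vision pairs unchanged."""
    prefixed = [(_LM_PREFIX + megatron, hf) for megatron, hf in text_mappings]
    return prefixed + list(vision_mappings)


table = build_mapping_table(TEXT_MAPPINGS, VISION_MAPPINGS)
for megatron_param, hf_param in table:
    print(f"{megatron_param} <- {hf_param}")
```

Only the Megatron side of each text pair is rewritten; the HF-side names stay as they are, because the prefix reflects how the Megatron model nests the text decoder, not how the checkpoint is laid out.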
- bridge.models.stepfun.step37_bridge.__all__#
[‘Step37Bridge’]