bridge.models.stepfun.modelling_step37.transformer_config#
Step3.7 transformer config and vision-config helper.
Mirrors qwen_vl/modelling_qwen3_vl/transformer_config.py: the text-side
config is the standard Megatron TransformerConfig already used by Step-3.5,
extended with vision-tower fields. The HF StepRoboticsVisionEncoderConfig
is passed straight through to the Megatron vision module — no separate
Megatron-side TransformerConfig is constructed for the vision tower, since
the PE-G/14 trunk does not use any Megatron tensor-parallel primitives.
Module Contents#
Classes#
Step37TransformerConfig | Step3.7 transformer config.
Functions#
get_vision_model_config | Return the HF vision config unchanged.
API#
- class bridge.models.stepfun.modelling_step37.transformer_config.Step37TransformerConfig#
Bases: megatron.core.transformer.transformer_config.TransformerConfig
Step3.7 transformer config.
Extends the Step-3.5 text-decoder TransformerConfig with the multimodal fields that Step37Model reads at construction time. All Step-3.5 per-layer fields (layer_types, rotary_percents, rotary_base_per_layer, swiglu_limits, swiglu_limits_shared, attention_other_setting, sliding_attention_setting, head_wise_attn_gate) are inherited from the Step-3.5 model provider; this class only adds the vision-side fields.
- vision_config: Any#
None
- image_token_id: int#
128001
- understand_projector_stride: int#
2
- projector_bias: bool#
False
- language_max_sequence_length: int#
262144
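The fields and defaults listed above can be sketched as a plain dataclass. This is only an illustration: TransformerConfigStandIn is a hypothetical stand-in for Megatron's TransformerConfig (which needs Megatron installed and has many required fields), Step37TransformerConfigSketch is not the real class, and the vision-config object is a placeholder rather than a real StepRoboticsVisionEncoderConfig.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass
class TransformerConfigStandIn:
    """Hypothetical stand-in for Megatron's TransformerConfig (not the real class)."""

    num_layers: int = 1
    hidden_size: int = 8


@dataclass
class Step37TransformerConfigSketch(TransformerConfigStandIn):
    """Illustrative copy of the vision-side fields documented above."""

    vision_config: Any = None                    # HF vision config, passed through as-is
    image_token_id: int = 128001                 # documented default
    understand_projector_stride: int = 2         # documented default
    projector_bias: bool = False                 # documented default
    language_max_sequence_length: int = 262144   # documented default


# Placeholder object standing in for the HF vision config (assumption).
placeholder_vision_cfg = SimpleNamespace(name="placeholder-vision-config")

cfg = Step37TransformerConfigSketch(vision_config=placeholder_vision_cfg)
print(cfg.image_token_id, cfg.understand_projector_stride, cfg.projector_bias)
```

Only vision_config has to be supplied by the caller in this sketch; the remaining vision-side fields fall back to the documented defaults.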
- bridge.models.stepfun.modelling_step37.transformer_config.get_vision_model_config(vision_cfg: Any) → Any#
Return the HF vision config unchanged.
Step37VisionModel consumes the HF StepRoboticsVisionEncoderConfig directly (it never uses Megatron tensor-parallel primitives), so this function is just a structural mirror of qwen_vl/modelling_qwen3_vl/transformer_config.get_vision_model_config for parity with the Qwen3-VL package shape. It is intentionally a no-op.
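A no-op helper of this kind amounts to an identity function. The sketch below assumes nothing beyond the description above: get_vision_model_config_sketch is a hypothetical name used to avoid implying it is the packaged implementation, and the config object is a placeholder, not a real StepRoboticsVisionEncoderConfig.

```python
from types import SimpleNamespace
from typing import Any


def get_vision_model_config_sketch(vision_cfg: Any) -> Any:
    """Return the vision config unchanged (illustrative identity helper)."""
    return vision_cfg


# Placeholder standing in for the HF vision config (assumption).
hf_vision_cfg = SimpleNamespace(name="placeholder-vision-config")

returned_cfg = get_vision_model_config_sketch(hf_vision_cfg)
print(returned_cfg is hf_vision_cfg)
```

Because the same object is returned rather than a copy, any later mutation of the HF config is visible to whatever consumes the returned value.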