bridge.models.stepfun.configuration_step37#
Step3.7 HF PretrainedConfig surrogate.
Mirrors configuration_step35.py. The real Step37Config ships with the
upstream checkpoint at
stepfun-ai/step3p7_flash_bf16/configuration_step3p7.py and is loaded via
trust_remote_code=True at inference time. This file exists so that the
Megatron-Bridge package is self-describing: the Step37Config,
Step37TextConfig, and Step37VisionConfig classes defined here surface the
same fields that Step37Bridge.provider_bridge reads, without requiring the
remote-code shim to be on sys.path.
When the upstream config ships on HF, Step37Bridge can be retargeted at
the upstream class; until then the Auto* classes pick the right config via
auto_map in the checkpoint’s config.json.
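The auto_map lookup described above can be pictured with a short sketch. The config.json fragment is hypothetical: the `auto_map` key and its `module.ClassName` reference format follow the usual Hugging Face convention, and the file name comes from the path given above, but the class name `Step3p7Config` is an assumption made for illustration. The helper `split_auto_map_ref` is likewise invented here and is not part of any library.

```python
import json

# Hypothetical config.json fragment. The class name "Step3p7Config" is an
# assumption; only the module file name appears in the text above.
CONFIG_JSON = """
{
  "model_type": "step3p7",
  "architectures": ["Step3p7ForConditionalGeneration"],
  "auto_map": {
    "AutoConfig": "configuration_step3p7.Step3p7Config"
  }
}
"""


def split_auto_map_ref(ref: str) -> tuple[str, str]:
    """Split a 'module.ClassName' reference into (module, class name)."""
    module_name, _, class_name = ref.rpartition(".")
    return module_name, class_name


config = json.loads(CONFIG_JSON)
module_name, class_name = split_auto_map_ref(config["auto_map"]["AutoConfig"])
print(f"AutoConfig would import {class_name} from {module_name}.py")
```

With trust_remote_code=True, the Auto* machinery performs an import of this kind against the file shipped next to the checkpoint, which is why the surrogate classes in this module are not needed on that path.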
Module Contents#
Classes#
Step37VisionConfig | HF-style config for the PE-G/14 vision tower used by Step3.7. |
Step37TextConfig | HF-style text-decoder config for Step3.7. |
Step37Config | Top-level HF-style config for Step3.7 (the multimodal wrapper). |
Data#
API#
- class bridge.models.stepfun.configuration_step37.Step37VisionConfig(
- width: int = 1536,
- layers: int = 47,
- heads: int = 16,
- num_channels: int = 3,
- image_size: int = 728,
- mlp_ratio: float = 8960 / 1536,
- patch_size: int = 14,
- hidden_act: str = 'quick_gelu',
- layer_norm_eps: float = 1e-05,
- use_cls_token: bool = False,
- use_ln_pre: bool = True,
- use_ln_post: bool = False,
- use_abs_posemb: bool = True,
- use_rope2d: bool = True,
- ls_init_value: float = 0.1,
- **kwargs,
)
Bases: transformers.configuration_utils.PretrainedConfig
HF-style config for the PE-G/14 vision tower used by Step3.7.
Initialization
- model_type#
'perception_encoder'
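To make the defaults above concrete, here is a minimal sketch of the sizes they imply. It assumes the usual ViT conventions (non-overlapping square patches, MLP hidden size equal to width times mlp_ratio, head dimension equal to width divided by heads); this page does not state how the encoder consumes these fields. `VisionDefaults` is a stand-in dataclass written for this example, not the real Step37VisionConfig.

```python
from dataclasses import dataclass


@dataclass
class VisionDefaults:
    """Stand-in carrying a few Step37VisionConfig defaults (not the real class)."""

    width: int = 1536
    heads: int = 16
    image_size: int = 728
    patch_size: int = 14
    mlp_ratio: float = 8960 / 1536


cfg = VisionDefaults()

# Assumed ViT-style derivations; none of these are documented on this page.
patches_per_side = cfg.image_size // cfg.patch_size  # 728 // 14
num_patches = patches_per_side**2
head_dim = cfg.width // cfg.heads
mlp_hidden = round(cfg.width * cfg.mlp_ratio)  # round() guards against float error

print(patches_per_side, num_patches, head_dim, mlp_hidden)
```

Under these assumptions the defaults give a 52 x 52 patch grid (2704 patches), a head dimension of 96, and an MLP hidden size of 8960, which explains why mlp_ratio is written as the fraction 8960 / 1536 rather than as a rounded decimal.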
- class bridge.models.stepfun.configuration_step37.Step37TextConfig#
Bases: megatron.bridge.models.stepfun.configuration_step35.Step35Config
HF-style text-decoder config for Step3.7.
Identical schema to Step35Config: Step3.7’s text backbone is Step-3.5. Keeping a distinct subclass makes future divergence trivial.
- model_type#
'step3p5'
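The "distinct subclass" idea can be sketched with plain Python stand-ins. Both classes below are hypothetical: the real Step35Config schema is not shown on this page, and `hidden_size` is an illustrative field, not a documented default.

```python
class Step35ConfigStandIn:
    """Hypothetical stand-in for Step35Config; real fields are not shown here."""

    model_type = "step3p5"

    def __init__(self, hidden_size: int = 4096, **kwargs):
        # hidden_size is an illustrative field, not a documented default.
        self.hidden_size = hidden_size
        self.extra = kwargs


class Step37TextConfigStandIn(Step35ConfigStandIn):
    """Same schema as the parent; exists only as a distinct type."""

    model_type = "step3p5"


text_cfg = Step37TextConfigStandIn(hidden_size=1024)

# The subclass is accepted wherever the parent is expected, yet code can
# still tell the two apart by type if Step3.7 later adds its own fields.
print(isinstance(text_cfg, Step35ConfigStandIn), type(text_cfg).__name__)
```

Because the subclass adds nothing today, a later Step3.7-specific field would only require a change inside the subclass body, leaving callers that type against the parent untouched.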
- class bridge.models.stepfun.configuration_step37.Step37Config(
- vision_config: Optional[Union[dict, bridge.models.stepfun.configuration_step37.Step37VisionConfig]] = None,
- text_config: Optional[Union[dict, bridge.models.stepfun.configuration_step37.Step37TextConfig]] = None,
- understand_projector_stride: int = 2,
- projector_bias: bool = False,
- image_token_id: int = 128001,
- **kwargs: Any,
)
Bases: transformers.configuration_utils.PretrainedConfig
Top-level HF-style config for Step3.7 (the multimodal wrapper).
Initialization
- model_type#
'step3p7'
- architectures#
['Step3p7ForConditionalGeneration']
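The signature above accepts each sub-config as either a dict or a config instance, with None as the default. A common way for composite HF-style configs to handle this is to normalize all three cases to an instance in the constructor. The sketch below shows that pattern with stand-in classes; it is an assumption about the approach, not the module's actual code, and `coerce`, `VisionStandIn`, `TextStandIn`, and `CompositeStandIn` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Optional, Type, TypeVar, Union

T = TypeVar("T")


@dataclass
class VisionStandIn:
    """Hypothetical stand-in for Step37VisionConfig (two fields only)."""

    image_size: int = 728
    patch_size: int = 14


@dataclass
class TextStandIn:
    """Hypothetical stand-in for Step37TextConfig."""

    model_type: str = "step3p5"


def coerce(value: Optional[Union[dict, T]], cls: Type[T]) -> T:
    """Return an instance of cls from None, a dict of fields, or an instance."""
    if value is None:
        return cls()  # fall back to the class defaults
    if isinstance(value, dict):
        return cls(**value)  # e.g. a sub-dict parsed from config.json
    return value  # already an instance; pass it through unchanged


class CompositeStandIn:
    """Stand-in for the top-level wrapper, using the documented defaults."""

    model_type = "step3p7"

    def __init__(
        self,
        vision_config: Optional[Union[dict, VisionStandIn]] = None,
        text_config: Optional[Union[dict, TextStandIn]] = None,
        understand_projector_stride: int = 2,
        projector_bias: bool = False,
        image_token_id: int = 128001,
        **kwargs: Any,
    ):
        self.vision_config = coerce(vision_config, VisionStandIn)
        self.text_config = coerce(text_config, TextStandIn)
        self.understand_projector_stride = understand_projector_stride
        self.projector_bias = projector_bias
        self.image_token_id = image_token_id


from_dict = CompositeStandIn(vision_config={"image_size": 364})
from_obj = CompositeStandIn(vision_config=VisionStandIn(patch_size=16))
defaults = CompositeStandIn()
print(from_dict.vision_config, from_obj.vision_config, defaults.text_config)
```

Normalizing in the constructor means downstream code can always read attributes such as `config.vision_config.image_size` without checking whether the value arrived as a dict from config.json or as an object built in Python.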
- bridge.models.stepfun.configuration_step37.__all__#
['Step37Config', 'Step37TextConfig', 'Step37VisionConfig']