`bridge.models.stepfun.configuration_step37`

Step3.7 HF PretrainedConfig surrogate.

Mirrors `configuration_step35.py`. The real `Step37Config` ships with the upstream checkpoint at stepfun-ai/step3p7_flash_bf16/configuration_step3p7.py and is loaded via `trust_remote_code=True` at inference time. This file exists so that the Megatron-Bridge package is self-describing: the `Step37Config`, `Step37TextConfig`, and `Step37VisionConfig` classes defined here expose the same fields that `Step37Bridge.provider_bridge` reads, without requiring the remote-code shim to be on `sys.path`.

Once the upstream config ships on HF, `Step37Bridge` can be retargeted at the upstream class. Until then, the Auto* classes select the right config through the `auto_map` entry in the checkpoint’s config.json.
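To make the `auto_map` lookup concrete, here is a minimal sketch using only the standard library. It assumes the checkpoint's config.json carries an entry of the usual Hugging Face shape, `"module.ClassName"`, optionally prefixed with `"repo_id--"`. The sample JSON and the helper `resolve_auto_map` are hypothetical illustrations, not the contents of the real checkpoint or part of the bridge.

```python
import json

# Hypothetical config.json fragment; the real checkpoint's file may differ.
SAMPLE_CONFIG_JSON = """
{
  "auto_map": {
    "AutoConfig": "configuration_step3p7.Step37Config"
  }
}
"""


def resolve_auto_map(config_text: str, auto_class: str = "AutoConfig") -> tuple[str, str]:
    """Return (module_name, class_name) for an auto_map entry.

    Illustrative helper only: it mimics how an entry is split into the
    remote-code module and the class to import from it.
    """
    config = json.loads(config_text)
    target = config.get("auto_map", {}).get(auto_class)
    if target is None:
        raise KeyError(f"no auto_map entry for {auto_class!r}")
    # Drop an optional "repo_id--" prefix, then split module from class.
    _, _, reference = target.rpartition("--")
    module_name, _, class_name = reference.rpartition(".")
    return module_name, class_name


module_name, class_name = resolve_auto_map(SAMPLE_CONFIG_JSON)
print(module_name, class_name)
```

With an entry like this, the config class is imported from the module file that sits next to config.json in the checkpoint repository, which is why loading needs `trust_remote_code=True`.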

Module Contents

Classes

`Step37VisionConfig`	HF-style config for the PE-G/14 vision tower used by Step3.7.
`Step37TextConfig`	HF-style text-decoder config for Step3.7.
`Step37Config`	Top-level HF-style config for Step3.7 (the multimodal wrapper).

Data

__all__

API

class bridge.models.stepfun.configuration_step37.Step37VisionConfig(
    width: int = 1536,
    layers: int = 47,
    heads: int = 16,
    num_channels: int = 3,
    image_size: int = 728,
    mlp_ratio: float = 8960 / 1536,
    patch_size: int = 14,
    hidden_act: str = 'quick_gelu',
    layer_norm_eps: float = 1e-05,
    use_cls_token: bool = False,
    use_ln_pre: bool = True,
    use_ln_post: bool = False,
    use_abs_posemb: bool = True,
    use_rope2d: bool = True,
    ls_init_value: float = 0.1,
    **kwargs,
)

Bases: transformers.configuration_utils.PretrainedConfig

HF-style config for the PE-G/14 vision tower used by Step3.7.

Initialization

model_type: 'perception_encoder'
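The defaults above imply a few tensor shapes that are handy for sanity checks. The sketch below derives them, assuming the tower patchifies the image into non-overlapping `patch_size` x `patch_size` patches as a standard ViT does. `VisionDefaults` and `derived_shapes` are hypothetical stand-ins that copy a subset of the documented defaults; they are not part of the bridge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionDefaults:
    """Stand-in holding a subset of the documented Step37VisionConfig defaults."""

    width: int = 1536
    heads: int = 16
    image_size: int = 728
    patch_size: int = 14
    mlp_ratio: float = 8960 / 1536
    use_cls_token: bool = False


def derived_shapes(cfg: VisionDefaults) -> dict[str, int]:
    """Derive patch-grid and per-layer sizes (assumes standard ViT patchification)."""
    if cfg.image_size % cfg.patch_size:
        raise ValueError("image_size must be divisible by patch_size")
    if cfg.width % cfg.heads:
        raise ValueError("width must be divisible by heads")
    grid = cfg.image_size // cfg.patch_size
    return {
        "grid": grid,                                      # patches per side
        "tokens": grid * grid + int(cfg.use_cls_token),    # sequence length
        "head_dim": cfg.width // cfg.heads,                # per-head width
        "mlp_hidden": round(cfg.width * cfg.mlp_ratio),    # MLP inner width
    }


shapes = derived_shapes(VisionDefaults())
print(shapes)
```

Writing `mlp_ratio` as `8960 / 1536` rather than a rounded decimal keeps the intended MLP width recoverable as an integer, which is why the sketch rounds the product instead of truncating it.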

class bridge.models.stepfun.configuration_step37.Step37TextConfig

Bases: megatron.bridge.models.stepfun.configuration_step35.Step35Config

HF-style text-decoder config for Step3.7.

Identical schema to `Step35Config`, because Step3.7’s text backbone is Step-3.5. Keeping a distinct subclass lets the Step3.7 schema diverge later without touching `Step35Config`.

model_type: 'step3p5'

class bridge.models.stepfun.configuration_step37.Step37Config(
    vision_config: Optional[Union[dict, bridge.models.stepfun.configuration_step37.Step37VisionConfig]] = None,
    text_config: Optional[Union[dict, bridge.models.stepfun.configuration_step37.Step37TextConfig]] = None,
    understand_projector_stride: int = 2,
    projector_bias: bool = False,
    image_token_id: int = 128001,
    **kwargs: Any,
)
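The `Optional[Union[dict, ...]]` annotations on `vision_config` and `text_config` suggest the pattern common to HF composite configs: accept `None`, a plain dict, or an already-built sub-config, and normalize to an object. The sketch below illustrates that pattern with plain stand-in classes. Whether `Step37Config` normalizes its arguments exactly this way is an assumption, and every class name here is hypothetical.

```python
from typing import Any, Optional, Type, TypeVar, Union


class SubConfigStandIn:
    """Hypothetical stand-in for a sub-config; it just stores its fields."""

    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)


class VisionStandIn(SubConfigStandIn):
    pass


class TextStandIn(SubConfigStandIn):
    pass


T = TypeVar("T", bound=SubConfigStandIn)


def coerce(value: Optional[Union[dict, T]], cls: Type[T]) -> T:
    """Normalize None / dict / instance into an instance of cls."""
    if value is None:
        return cls()            # fall back to the sub-config's defaults
    if isinstance(value, dict):
        return cls(**value)     # e.g. a section parsed from config.json
    if isinstance(value, cls):
        return value            # already constructed, keep as-is
    raise TypeError(f"expected dict or {cls.__name__}, got {type(value).__name__}")


class CompositeStandIn:
    """Mirrors the documented Step37Config signature (assumed behaviour)."""

    def __init__(
        self,
        vision_config: Optional[Union[dict, VisionStandIn]] = None,
        text_config: Optional[Union[dict, TextStandIn]] = None,
        understand_projector_stride: int = 2,
        projector_bias: bool = False,
        image_token_id: int = 128001,
        **kwargs: Any,
    ) -> None:
        self.vision_config = coerce(vision_config, VisionStandIn)
        self.text_config = coerce(text_config, TextStandIn)
        self.understand_projector_stride = understand_projector_stride
        self.projector_bias = projector_bias
        self.image_token_id = image_token_id
        self.extra = kwargs     # unrecognized keys are kept, not dropped


cfg = CompositeStandIn(vision_config={"image_size": 728})
print(type(cfg.vision_config).__name__, cfg.vision_config.fields)
```

Accepting dicts matters because a config deserialized from JSON arrives with its nested sections as plain dictionaries, while code that builds a config programmatically usually passes objects.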