bridge.models.qwen_vl.modelling_qwen3_vl.transformer_config#
Module Contents#
Classes#
Qwen3VLTransformerConfig | Configuration for Qwen3-VL transformer with vision and language components.
Functions#
get_vision_model_config | Get the vision model config for Qwen3VL vision model.
API#
- class bridge.models.qwen_vl.modelling_qwen3_vl.transformer_config.Qwen3VLTransformerConfig#
Bases: megatron.core.transformer.transformer_config.TransformerConfig
Configuration for Qwen3-VL transformer with vision and language components.
- vocab_size: int#
64000
- language_max_sequence_length: int#
4096
- patch_size: int#
16
- temporal_patch_size: int#
2
- in_channels: int#
3
- spatial_merge_size: int#
2
- num_position_embeddings: int#
2304
4096
- apply_rotary_pos_emb_in_fp32: bool#
False
- deepstack_visual_indexes: List[int]#
‘field(…)’
- fp16_lm_cross_entropy: bool#
False
False
- rotary_percent: float#
1.0
- rotary_base: float#
10000
- mrope_section: List[int]#
‘field(…)’
- apply_rope_fusion: bool#
False
- image_token_id: int#
151655
- video_token_id: int#
151656
- vision_start_token_id: int#
151652
- hf_text_config: Optional[transformers.models.qwen3_vl.configuration_qwen3_vl.Qwen3VLTextConfig]#
None
- vision_dp_when_cp: bool#
False
- use_hf_vision_model: bool#
False
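The defaults listed above can be related to each other with a little arithmetic. The following is a minimal sketch, not the library's implementation: `VisionTextDefaults` is a hypothetical stand-in that copies a few of the documented default values so the example runs without Megatron installed, and the helper functions are illustrative names. It assumes the usual ViT-style patching (non-overlapping `patch_size` x `patch_size` patches, then a `spatial_merge_size` x `spatial_merge_size` merge) and that `image_token_id` and `video_token_id` mark placeholder positions in the token sequence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VisionTextDefaults:
    """Hypothetical stand-in, NOT the real Qwen3VLTransformerConfig.

    Only copies default values documented above.
    """

    patch_size: int = 16
    temporal_patch_size: int = 2
    in_channels: int = 3
    spatial_merge_size: int = 2
    image_token_id: int = 151655
    video_token_id: int = 151656
    vision_start_token_id: int = 151652


def merged_token_count(cfg: VisionTextDefaults, height: int, width: int) -> int:
    """Tokens left after patching and spatial merging (assumed arithmetic)."""
    block = cfg.patch_size * cfg.spatial_merge_size
    if height % block or width % block:
        raise ValueError(f"height and width must be multiples of {block}")
    grid_h = height // cfg.patch_size
    grid_w = width // cfg.patch_size
    return (grid_h * grid_w) // (cfg.spatial_merge_size ** 2)


def patch_vector_length(cfg: VisionTextDefaults) -> int:
    """Length of one flattened patch: channels x temporal x height x width."""
    return cfg.in_channels * cfg.temporal_patch_size * cfg.patch_size ** 2


def placeholder_positions(cfg: VisionTextDefaults, input_ids: List[int]) -> List[int]:
    """Indices of tokens equal to the image or video placeholder ID."""
    wanted = (cfg.image_token_id, cfg.video_token_id)
    return [i for i, tok in enumerate(input_ids) if tok in wanted]


cfg = VisionTextDefaults()
print(merged_token_count(cfg, 224, 224))  # 14 x 14 patches merged 2 x 2 -> 49
print(patch_vector_length(cfg))           # 3 * 2 * 16 * 16 -> 1536
print(placeholder_positions(cfg, [1, 151652, 151655, 151655, 2]))  # [2, 3]
```

In real use the values would come from a `Qwen3VLTransformerConfig` instance rather than from this stand-in; whether the library computes these quantities in exactly this way is not confirmed by this page.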
- bridge.models.qwen_vl.modelling_qwen3_vl.transformer_config.get_vision_model_config(hf_config, megatron_config=None)#
Get the vision model config for Qwen3VL vision model.