bridge.models.kimi_vl.kimi_k25_vl_provider#

Module Contents#

Classes#

KimiK25VLModelProvider

Model provider for Kimi K2.5 VL (Vision-Language) Models.

API#

class bridge.models.kimi_vl.kimi_k25_vl_provider.KimiK25VLModelProvider#

Bases: megatron.bridge.models.mla_provider.MLAModelProvider

Model provider for Kimi K2.5 VL (Vision-Language) Models.

Inherits language model configuration from MLAModelProvider. Core architecture parameters (num_layers, hidden_size, MoE config, MLA config, etc.) are populated by provider_bridge in the bridge class via hf_config_to_provider_kwargs.

Only VLM-specific fields (vision config, token IDs, freeze options, etc.) that are NOT part of the language model are defined here.

The vision component (MoonViT3d + PatchMergerMLP) is dynamically loaded from the HuggingFace model repository at runtime via trust_remote_code.
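As a rough illustration of the VLM-specific field layout, the defaults documented below can be mirrored in a plain dataclass. This is a minimal sketch, not the real class: the actual `KimiK25VLModelProvider` inherits the full MLA language-model configuration from `MLAModelProvider`, which is omitted here.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical stand-in mirroring the documented VLM-specific defaults.
# The real provider additionally carries the inherited language-model config.
@dataclass
class KimiK25VLFieldsSketch:
    scatter_embedding_sequence_parallel: bool = False
    vision_config: Any = None
    hf_model_path: Optional[str] = None
    bos_token_id: int = 163584
    eos_token_id: int = 163585
    image_token_id: int = 163605
    media_placeholder_token_id: int = 163605  # same ID as image_token_id
    pad_token_id: int = 163839
    ignore_index: Optional[int] = None
    freeze_language_model: bool = False
    freeze_vision_model: bool = False
    freeze_vision_projection: bool = False
    generation_config: Optional[Any] = None

fields = KimiK25VLFieldsSketch()
```

Note that `image_token_id` and `media_placeholder_token_id` share the same default (163605), so by default the image token doubles as the media placeholder.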

scatter_embedding_sequence_parallel: bool#

False

vision_config: Any#

None

hf_model_path: Optional[str]#

None

bos_token_id: int#

163584

eos_token_id: int#

163585

image_token_id: int#

163605

media_placeholder_token_id: int#

163605

pad_token_id: int#

163839

ignore_index: Optional[int]#

None

freeze_language_model: bool#

False

freeze_vision_model: bool#

False

freeze_vision_projection: bool#

False

generation_config: Any | None#

None
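The three `freeze_*` flags select which sub-modules stay trainable during fine-tuning. A minimal sketch of how such flags are typically applied (a hypothetical helper, not the provider's actual code; in practice this toggles `requires_grad` on the corresponding `torch` parameters):

```python
# Hypothetical sketch: map freeze flags to the set of trainable module names.
# Real code would iterate the modules and set requires_grad accordingly.
def apply_freeze_flags(modules, freeze_language_model=False,
                       freeze_vision_model=False,
                       freeze_vision_projection=False):
    frozen = {
        "language_model": freeze_language_model,
        "vision_model": freeze_vision_model,
        "vision_projection": freeze_vision_projection,
    }
    return {name for name in modules if not frozen.get(name, False)}

modules = ["language_model", "vision_model", "vision_projection"]
trainable = apply_freeze_flags(modules, freeze_vision_model=True)
```

With only `freeze_vision_model=True`, the language model and the vision projection remain trainable, which is a common setup for adapting the language side while keeping a pretrained vision encoder fixed.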

provide(pre_process=None, post_process=None, vp_stage=None)#

Provide a KimiK25VL model instance with vision and language components.

provide_language_model(pre_process=None, post_process=None, vp_stage=None) → megatron.core.models.gpt.GPTModel#

Provide just the language model component (MoE with MLA) without vision.