Kimi-K2.5-VL#
Kimi-K2.5-VL is a vision-language model from Moonshot AI. It pairs a vision encoder with a language backbone in the style of Kimi K2, which combines Mixture-of-Experts (MoE) layers with Multi-head Latent Attention (MLA). Megatron Bridge supports it through a dedicated VLM bridge and provider.
Supported Variants#
Kimi-K2.5: https://huggingface.co/moonshotai/Kimi-K2.5
Architecture Notes#
Uses KimiK25ForConditionalGeneration on the Hugging Face side and KimiK25VLModel on the Megatron side.
The language backbone shares Kimi K2’s MoE + MLA structure.
The bridge carries top-level multimodal fields such as the media placeholder token ID into the provider.
INT4-packed expert weights are dequantized during import and re-quantized for compatible Hugging Face export.
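To make the INT4 note above concrete, here is a minimal NumPy sketch of a pack, unpack, dequantize, and re-quantize round trip. It is an illustration only, not the Megatron Bridge implementation: the function names, the symmetric per-group scaling, the group size of 32, and the low-nibble-first byte layout are all assumptions chosen for the example, and the actual checkpoint format may differ.

```python
import numpy as np


def quantize_int4(w, group_size=32):
    """Symmetric per-group quantization (assumed scheme) to integers in [-7, 7]."""
    rows, cols = w.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    groups = w.reshape(rows, cols // group_size, group_size)
    max_abs = np.abs(groups).max(axis=-1, keepdims=True)
    # One scale per group; an all-zero group gets scale 1 to avoid dividing by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 7.0).astype(np.float32)
    q = np.clip(np.rint(groups / scale), -8, 7).astype(np.int8)
    return q.reshape(rows, cols), scale.squeeze(-1)


def pack_int4(q):
    """Pack pairs of signed 4-bit values into one uint8, low nibble first (assumed layout)."""
    nibbles = (q.astype(np.int16) & 0xF).astype(np.uint8)  # two's-complement nibble
    return nibbles[:, 0::2] | (nibbles[:, 1::2] << 4)


def unpack_int4(packed):
    """Inverse of pack_int4: split each byte into two signed 4-bit values."""
    q = np.empty((packed.shape[0], packed.shape[1] * 2), dtype=np.int8)
    q[:, 0::2] = (packed & 0xF).astype(np.int8)
    q[:, 1::2] = (packed >> 4).astype(np.int8)
    # Sign-extend: nibble values 8..15 represent -8..-1.
    return np.where(q >= 8, q - 16, q).astype(np.int8)


def dequantize_int4(q, scale, group_size=32):
    """Expand integer codes back to float32 using the per-group scales."""
    rows, cols = q.shape
    groups = q.reshape(rows, cols // group_size, group_size).astype(np.float32)
    return (groups * scale[..., None]).reshape(rows, cols)


# Round trip on a small random "expert weight" matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 64)).astype(np.float32)

q, scale = quantize_int4(w)
packed = pack_int4(q)                      # half as many bytes as int8 codes
restored = dequantize_int4(unpack_int4(packed), scale)

# Re-quantizing the dequantized weights (the export direction).
q_again, scale_again = quantize_int4(restored)
```

Under these assumptions, packing and unpacking is lossless on the integer codes, and the only approximation comes from the initial rounding to 4-bit values, which is bounded by half a scale step per element.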
Examples#
For conversion, inference, SFT, PEFT, Slurm launch scripts, and dataset notes, see the Kimi-K2.5-VL examples README.