Kimi-K2.5-VL

Kimi-K2.5-VL is a vision-language model from Moonshot AI. It pairs a vision encoder with a Kimi K2-style language backbone that combines Mixture-of-Experts (MoE) layers with Multi-head Latent Attention (MLA). Megatron Bridge supports it through a dedicated VLM bridge and provider.

Supported Variants

  • Kimi-K2.5: https://huggingface.co/moonshotai/Kimi-K2.5

Architecture Notes

  • Uses KimiK25ForConditionalGeneration on the Hugging Face side and KimiK25VLModel on the Megatron side.

  • The language backbone shares Kimi K2’s MoE + MLA structure.

  • The bridge passes top-level multimodal fields, such as the media placeholder token ID, through to the provider.

  • INT4-packed expert weights are dequantized during import and re-quantized for compatible Hugging Face export.
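
To make the dequantize-on-import, re-quantize-on-export round trip concrete, here is a minimal sketch of symmetric group-wise INT4 packing in pure Python. It is an illustration only, not Megatron Bridge's implementation: the function names `quantize_int4` and `dequantize_int4`, the group size, the nibble order, and the per-group scale format are all assumptions, and the layout of the real Kimi-K2.5 checkpoint may differ.

```python
from typing import List, Tuple


def quantize_int4(values: List[float], group_size: int = 4) -> Tuple[bytes, List[float]]:
    """Hypothetical helper: symmetric per-group INT4 quantization.

    Each group gets one float scale; each value becomes a signed 4-bit code
    in [-8, 7], stored as an unsigned nibble, two nibbles per byte.
    """
    if len(values) % group_size or len(values) % 2:
        raise ValueError("length must be even and a multiple of group_size")
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude in the group to code 7 (assumed scheme).
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            q = max(-8, min(7, round(v / scale)))
            codes.append(q + 8)  # shift signed code to an unsigned nibble 0..15
    # Low nibble holds the even-indexed code, high nibble the odd-indexed one.
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))
    return packed, scales


def dequantize_int4(packed: bytes, scales: List[float], group_size: int = 4) -> List[float]:
    """Hypothetical helper: unpack nibbles and rescale to floats."""
    codes: List[int] = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble first, matching quantize_int4
        codes.append(byte >> 4)
    return [(code - 8) * scales[i // group_size] for i, code in enumerate(codes)]
```

Calling `quantize_int4` on a list of weights returns half as many bytes as there are values plus one scale per group. Passing that result to `dequantize_int4` recovers each weight to within half a scale step, and quantizing the dequantized values again yields the same packed bytes. That stability is what allows a dequantize-then-re-quantize cycle to export a checkpoint in the original packed form, assuming the weights are not modified in between.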

Examples

For conversion, inference, SFT, PEFT, Slurm launch scripts, and dataset notes, see the Kimi-K2.5-VL examples README.