bridge.recipes.glm_vl.glm_45v#

GLM-4.5V fine-tuning recipes with a parameterless API.

This module provides SFT and PEFT configurations for GLM-4.5V (106B MoE).

Module Contents#

Functions#

set_glm_45v_pipeline_model_parallel_layout

Set the GLM-4.5V pipeline model parallel layout.

glm_45v_sft_config

Return a full SFT config for GLM-4.5V (106B MoE).

glm_45v_peft_config

Return a PEFT config for GLM-4.5V (106B MoE).

API#

bridge.recipes.glm_vl.glm_45v.set_glm_45v_pipeline_model_parallel_layout(
model_cfg: megatron.bridge.models.gpt_provider.GPTModelProvider,
layout: Optional[Union[str, List[List[str]]]] = None,
is_peft: bool = False,
) → None#

Set the GLM-4.5V pipeline model parallel layout.

GLM-4.5V (based on GLM-4.5 Air) has 46 decoder layers and no MTP layers. This function sets up predefined layouts for common PP/VP combinations.

Parameters:
  • model_cfg – The model provider configuration to modify.

  • layout – Optional custom layout. If None, uses predefined layouts based on PP/VP sizes.

  • is_peft – Whether the model is trained with PEFT.
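To illustrate what a pipeline layout encodes, here is a minimal sketch that evenly partitions GLM-4.5V's 46 decoder layers across PP=8 stages. This is a hypothetical helper for illustration only; the actual function uses predefined layouts for common PP/VP combinations, and the real per-stage assignments may differ.

```python
def split_layers(num_layers: int, pp_size: int) -> list[int]:
    """Distribute num_layers across pp_size pipeline stages,
    giving the remainder to the earliest stages."""
    base, rem = divmod(num_layers, pp_size)
    return [base + (1 if stage < rem else 0) for stage in range(pp_size)]

# 46 decoder layers over 8 stages: six stages get 6 layers, two get 5.
counts = split_layers(46, 8)
assert sum(counts) == 46
```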

bridge.recipes.glm_vl.glm_45v.glm_45v_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for GLM-4.5V (106B MoE).

Default configuration: 64 nodes, 512 GPUs

  • TP=1, PP=8, EP=16

  • LR=5e-6 (full SFT)

  • Sequence length: 8192
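As a sanity check, the default parallelism sizes multiply out as follows, assuming 8 GPUs per node (an assumption; the node size is not stated in this page, only the totals of 64 nodes and 512 GPUs).

```python
# How the default SFT parallelism composes, assuming 8 GPUs per node.
nodes, gpus_per_node = 64, 8
tp, pp, ep = 1, 8, 16

world_size = nodes * gpus_per_node   # 512 GPUs total
dp = world_size // (tp * pp)         # data-parallel size: 64

assert world_size == 512
assert dp % ep == 0  # expert parallelism (EP=16) fits within the DP groups
```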

bridge.recipes.glm_vl.glm_45v.glm_45v_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for GLM-4.5V (106B MoE).

Default configuration: 8 nodes, 64 GPUs

  • TP=1, PP=8, EP=4

  • LR=1e-4 (PEFT)

  • Sequence length: 8192

Parameters:
  • peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
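A minimal usage sketch of the two documented entry points. The import path is inferred from the module name at the top of this page and requires megatron-bridge to be installed; nothing beyond the documented call signatures is shown.

```python
# Obtain the documented default configs (parameterless API; the only
# documented option is the PEFT scheme selector).
from megatron.bridge.recipes.glm_vl.glm_45v import (
    glm_45v_sft_config,
    glm_45v_peft_config,
)

sft_cfg = glm_45v_sft_config()          # full SFT defaults: TP=1, PP=8, EP=16
lora_cfg = glm_45v_peft_config("lora")  # LoRA PEFT defaults: TP=1, PP=8, EP=4
```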