bridge.recipes.glm_vl.glm_45v#
GLM-4.5V finetuning recipes with a parameterless API.
This module provides SFT and PEFT configurations for GLM-4.5V (106B MoE).
Module Contents#
Functions#
| set_glm_45v_pipeline_model_parallel_layout | Set the GLM-4.5V pipeline model parallel layout. |
| glm_45v_sft_config | Return a full SFT config for GLM-4.5V (106B MoE). |
| glm_45v_peft_config | Return a PEFT config for GLM-4.5V (106B MoE). |
API#
- bridge.recipes.glm_vl.glm_45v.set_glm_45v_pipeline_model_parallel_layout(
- model_cfg: megatron.bridge.models.gpt_provider.GPTModelProvider,
- layout: Optional[Union[str, List[List[str]]]] = None,
- is_peft: bool = False,
- )#
Set the GLM-4.5V pipeline model parallel layout.
GLM-4.5V (based on GLM-4.5 Air) has 46 decoder layers and no MTP layers. This function sets up predefined layouts for common PP/VP combinations.
- Parameters:
model_cfg – The model provider configuration to modify.
layout – Optional custom layout. If None, uses predefined layouts based on PP/VP sizes.
is_peft – Whether the model is trained with PEFT.
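The recipe's predefined layouts are not reproduced here, but the underlying arithmetic can be sketched: 46 decoder layers (no MTP layers) are split across PP ranks, with the embedding on the first stage and the loss head on the last. A minimal, hypothetical illustration (names and layout tokens are invented for this sketch, not the recipe's actual ones):

```python
# Hypothetical sketch of deriving a pipeline layout for GLM-4.5V:
# split 46 decoder layers evenly across PP ranks, placing the
# embedding on rank 0 and the loss head on the last rank.
# Token names ("embedding", "decoder", "loss") are illustrative only.

def make_layout(num_layers: int, pp_size: int) -> list[list[str]]:
    base, rem = divmod(num_layers, pp_size)
    # Earlier ranks absorb the remainder when layers don't divide evenly.
    counts = [base + (1 if rank < rem else 0) for rank in range(pp_size)]
    layout = []
    for rank, n in enumerate(counts):
        stage = ["decoder"] * n
        if rank == 0:
            stage = ["embedding"] + stage
        if rank == pp_size - 1:
            stage = stage + ["loss"]
        layout.append(stage)
    return layout

# Default SFT/PEFT parallelism uses PP=8 over the 46 decoder layers.
layout = make_layout(46, 8)
```

With 46 layers and PP=8, the first six stages get six decoder layers each and the last two get five, so no stage is more than one layer out of balance.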
- bridge.recipes.glm_vl.glm_45v.glm_45v_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for GLM-4.5V (106B MoE).
Default configuration:
- 64 nodes, 512 GPUs
- TP=1, PP=8, EP=16
- LR=5e-6 (full SFT)
- Sequence length: 8192
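A quick sanity check on how these defaults compose, assuming 8 GPUs per node (an assumption; the recipe itself fixes the topology):

```python
# Back-of-the-envelope check of the default SFT parallelism.
# Assumes 8 GPUs per node, which is not stated explicitly above.
nodes, gpus_per_node = 64, 8
tp, pp = 1, 8

world_size = nodes * gpus_per_node            # total GPUs: 512
model_parallel = tp * pp                      # 8-way model parallelism
data_parallel = world_size // model_parallel  # 64 data-parallel replicas
```

Expert parallelism (EP=16) operates within the data-parallel dimension for the MoE layers, so it does not further multiply the GPU count.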
- bridge.recipes.glm_vl.glm_45v.glm_45v_peft_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
- ) → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for GLM-4.5V (106B MoE).
Default configuration:
- 8 nodes, 64 GPUs
- TP=1, PP=8, EP=4
- LR=1e-4 (PEFT)
- Sequence length: 8192
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
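The `peft_scheme` parameter accepts either a known scheme name or a PEFT instance. A minimal sketch of how such an argument might be resolved (this is illustrative code, not the library's implementation; `PEFT` here is a stand-in for `megatron.bridge.peft.base.PEFT`):

```python
# Illustrative resolution of a str-or-PEFT argument, as accepted by
# glm_45v_peft_config. All names below are hypothetical stand-ins.
from dataclasses import dataclass


@dataclass
class PEFT:
    """Stand-in for megatron.bridge.peft.base.PEFT."""
    name: str


def resolve_peft(peft_scheme) -> PEFT:
    # Pass a custom PEFT instance through unchanged.
    if isinstance(peft_scheme, PEFT):
        return peft_scheme
    # Map known scheme names to their PEFT configuration.
    if peft_scheme in ("lora", "dora"):
        return PEFT(name=peft_scheme)
    raise ValueError(f"Unknown PEFT scheme: {peft_scheme!r}")
```

Usage: `resolve_peft("lora")` yields a LoRA configuration, while passing a pre-built `PEFT` object lets callers customize ranks, target modules, and so on before handing it to the recipe.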