bridge.recipes.qwen_vl.qwen3_vl#
Qwen3-VL finetuning recipes with a parameterless API.
This module provides SFT and PEFT configurations for Qwen3-VL models: the dense 8B variant and the 30B-A3B and 235B-A22B MoE variants.
Module Contents#
Functions#
| Function | Description |
|---|---|
| `_make_energon_dataset` | Create an EnergonProvider dataset config for Qwen3-VL recipes. |
| `qwen3_vl_8b_sft_config` | Return a full SFT config for Qwen3-VL 8B (dense model). |
| `qwen3_vl_30b_a3b_sft_config` | Return a full SFT config for Qwen3-VL 30B-A3B (MoE model). |
| `qwen3_vl_235b_a22b_sft_config` | Return a full SFT config for Qwen3-VL 235B-A22B (MoE model). |
| `qwen3_vl_8b_peft_config` | Return a PEFT config for Qwen3-VL 8B (dense model). |
| `qwen3_vl_30b_a3b_peft_config` | Return a PEFT config for Qwen3-VL 30B-A3B (MoE model). |
| `qwen3_vl_235b_a22b_peft_config` | Return a PEFT config for Qwen3-VL 235B-A22B (MoE model). |
| `qwen3_vl_8b_peft_energon_config` | Return a PEFT (LoRA/DoRA) config for Qwen3-VL 8B with Energon dataset. |
API#
- bridge.recipes.qwen_vl.qwen3_vl._make_energon_dataset(
- hf_path: str,
- seq_length: int,
- micro_batch_size: int,
- global_batch_size: int,
- )#
Create an EnergonProvider dataset config for Qwen3-VL recipes.
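A minimal sketch of calling this helper directly, purely for illustration: it is private (leading underscore) and is normally invoked by the recipe functions below. The `hf_path` value and batch sizes here are assumptions, not recipe defaults.

```python
# Illustration only: _make_energon_dataset is a private helper that the
# recipe functions below call internally. Values here are assumptions.
from megatron.bridge.recipes.qwen_vl.qwen3_vl import _make_energon_dataset

dataset_cfg = _make_energon_dataset(
    hf_path="Qwen/Qwen3-VL-8B-Instruct",  # assumed HF model/processor path
    seq_length=4096,
    micro_batch_size=1,
    global_batch_size=32,
)
```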
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3-VL 8B (dense model).
Default configuration: 1 node, 8 GPUs
TP=2, PP=1
LR=5e-6 (full SFT)
Sequence length: 4096
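A minimal usage sketch, assuming the standard pattern for these recipes: call the parameterless function, then mutate the returned ConfigContainer. The `cfg.model` and `cfg.train` field paths below are assumptions about the container layout, not documented attributes.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_sft_config

cfg = qwen3_vl_8b_sft_config()  # parameterless: documented defaults baked in

# Inspect or override defaults before launching; field paths are assumptions.
print(cfg.model.tensor_model_parallel_size)  # documented default: 2
cfg.train.train_iters = 1000                 # hypothetical override
```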
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_30b_a3b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3-VL 30B-A3B (MoE model).
Default configuration: 4 nodes, 32 GPUs
TP=1, PP=1, EP=8
LR=5e-6 (full SFT)
Sequence length: 4096
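The MoE defaults target 4 nodes; below is a sketch of shrinking the run to one node, under the assumption that EP=8 with TP=1 already fills 8 GPUs, leaving only the data-parallel batch to reduce. The `cfg.train.global_batch_size` path is an assumption.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_30b_a3b_sft_config

cfg = qwen3_vl_30b_a3b_sft_config()  # documented defaults: TP=1, PP=1, EP=8

# Hypothetical single-node (8 GPU) run: EP=8 x TP=1 fills the node, so the
# data-parallel degree drops to 1; shrink the global batch accordingly.
cfg.train.global_batch_size = 32
```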
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_235b_a22b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3-VL 235B-A22B (MoE model).
Default configuration: 64 nodes, 512 GPUs
TP=4, PP=1, EP=32
LR=5e-6 (full SFT)
Sequence length: 4096
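Given the scale of the default layout, a pre-flight sanity check can be worthwhile. This sketch assumes the parallelism settings live on `cfg.model`; the attribute names are assumptions.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_235b_a22b_sft_config

cfg = qwen3_vl_235b_a22b_sft_config()  # documented defaults: 64 nodes, 512 GPUs

# Verify the documented parallelism before submitting a large job;
# attribute names are assumptions about the model config.
assert cfg.model.tensor_model_parallel_size == 4
assert cfg.model.expert_model_parallel_size == 32
```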
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_peft_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
- ) → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for Qwen3-VL 8B (dense model).
Default configuration: 1 node, 8 GPUs
TP=1, PP=1
LR=1e-4 (PEFT)
Sequence length: 4096
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
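A short sketch of the two string forms of peft_scheme; the instance form is shown under the 30B-A3B variant below.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_peft_config

lora_cfg = qwen3_vl_8b_peft_config()                    # default scheme: "lora"
dora_cfg = qwen3_vl_8b_peft_config(peft_scheme="dora")  # switch to DoRA
```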
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_30b_a3b_peft_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
- ) → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for Qwen3-VL 30B-A3B (MoE model).
Default configuration: 1 node, 8 GPUs
TP=1, PP=1, EP=4
LR=1e-4 (PEFT)
Sequence length: 4096
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
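A sketch of passing a custom PEFT instance instead of a scheme string. The LoRA import path and constructor arguments below are assumptions; per the signature, any subclass of megatron.bridge.peft.base.PEFT is accepted.

```python
# Assumed import path and constructor; adjust to the installed version.
from megatron.bridge.peft.lora import LoRA
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_30b_a3b_peft_config

# Any megatron.bridge.peft.base.PEFT subclass works as peft_scheme.
cfg = qwen3_vl_30b_a3b_peft_config(peft_scheme=LoRA(dim=16, alpha=32))
```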
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_235b_a22b_peft_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
- ) → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for Qwen3-VL 235B-A22B (MoE model).
Default configuration: 8 nodes, 64 GPUs
TP=1, PP=1, EP=16
LR=1e-4 (PEFT)
Sequence length: 4096
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
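A sketch of adjusting the documented PEFT learning rate on the returned container; the `cfg.optimizer.lr` path is an assumption.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_235b_a22b_peft_config

cfg = qwen3_vl_235b_a22b_peft_config()  # documented defaults: 8 nodes, EP=16

# Lower the documented 1e-4 PEFT learning rate; field path is an assumption.
cfg.optimizer.lr = 5e-5
```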
- bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_peft_energon_config(
- peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
- ) → megatron.bridge.training.config.ConfigContainer#
Return a PEFT (LoRA/DoRA) config for Qwen3-VL 8B with Energon dataset.
Same as qwen3_vl_8b_peft_config, but uses an EnergonProvider instead of the HF dataset. Set the dataset path via CLI override: dataset.path=/path/to/energon/dataset
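A sketch of the programmatic equivalent of the documented CLI override, assuming the `cfg.dataset.path` attribute mirrors the override key.

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_peft_energon_config

cfg = qwen3_vl_8b_peft_energon_config(peft_scheme="lora")

# Programmatic counterpart of dataset.path=... on the CLI; the attribute
# path is an assumption mirroring the override key.
cfg.dataset.path = "/path/to/energon/dataset"
```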