bridge.recipes.qwen_vl.qwen3_vl#

Qwen3-VL finetuning recipes with a parameterless API.

This module provides SFT and PEFT configurations for Qwen3-VL models: 8B (dense), 30B-A3B (MoE), and 235B-A22B (MoE).

Module Contents#

Functions#

_make_energon_dataset – Create an EnergonProvider dataset config for Qwen3-VL recipes.

qwen3_vl_8b_sft_config – Return a full SFT config for Qwen3-VL 8B (dense model).

qwen3_vl_30b_a3b_sft_config – Return a full SFT config for Qwen3-VL 30B-A3B (MoE model).

qwen3_vl_235b_a22b_sft_config – Return a full SFT config for Qwen3-VL 235B-A22B (MoE model).

qwen3_vl_8b_peft_config – Return a PEFT config for Qwen3-VL 8B (dense model).

qwen3_vl_30b_a3b_peft_config – Return a PEFT config for Qwen3-VL 30B-A3B (MoE model).

qwen3_vl_235b_a22b_peft_config – Return a PEFT config for Qwen3-VL 235B-A22B (MoE model).

qwen3_vl_8b_peft_energon_config – Return a PEFT (LoRA/DoRA) config for Qwen3-VL 8B with an Energon dataset.

API#

bridge.recipes.qwen_vl.qwen3_vl._make_energon_dataset(
hf_path: str,
seq_length: int,
micro_batch_size: int,
global_batch_size: int,
) → megatron.bridge.data.energon.energon_provider.EnergonProvider#

Create an EnergonProvider dataset config for Qwen3-VL recipes.
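
This helper is private; the public *_energon_config recipes are the intended entry point. A minimal sketch of a direct call, with placeholder argument values (the interpretation of hf_path is an assumption; check the helper in your install):

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import _make_energon_dataset

# Placeholder values throughout. Whether hf_path names the model (for its
# processor/tokenizer) or a dataset is an assumption.
dataset_cfg = _make_energon_dataset(
    hf_path="Qwen/Qwen3-VL-8B-Instruct",
    seq_length=4096,
    micro_batch_size=1,
    global_batch_size=32,
)
```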

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Qwen3-VL 8B (dense model).

Default configuration: 1 node, 8 GPUs

  • TP=2, PP=1

  • LR=5e-6 (full SFT)

  • Sequence length: 4096
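
A hedged usage sketch: fetch the default recipe and adjust it before launching training. The override attribute paths (cfg.model.tensor_model_parallel_size, cfg.train.global_batch_size) are assumptions about the ConfigContainer layout; inspect the returned object to confirm:

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_sft_config

cfg = qwen3_vl_8b_sft_config()  # parameterless: returns the full default recipe

# Attribute names below are assumptions about ConfigContainer's layout.
cfg.model.tensor_model_parallel_size = 4   # widen TP from the default of 2
cfg.train.global_batch_size = 64           # scale batch size for more GPUs
```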

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_30b_a3b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Qwen3-VL 30B-A3B (MoE model).

Default configuration: 4 nodes, 32 GPUs

  • TP=1, PP=1, EP=8

  • LR=5e-6 (full SFT)

  • Sequence length: 4096
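
The MoE recipes differ from the dense one mainly in expert parallelism. A sketch of shrinking the default EP=8 layout for a smaller cluster; the attribute name is an assumption:

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_30b_a3b_sft_config

cfg = qwen3_vl_30b_a3b_sft_config()  # defaults assume 4 nodes, EP=8

# Assumed attribute name. Reducing EP packs more experts onto each GPU, so
# expect higher per-device memory use.
cfg.model.expert_model_parallel_size = 4
```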

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_235b_a22b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Qwen3-VL 235B-A22B (MoE model).

Default configuration: 64 nodes, 512 GPUs

  • TP=4, PP=1, EP=32

  • LR=5e-6 (full SFT)

  • Sequence length: 4096

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Qwen3-VL 8B (dense model).

Default configuration: 1 node, 8 GPUs

  • TP=1, PP=1

  • LR=1e-4 (PEFT)

  • Sequence length: 4096

Parameters:

peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.
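
A sketch of the accepted peft_scheme forms; the LoRA import path and constructor arguments shown for the custom-instance case are assumptions about the megatron.bridge.peft package:

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_peft_config
# Assumed import path and constructor arguments; consult megatron.bridge.peft
# in your install.
from megatron.bridge.peft.lora import LoRA

cfg_lora = qwen3_vl_8b_peft_config()                     # default scheme: "lora"
cfg_dora = qwen3_vl_8b_peft_config(peft_scheme="dora")   # weight-decomposed LoRA
cfg_custom = qwen3_vl_8b_peft_config(peft_scheme=LoRA(dim=16, alpha=32))
```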

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_30b_a3b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Qwen3-VL 30B-A3B (MoE model).

Default configuration: 1 node, 8 GPUs

  • TP=1, PP=1, EP=4

  • LR=1e-4 (PEFT)

  • Sequence length: 4096

Parameters:

peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_235b_a22b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Qwen3-VL 235B-A22B (MoE model).

Default configuration: 8 nodes, 64 GPUs

  • TP=1, PP=1, EP=16

  • LR=1e-4 (PEFT)

  • Sequence length: 4096

Parameters:

peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.

bridge.recipes.qwen_vl.qwen3_vl.qwen3_vl_8b_peft_energon_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT (LoRA/DoRA) config for Qwen3-VL 8B with an Energon dataset.

Same as qwen3_vl_8b_peft_config, but uses an EnergonProvider instead of an HF dataset. Set the dataset path via the CLI override dataset.path=/path/to/energon/dataset.
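
A programmatic equivalent of the CLI override, as a sketch; the cfg.dataset.path attribute name is assumed from the override syntax above:

```python
from megatron.bridge.recipes.qwen_vl.qwen3_vl import qwen3_vl_8b_peft_energon_config

cfg = qwen3_vl_8b_peft_energon_config(peft_scheme="lora")

# Mirrors the documented CLI override dataset.path=/path/to/energon/dataset;
# the attribute name is an assumption derived from that override.
cfg.dataset.path = "/path/to/energon/dataset"
```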