bridge.recipes.qwen.qwen3#
Module Contents#
Functions#
| Function | Summary |
|---|---|
| `qwen3_600m_pretrain_config` | Return a pre-training config for Qwen3 0.6B. |
| `qwen3_1p7b_pretrain_config` | Return a pre-training config for Qwen3 1.7B. |
| `qwen3_4b_pretrain_config` | Return a pre-training config for Qwen3 4B. |
| `qwen3_8b_pretrain_config` | Return a pre-training config for Qwen3 8B. |
| `qwen3_14b_pretrain_config` | Return a pre-training config for Qwen3 14B. |
| `qwen3_32b_pretrain_config` | Return a pre-training config for Qwen3 32B. |
| `qwen3_600m_sft_config` | Return a full SFT config for Qwen3 600M. |
| `qwen3_1p7b_sft_config` | Return a full SFT config for Qwen3 1.7B. |
| `qwen3_4b_sft_config` | Return a full SFT config for Qwen3 4B. |
| `qwen3_8b_sft_config` | Return a full SFT config for Qwen3 8B. |
| `qwen3_14b_sft_config` | Return a full SFT config for Qwen3 14B. |
| `qwen3_32b_sft_config` | Return a full SFT config for Qwen3 32B. |
| `qwen3_600m_peft_config` | Return a PEFT config for Qwen3 600M. |
| `qwen3_1p7b_peft_config` | Return a PEFT config for Qwen3 1.7B. |
| `qwen3_4b_peft_config` | Return a PEFT config for Qwen3 4B. |
| `qwen3_8b_peft_config` | Return a PEFT config for Qwen3 8B. |
| `qwen3_14b_peft_config` | Return a PEFT config for Qwen3 14B. |
| `qwen3_32b_peft_config` | Return a PEFT config for Qwen3 32B. |
API#
- bridge.recipes.qwen.qwen3.qwen3_600m_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 0.6B.
Recommended parallelism: TP=1, PP=1 (fits on a single GPU).
- bridge.recipes.qwen.qwen3.qwen3_1p7b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 1.7B.
Recommended parallelism: TP=1, PP=1 (fits on a single GPU).
- bridge.recipes.qwen.qwen3.qwen3_4b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 4B.
Recommended parallelism: TP=2, PP=1.
- bridge.recipes.qwen.qwen3.qwen3_8b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 8B.
Recommended parallelism: TP=4, PP=1.
- bridge.recipes.qwen.qwen3.qwen3_14b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 14B.
Recommended parallelism: TP=8, PP=1.
- bridge.recipes.qwen.qwen3.qwen3_32b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Qwen3 32B.
Recommended parallelism: TP=8, PP=2 with recompute enabled for memory optimization.
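A minimal usage sketch for these pre-training recipes. It assumes the module is importable as `megatron.bridge.recipes.qwen.qwen3` (the full import path is not shown on this page) and that fields of the returned `ConfigContainer` can be overridden before launching training; the attribute names in the comments are illustrative only, not confirmed API:

```python
# Sketch only: import path and container internals are assumptions.
from megatron.bridge.recipes.qwen.qwen3 import qwen3_600m_pretrain_config

# Build the recommended single-GPU (TP=1, PP=1) pre-training config.
config = qwen3_600m_pretrain_config()  # -> ConfigContainer

# The container bundles model, training, and parallelism settings;
# inspect or adjust them before handing the config to your training
# entry point, e.g. (hypothetical field names):
# config.train.train_iters = 1000
```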
- bridge.recipes.qwen.qwen3.qwen3_600m_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 600M.
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_1p7b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 1.7B.
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_4b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 4B.
Recommended parallelism: TP=2, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_8b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 8B.
Recommended parallelism: TP=4, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_14b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 14B.
Recommended parallelism: TP=8, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_32b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Qwen3 32B.
Recommended parallelism: TP=8, PP=2 (2 nodes, 16 GPUs total). Includes recompute for memory optimization.
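Since the SFT recipes follow a uniform naming pattern, a small dispatch table is one way to select a recipe by model size. This is a sketch under the assumption that the module is importable as `megatron.bridge.recipes.qwen.qwen3`; the `SFT_RECIPES` mapping is a hypothetical helper, not part of the library:

```python
# Sketch only: import path is an assumption; SFT_RECIPES is a
# hypothetical convenience mapping, not library API.
from megatron.bridge.recipes.qwen import qwen3

SFT_RECIPES = {
    "600m": qwen3.qwen3_600m_sft_config,
    "1p7b": qwen3.qwen3_1p7b_sft_config,
    "4b": qwen3.qwen3_4b_sft_config,
    "8b": qwen3.qwen3_8b_sft_config,
    "14b": qwen3.qwen3_14b_sft_config,
    "32b": qwen3.qwen3_32b_sft_config,
}

# Each recipe bakes in its recommended parallelism, e.g. the 8B
# SFT config targets TP=4, PP=1 on 1 node with 8 GPUs.
config = SFT_RECIPES["8b"]()
```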
- bridge.recipes.qwen.qwen3.qwen3_600m_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 600M.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_1p7b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 1.7B.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_4b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 4B.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_8b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 8B.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_14b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 14B.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs)
- bridge.recipes.qwen.qwen3.qwen3_32b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora')#
Return a PEFT config for Qwen3 32B.
- Parameters:
peft_scheme – PEFT scheme: 'lora', 'dora', or a PEFT instance. Default: 'lora'
Recommended parallelism: TP=1, PP=1 (1 node, 8 GPUs). Includes recompute for memory optimization.
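The `peft_scheme` parameter accepts either a string name or a `PEFT` instance, per the signatures above. A sketch of both forms, assuming the module is importable as `megatron.bridge.recipes.qwen.qwen3` (the custom-instance line is indicative only, since concrete `PEFT` subclasses are not documented on this page):

```python
# Sketch only: import path is an assumption.
from megatron.bridge.recipes.qwen.qwen3 import qwen3_8b_peft_config

# Default scheme is LoRA.
lora_cfg = qwen3_8b_peft_config()

# Select DoRA by name instead.
dora_cfg = qwen3_8b_peft_config(peft_scheme="dora")

# A custom megatron.bridge.peft.base.PEFT instance can also be passed,
# e.g. qwen3_8b_peft_config(peft_scheme=my_peft_instance).
```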