bridge.perf_recipes.qwen.gb300.qwen3_moe

GB300 performance recipes for Qwen3 MoE.

Module Contents

Functions

qwen3_235b_a22b_pretrain_64gpu_gb300_bf16_config

Qwen3 235B-A22B pretrain: 64× GB300, BF16, EP=64.

qwen3_235b_a22b_pretrain_64gpu_gb300_fp8cs_config

Qwen3 235B-A22B pretrain: 64× GB300, FP8 current-scaling, EP=64.

qwen3_235b_a22b_pretrain_64gpu_gb300_fp8mx_config

Qwen3 235B-A22B pretrain: 64× GB300, MXFP8, EP=64.

qwen3_30b_a3b_pretrain_8gpu_gb300_bf16_config

Qwen3 30B-A3B pretrain: 8× GB300, BF16, EP=8.

qwen3_30b_a3b_pretrain_8gpu_gb300_fp8cs_config

Qwen3 30B-A3B pretrain: 8× GB300, FP8 current-scaling, EP=8.

qwen3_30b_a3b_pretrain_8gpu_gb300_fp8mx_config

Qwen3 30B-A3B pretrain: 8× GB300, MXFP8, EP=8.

qwen3_235b_a22b_pretrain_256gpu_gb300_bf16_config

Qwen3 235B-A22B pretrain: 256× GB300, BF16, PP=4 EP=32.

qwen3_235b_a22b_pretrain_256gpu_gb300_fp8cs_config

Qwen3 235B-A22B pretrain: 256× GB300, FP8 current-scaling, PP=4 EP=32.

qwen3_235b_a22b_pretrain_256gpu_gb300_fp8mx_config

Qwen3 235B-A22B pretrain: 256× GB300, MXFP8, PP=4 EP=32.

qwen3_235b_a22b_pretrain_256gpu_gb300_fp8mx_large_scale_config

Qwen3 235B-A22B pretrain: 256× GB300, MXFP8, large-scale proxy (GBS=512).

qwen3_235b_a22b_pretrain_64gpu_gb300_nvfp4_config

Qwen3 235B-A22B pretrain: 64× GB300, NVFP4 (same layout as FP8 current-scaling).

qwen3_235b_a22b_pretrain_256gpu_gb300_nvfp4_config

Qwen3 235B-A22B pretrain: 256× GB300, NVFP4 (same layout as FP8 current-scaling).

qwen3_30b_a3b_pretrain_32gpu_gb300_bf16_config

Qwen3 30B-A3B pretrain: 32× GB300, BF16, legacy-scaled GBS.

qwen3_30b_a3b_pretrain_32gpu_gb300_fp8cs_config

Qwen3 30B-A3B pretrain: 32× GB300, FP8 current-scaling, legacy-scaled GBS.

qwen3_next_80b_a3b_pretrain_64gpu_gb300_bf16_config

Qwen3 Next 80B-A3B pretrain: 64× GB300, BF16, PP=2 VP=4 EP=32, hybridep.

qwen3_next_80b_a3b_pretrain_64gpu_gb300_fp8mx_config

Qwen3 Next 80B-A3B pretrain: 64× GB300, MXFP8 (same layout as BF16).
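The function names listed above appear to follow one pattern: model, `pretrain`, GPU count, hardware, precision, an optional variant, then `config`. A minimal sketch of decoding that pattern is shown here. `parse_recipe_name` and `RecipeName` are hypothetical helpers written for illustration; they are not part of the module, and the pattern is inferred only from the names on this page.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical helper, not part of the documented module.
# The pattern is inferred from the function names listed on this page.
_RECIPE_NAME = re.compile(
    r"^(?P<model>.+)_pretrain_(?P<gpus>\d+)gpu_(?P<hardware>[a-z0-9]+)_"
    r"(?P<precision>bf16|fp8cs|fp8mx|nvfp4)(?:_(?P<variant>.+))?_config$"
)


class RecipeName(NamedTuple):
    model: str              # e.g. "qwen3_235b_a22b"
    gpus: int               # total GPU count encoded in the name
    hardware: str           # e.g. "gb300"
    precision: str          # "bf16", "fp8cs", "fp8mx", or "nvfp4"
    variant: Optional[str]  # e.g. "large_scale", or None


def parse_recipe_name(name: str) -> RecipeName:
    """Split a recipe function name into its components."""
    match = _RECIPE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized recipe name: {name!r}")
    return RecipeName(
        model=match["model"],
        gpus=int(match["gpus"]),
        hardware=match["hardware"],
        precision=match["precision"],
        variant=match["variant"],
    )
```

For example, `parse_recipe_name("qwen3_30b_a3b_pretrain_8gpu_gb300_fp8cs_config")` yields a GPU count of 8 and precision `fp8cs`, with no variant.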

API

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_64gpu_gb300_bf16_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 64× GB300, BF16, EP=64.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_64gpu_gb300_fp8cs_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 64× GB300, FP8 current-scaling, EP=64.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_64gpu_gb300_fp8mx_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 64× GB300, MXFP8, EP=64.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_30b_a3b_pretrain_8gpu_gb300_bf16_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 30B-A3B pretrain: 8× GB300, BF16, EP=8.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_30b_a3b_pretrain_8gpu_gb300_fp8cs_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 30B-A3B pretrain: 8× GB300, FP8 current-scaling, EP=8.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_30b_a3b_pretrain_8gpu_gb300_fp8mx_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 30B-A3B pretrain: 8× GB300, MXFP8, EP=8.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_256gpu_gb300_bf16_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 256× GB300, BF16, PP=4 EP=32.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_256gpu_gb300_fp8cs_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 256× GB300, FP8 current-scaling, PP=4 EP=32.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_256gpu_gb300_fp8mx_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 256× GB300, MXFP8, PP=4 EP=32.
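The PP and EP values quoted in these descriptions can be sanity-checked against the GPU count. In Megatron-style layouts the expert ranks factor as world size = PP × EP × expert-TP × expert-DP. The sketch below computes the remaining expert-data-parallel size. It assumes an expert tensor-parallel size of 1, which this page does not state, and `expert_data_parallel_size` is a hypothetical helper rather than a library function.

```python
def expert_data_parallel_size(world_size: int, ep: int, pp: int = 1, etp: int = 1) -> int:
    """Return how many expert-data-parallel replicas fit in `world_size` GPUs.

    Assumes the factorization world_size = pp * ep * etp * edp.
    `etp` (expert tensor parallel) defaults to 1, an assumption not
    confirmed by the recipe descriptions on this page.
    """
    model_parallel = pp * ep * etp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by pp*ep*etp={model_parallel}"
        )
    return world_size // model_parallel


# 256 GPUs with PP=4, EP=32: 256 / (4 * 32) = 2 expert-data-parallel replicas.
print(expert_data_parallel_size(256, ep=32, pp=4))
# 64 GPUs with EP=64 and no pipeline parallelism: a single replica.
print(expert_data_parallel_size(64, ep=64))
```

Virtual pipeline stages (VP) do not appear in the factorization because they interleave model chunks on the same ranks rather than adding ranks.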

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_256gpu_gb300_fp8mx_large_scale_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 256× GB300, MXFP8, large-scale proxy (GBS=512).

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_64gpu_gb300_nvfp4_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 64× GB300, NVFP4 (same layout as FP8 current-scaling).

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_235b_a22b_pretrain_256gpu_gb300_nvfp4_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 235B-A22B pretrain: 256× GB300, NVFP4 (same layout as FP8 current-scaling).

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_30b_a3b_pretrain_32gpu_gb300_bf16_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 30B-A3B pretrain: 32× GB300, BF16, legacy-scaled GBS.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_30b_a3b_pretrain_32gpu_gb300_fp8cs_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 30B-A3B pretrain: 32× GB300, FP8 current-scaling, legacy-scaled GBS.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_next_80b_a3b_pretrain_64gpu_gb300_bf16_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 Next 80B-A3B pretrain: 64× GB300, BF16, PP=2 VP=4 EP=32, hybridep.

bridge.perf_recipes.qwen.gb300.qwen3_moe.qwen3_next_80b_a3b_pretrain_64gpu_gb300_fp8mx_config() → megatron.bridge.perf_recipes.qwen.common.ConfigContainer

Qwen3 Next 80B-A3B pretrain: 64× GB300, MXFP8 (same layout as BF16).
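Every function documented above takes no arguments and returns a `ConfigContainer`, so a recipe can be looked up by name and called directly. The sketch below shows one way to do that defensively. The full import path `megatron.bridge.perf_recipes.qwen.gb300.qwen3_moe` is an assumption, inferred from the `megatron.bridge` prefix on the return type, and `load_recipe` is a hypothetical helper, not part of the library.

```python
import importlib
from typing import Callable, Optional

# Assumed import path: this page shows the module as
# "bridge.perf_recipes.qwen.gb300.qwen3_moe"; the "megatron." prefix is
# inferred from the documented return type and may differ in your install.
RECIPE_MODULE = "megatron.bridge.perf_recipes.qwen.gb300.qwen3_moe"


def load_recipe(name: str) -> Optional[Callable]:
    """Return the recipe function called `name`, or None if unavailable.

    Hypothetical helper: returns None when the package is not installed
    or the module has no attribute with that name.
    """
    try:
        module = importlib.import_module(RECIPE_MODULE)
    except ImportError:
        return None
    return getattr(module, name, None)


if __name__ == "__main__":
    recipe = load_recipe("qwen3_30b_a3b_pretrain_8gpu_gb300_bf16_config")
    if recipe is None:
        print("recipe module not available in this environment")
    else:
        config = recipe()  # documented as taking no arguments
        print(type(config).__name__)
```

Resolving the function by name keeps launch scripts generic: the recipe can be selected from a command-line string without a separate import per configuration.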