bridge.perf_recipes.deepseek.h100.deepseek_v3

H100 performance recipes for DeepSeek V3.

Module Contents

Functions

deepseek_v3_pretrain_1024gpu_h100_bf16_config

DeepSeek V3 pretrain: 1024× H100, BF16.

deepseek_v3_pretrain_1024gpu_h100_fp8cs_config

DeepSeek V3 pretrain: 1024× H100, FP8 current-scaling.

deepseek_v3_pretrain_1024gpu_h100_fp8sc_config

DeepSeek V3 pretrain: 1024× H100, FP8-SC (VP=2, auto-applied default PP layout).

deepseek_v3_pretrain_64gpu_h100_bf16_config

DeepSeek V3 pretrain: 64× H100, BF16 (1024-GPU layout with legacy-scaled GBS).

deepseek_v3_pretrain_64gpu_h100_fp8cs_config

DeepSeek V3 pretrain: 64× H100, FP8 current-scaling (standard tensorwise).

deepseek_v3_pretrain_1024gpu_h100_fp8sc_large_scale_config

DeepSeek V3 pretrain: 1024× H100, FP8-SC, large-scale proxy (GBS=1024).
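The function names listed above appear to follow a regular pattern: model, task, GPU count, hardware, precision tag, an optional variant suffix, and a trailing `_config`. The sketch below splits a name into those parts. It is only an illustration of the naming convention as inferred from this listing; `RecipeName` and `parse_recipe_name` are hypothetical helpers, not part of the library.

```python
import re
from typing import NamedTuple


class RecipeName(NamedTuple):
    """Parts of a recipe function name (hypothetical helper type)."""
    model: str
    task: str
    num_gpus: int
    hardware: str
    precision: str
    variant: str  # empty string when the name has no extra suffix


# Pattern inferred from the six names on this page; other recipes in the
# library may not follow it.
_PATTERN = re.compile(
    r"^(?P<model>.+?)_(?P<task>pretrain)_(?P<gpus>\d+)gpu_(?P<hw>[a-z0-9]+)_"
    r"(?P<precision>bf16|fp8cs|fp8sc)(?:_(?P<variant>.+))?_config$"
)


def parse_recipe_name(name: str) -> RecipeName:
    """Split a recipe function name into its components."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized recipe name: {name!r}")
    return RecipeName(
        model=m["model"],
        task=m["task"],
        num_gpus=int(m["gpus"]),
        hardware=m["hw"],
        precision=m["precision"],
        variant=m["variant"] or "",
    )
```

For example, `parse_recipe_name("deepseek_v3_pretrain_64gpu_h100_fp8cs_config")` yields a 64-GPU, `h100`, `fp8cs` entry with an empty variant, while the `..._fp8sc_large_scale_config` name yields the variant `large_scale`.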

API

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_1024gpu_h100_bf16_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 1024× H100, BF16.

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_1024gpu_h100_fp8cs_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 1024× H100, FP8 current-scaling.

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_1024gpu_h100_fp8sc_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 1024× H100, FP8-SC (VP=2, auto-applied default PP layout).

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_64gpu_h100_bf16_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 64× H100, BF16 (1024-GPU layout with legacy-scaled GBS).

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_64gpu_h100_fp8cs_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 64× H100, FP8 current-scaling (standard tensorwise).

bridge.perf_recipes.deepseek.h100.deepseek_v3.deepseek_v3_pretrain_1024gpu_h100_fp8sc_large_scale_config() → megatron.bridge.perf_recipes.deepseek.common.ConfigContainer

DeepSeek V3 pretrain: 1024× H100, FP8-SC, large-scale proxy (GBS=1024).
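Each function above takes no arguments and returns a `ConfigContainer`, so a recipe can be looked up by name and called directly. The following is a minimal sketch of such a lookup. The import path `megatron.bridge.perf_recipes.deepseek.h100.deepseek_v3` is an assumption based on the return-type path shown on this page, and `load_recipe` is a hypothetical helper; the sketch returns `None` rather than failing when the module or function is unavailable.

```python
import importlib
from typing import Any, Optional

# Assumed import path (inferred from the return-type path on this page;
# verify against your installed package).
ASSUMED_MODULE = "megatron.bridge.perf_recipes.deepseek.h100.deepseek_v3"


def load_recipe(func_name: str, module_path: str = ASSUMED_MODULE) -> Optional[Any]:
    """Call the zero-argument recipe function `func_name` from `module_path`.

    Returns the object the recipe produces, or None if the module cannot be
    imported or does not define the requested function.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return None
    factory = getattr(module, func_name, None)
    if factory is None:
        return None
    return factory()


if __name__ == "__main__":
    cfg = load_recipe("deepseek_v3_pretrain_64gpu_h100_bf16_config")
    print("recipe loaded" if cfg is not None else "recipe module not available")
```

Looking the function up by string keeps the sketch usable in launch scripts that select a recipe from a command-line argument; when the name is fixed, a plain `from ... import ...` is simpler.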