bridge.perf_recipes.deepseek.common

Common helpers for DeepSeek performance recipes.

Module Contents

Functions

_deepseek_v3_common

Apply DeepSeek V3 perf defaults shared by the legacy workload configs.

_enable_deepseek_full_iteration_mxfp8

Apply legacy DeepSeek V3 HybridEP full-iteration MXFP8 settings.

_enable_deepseek_transformer_engine_graph

Apply legacy DeepSeek V3 Transformer Engine graph capture settings.

_apply_deepseek_v3_64gpu_gb300_fsdp_configs

Apply shared DeepSeek V3 64-GPU GB300 Megatron FSDP settings.

API

bridge.perf_recipes.deepseek.common._deepseek_v3_common(
cfg: megatron.bridge.training.config.ConfigContainer,
) -> None

Apply DeepSeek V3 perf defaults shared by the legacy workload configs.

bridge.perf_recipes.deepseek.common._enable_deepseek_full_iteration_mxfp8(
cfg: megatron.bridge.training.config.ConfigContainer,
*,
fp8_dot_product_attention: bool = False,
fp8_output_proj: bool = False,
) -> None

Apply legacy DeepSeek V3 HybridEP full-iteration MXFP8 settings.
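
The bare `*` in the signature above makes `fp8_dot_product_attention` and `fp8_output_proj` keyword-only. The sketch below illustrates that calling convention with a hypothetical stand-in function and a `SimpleNamespace` in place of a real `ConfigContainer`. The stand-in's name and the `demo_*` attributes are assumptions made for illustration; they do not reflect which config fields the real helper sets.

```python
from types import SimpleNamespace


def enable_full_iteration_mxfp8_standin(
    cfg,
    *,
    fp8_dot_product_attention: bool = False,
    fp8_output_proj: bool = False,
) -> None:
    """Hypothetical stand-in that mirrors only the signature shape."""
    # The attribute names below are invented for this sketch.
    cfg.demo_fp8_dot_product_attention = fp8_dot_product_attention
    cfg.demo_fp8_output_proj = fp8_output_proj


cfg = SimpleNamespace()

# Flags must be passed by name; the helper mutates cfg and returns None.
result = enable_full_iteration_mxfp8_standin(cfg, fp8_output_proj=True)

# Passing a flag positionally is rejected because of the bare `*`.
try:
    enable_full_iteration_mxfp8_standin(cfg, True)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Both flags default to `False`, so a call that passes only `cfg` leaves them off.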

bridge.perf_recipes.deepseek.common._enable_deepseek_transformer_engine_graph(
cfg: megatron.bridge.training.config.ConfigContainer,
) -> None

Apply legacy DeepSeek V3 Transformer Engine graph capture settings.

bridge.perf_recipes.deepseek.common._apply_deepseek_v3_64gpu_gb300_fsdp_configs(
cfg: megatron.bridge.training.config.ConfigContainer,
) -> None

Apply shared DeepSeek V3 64-GPU GB300 Megatron FSDP settings.
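
Every helper in this module takes a config object and returns `None`, which suggests the settings are applied by mutating `cfg` in place. The sketch below shows how helpers of that shape could be chained over a single config. It uses hypothetical stand-ins throughout: `apply_common_standin`, `enable_graph_standin`, `apply_steps`, and the `applied` list are not part of this module, and the order in which real recipes call these helpers is not documented here.

```python
from types import SimpleNamespace
from typing import Callable, Iterable


def apply_common_standin(cfg) -> None:
    """Hypothetical stand-in for a shared-defaults helper."""
    cfg.applied.append("common")


def enable_graph_standin(cfg) -> None:
    """Hypothetical stand-in for a graph-capture helper."""
    cfg.applied.append("te_graph")


def apply_steps(cfg, steps: Iterable[Callable[[object], None]]) -> None:
    """Apply each in-place helper to the same config, in the given order."""
    for step in steps:
        step(cfg)


# A plain namespace stands in for a real ConfigContainer.
cfg = SimpleNamespace(applied=[])
apply_steps(cfg, [apply_common_standin, enable_graph_standin])
```

Because each step mutates the same object, a later helper can overwrite a value set by an earlier one, so the call order matters whenever two helpers touch the same field.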