bridge.recipes.deepseek.deepseek_v2

Module Contents

Functions

deepseek_v2_lite_pretrain_config

Return a pre-training config for DeepSeek-V2-Lite.

deepseek_v2_pretrain_config

Return a pre-training config for DeepSeek-V2.

API

bridge.recipes.deepseek.deepseek_v2.deepseek_v2_lite_pretrain_config() → megatron.bridge.training.config.ConfigContainer

Return a pre-training config for DeepSeek-V2-Lite.

Recommended parallelism: TP=1, PP=1, EP=8.

bridge.recipes.deepseek.deepseek_v2.deepseek_v2_pretrain_config() → megatron.bridge.training.config.ConfigContainer

Return a pre-training config for DeepSeek-V2.

Recommended parallelism: TP=1, PP=4, EP=32.
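
The recommended parallelism settings for both recipes imply a minimum GPU count. The sketch below works that arithmetic out. It is an illustration only, not part of this module: the `Parallelism` class and its methods are hypothetical. It assumes that expert-parallel (EP) groups are carved out of the data-parallel ranks and that context parallelism is 1, so the smallest valid layout uses TP × PP × EP devices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parallelism:
    """Hypothetical helper holding the three sizes named in the docstrings."""

    tp: int  # tensor-parallel size
    pp: int  # pipeline-parallel size
    ep: int  # expert-parallel size

    def min_world_size(self) -> int:
        # Assumption: EP groups are formed from data-parallel ranks,
        # so the smallest layout has a data-parallel size equal to EP.
        return self.tp * self.pp * self.ep

    def fits(self, world_size: int) -> bool:
        # Assumption: a world size is usable when it is a whole
        # multiple of the smallest layout.
        return world_size > 0 and world_size % self.min_world_size() == 0


# Values copied from the two recipe docstrings.
V2_LITE = Parallelism(tp=1, pp=1, ep=8)
V2 = Parallelism(tp=1, pp=4, ep=32)

print(V2_LITE.min_world_size())  # 8
print(V2.min_world_size())       # 128
```

Under these assumptions, the DeepSeek-V2-Lite recipe fits on a single 8-GPU node, while the DeepSeek-V2 recipe needs at least 128 GPUs.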