bridge.recipes.deepseek.deepseek_v2#
Module Contents#
Functions#
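| Function | Description |
|---|---|
| deepseek_v2_lite_pretrain_config | Return a pre-training config for DeepSeek-V2-Lite. |
| deepseek_v2_pretrain_config | Return a pre-training config for DeepSeek-V2. |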
API#
- bridge.recipes.deepseek.deepseek_v2.deepseek_v2_lite_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for DeepSeek-V2-Lite.
Recommended parallelism: TP=1, PP=1, EP=8.
- bridge.recipes.deepseek.deepseek_v2.deepseek_v2_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for DeepSeek-V2.
Recommended parallelism: TP=1, PP=4, EP=32.
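
The recommended layouts imply a minimum GPU count for each recipe. The sketch below is a rough, standalone estimate and does not call the library. It assumes TP, PP, and EP mean tensor, pipeline, and expert parallelism. It also assumes that expert parallelism is carved out of the data-parallel dimension, with context-parallel and expert-tensor-parallel sizes of 1. The `Parallelism` class and `min_world_size` method are hypothetical helpers written for this illustration, not part of the documented API.

```python
from dataclasses import dataclass
from math import lcm


@dataclass(frozen=True)
class Parallelism:
    """Hypothetical container for the sizes quoted in the docstrings."""

    tp: int = 1  # tensor-parallel size
    pp: int = 1  # pipeline-parallel size
    ep: int = 1  # expert-parallel size

    def min_world_size(self) -> int:
        # Assumption: within each pipeline stage, the GPU count must be a
        # multiple of both tp (dense layers) and ep (expert layers), with
        # context-parallel and expert-tensor-parallel sizes fixed at 1.
        return self.pp * lcm(self.tp, self.ep)


# Values copied from the "Recommended parallelism" lines above.
RECOMMENDED = {
    "deepseek_v2_lite_pretrain_config": Parallelism(tp=1, pp=1, ep=8),
    "deepseek_v2_pretrain_config": Parallelism(tp=1, pp=4, ep=32),
}

for name, layout in RECOMMENDED.items():
    print(f"{name}: at least {layout.min_world_size()} GPUs")
```

Under these assumptions the DeepSeek-V2-Lite layout fits on 8 GPUs, while the DeepSeek-V2 layout needs at least 128. Larger jobs would add data-parallel replicas on top of that minimum.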