bridge.recipes.nemotronh.nemotron_3_super#
Module Contents#
Functions#
| Function | Summary |
| --- | --- |
| `nemotron_3_super_pretrain_config` | Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE). |
| `nemotron_3_super_sft_config` | Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE). |
| `nemotron_3_super_peft_config` | Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE). |
Data#
API#
- bridge.recipes.nemotronh.nemotron_3_super.NEMOTRON_3_SUPER_HF_MODEL_ID#
"nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16"
- bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE).
This is a Latent MoE model with Multi-Token Prediction (MTP). Default parallelism:
TP=4, PP=1, EP=8, SP=True
- Returns:
Pre-training configuration for Nemotron 3 Super.
- Return type:
megatron.bridge.training.config.ConfigContainer
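The pretraining defaults above (TP=4, PP=1, EP=8, SP=True) determine how a GPU cluster is partitioned. The sketch below illustrates the arithmetic with a hypothetical stand-in dataclass; field names mirror common Megatron-style conventions but are assumptions, not the real `ConfigContainer` layout.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the parallelism settings of the pretraining
# recipe; the real megatron.bridge ConfigContainer is richer than this.
@dataclass
class ParallelismSketch:
    tensor_model_parallel_size: int = 4    # TP=4 (pretraining default)
    pipeline_model_parallel_size: int = 1  # PP=1
    expert_model_parallel_size: int = 8    # EP=8
    sequence_parallel: bool = True         # SP=True

    def model_parallel_gpus(self) -> int:
        # GPUs occupied by one model replica: TP x PP.
        return (self.tensor_model_parallel_size
                * self.pipeline_model_parallel_size)

    def data_parallel_size(self, world_size: int) -> int:
        # The remaining GPUs form the data-parallel dimension; expert
        # parallelism (EP) shards the MoE experts across ranks within it,
        # so the data-parallel size must be divisible by EP.
        assert world_size % self.model_parallel_gpus() == 0
        dp = world_size // self.model_parallel_gpus()
        assert dp % self.expert_model_parallel_size == 0
        return dp

p = ParallelismSketch()
print(p.model_parallel_gpus())    # 4 GPUs per model replica
print(p.data_parallel_size(256))  # 64 data-parallel replicas on 256 GPUs
```

With these defaults, any valid world size must be a multiple of TP × PP × EP = 32 GPUs.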
- bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE).
Default parallelism: TP=1, PP=1, EP=8, SP=True
- Returns:
ConfigContainer with all settings pre-configured for Nemotron 3 Super SFT.
- bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora') → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE).
Default parallelism: TP=1, PP=1, EP=1, SP=True
- Parameters:
peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.
- Returns:
ConfigContainer with all settings pre-configured for Nemotron 3 Super PEFT.
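Since `peft_scheme` accepts either a string name or a `PEFT` instance, the recipe must normalize the argument before use. The sketch below shows one plausible dispatch pattern under stated assumptions: the `PEFT`, `LoRA`, and `DoRA` classes here are minimal stand-ins, not the real `megatron.bridge.peft` implementations.

```python
# Hypothetical normalization of a `peft_scheme: str | PEFT` argument.
class PEFT:          # stand-in for megatron.bridge.peft.base.PEFT
    pass

class LoRA(PEFT):    # stand-in LoRA adapter config
    pass

class DoRA(PEFT):    # stand-in DoRA adapter config
    pass

_SCHEMES = {"lora": LoRA, "dora": DoRA}

def resolve_peft_scheme(peft_scheme) -> PEFT:
    # A custom PEFT instance passes through unchanged.
    if isinstance(peft_scheme, PEFT):
        return peft_scheme
    # A string name is looked up case-insensitively.
    try:
        return _SCHEMES[peft_scheme.lower()]()
    except KeyError:
        raise ValueError(f"Unknown PEFT scheme: {peft_scheme!r}") from None

print(type(resolve_peft_scheme("lora")).__name__)  # LoRA
```

Accepting a union of string shorthand and a full instance keeps the common case (`peft_scheme='lora'`) terse while still allowing callers to pass a fully customized adapter configuration.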
- bridge.recipes.nemotronh.nemotron_3_super.__all__#
["nemotron_3_super_pretrain_config", "nemotron_3_super_sft_config", "nemotron_3_super_peft_config"]