bridge.recipes.nemotronh.nemotron_3_super#

Module Contents#

Functions#

nemotron_3_super_pretrain_config

Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE).

nemotron_3_super_sft_config

Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE).

nemotron_3_super_peft_config

Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Data#

API#

bridge.recipes.nemotronh.nemotron_3_super.NEMOTRON_3_SUPER_HF_MODEL_ID#

'nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16'

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE).

This is a Latent MoE model with Multi-Token Prediction (MTP). Default parallelism:

  • TP=4, PP=1, EP=8, SP=True

Returns:

Pre-training configuration for Nemotron 3 Super.

Return type:

ConfigContainer
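A typical workflow calls the recipe function and overrides a few fields on the returned `ConfigContainer` before launching training. The sketch below illustrates that pattern with plain dataclass stand-ins, since the real container's field names are not shown on this page; `tensor_model_parallel_size`, `expert_model_parallel_size`, and the nested `model` attribute are assumptions, not the actual `megatron.bridge` API.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the real megatron.bridge config objects;
# field names are illustrative only.
@dataclass
class ModelParallelCfg:
    tensor_model_parallel_size: int = 4    # TP=4 (recipe default)
    pipeline_model_parallel_size: int = 1  # PP=1
    expert_model_parallel_size: int = 8    # EP=8
    sequence_parallel: bool = True         # SP=True

@dataclass
class ConfigContainer:
    model: ModelParallelCfg = field(default_factory=ModelParallelCfg)

def nemotron_3_super_pretrain_config() -> ConfigContainer:
    # Stand-in for the recipe function documented above.
    return ConfigContainer()

cfg = nemotron_3_super_pretrain_config()
# Override a default before passing the config to the trainer.
cfg.model.expert_model_parallel_size = 4
```

The point of the pattern is that the recipe returns a fully populated config, so callers mutate only the fields they need to change rather than building the configuration from scratch.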

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Default parallelism: TP=1, PP=1, EP=8, SP=True

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Super SFT.

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Default parallelism: TP=1, PP=1, EP=1, SP=True

Parameters:

peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Super PEFT.
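The `peft_scheme` parameter accepts either a built-in scheme name or a custom `PEFT` instance. A minimal sketch of that dispatch, assuming a stand-in base class in place of `megatron.bridge.peft.base.PEFT` (the real class and any validation details may differ):

```python
class PEFT:
    """Hypothetical stand-in for megatron.bridge.peft.base.PEFT."""

def resolve_peft_scheme(peft_scheme):
    # A custom PEFT instance is used as-is.
    if isinstance(peft_scheme, PEFT):
        return peft_scheme
    # Built-in schemes are selected by name.
    if peft_scheme in ("lora", "dora"):
        return peft_scheme
    raise ValueError(f"Unknown PEFT scheme: {peft_scheme!r}")
```

For example, `resolve_peft_scheme("lora")` selects the built-in LoRA scheme, while passing an instance of a `PEFT` subclass lets callers plug in their own adapter method.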

bridge.recipes.nemotronh.nemotron_3_super.__all__#

['nemotron_3_super_pretrain_config', 'nemotron_3_super_sft_config', 'nemotron_3_super_peft_config']