bridge.recipes.nemotronh.nemotron_3_super#

Module Contents#

Functions#

nemotron_3_super_pretrain_config

Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE).

nemotron_3_super_sft_config

Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE).

nemotron_3_super_peft_config

Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Data#

API#

bridge.recipes.nemotronh.nemotron_3_super.NEMOTRON_3_SUPER_HF_MODEL_ID#

'nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16'

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Nemotron 3 Super (120B-A12B LatentMoE).

This is a Latent MoE model with Multi-Token Prediction (MTP). Default parallelism:

  • TP=4, PP=1, EP=8, SP=True

Returns:

Pre-training configuration for Nemotron 3 Super.

Return type:

ConfigContainer
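A typical workflow calls the recipe function and overrides a few fields on the returned `ConfigContainer` before launching training. The sketch below illustrates that pattern with plain dataclass stand-ins, since the real container's field names are not shown on this page; `tensor_model_parallel_size`, `expert_model_parallel_size`, and the nested `model` attribute are assumptions, not the actual `megatron.bridge` API.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the real megatron.bridge config objects;
# field names are illustrative only.
@dataclass
class ModelParallelCfg:
    tensor_model_parallel_size: int = 4    # TP=4 (recipe default)
    pipeline_model_parallel_size: int = 1  # PP=1
    expert_model_parallel_size: int = 8    # EP=8
    sequence_parallel: bool = True         # SP=True

@dataclass
class ConfigContainer:
    model: ModelParallelCfg = field(default_factory=ModelParallelCfg)

def nemotron_3_super_pretrain_config() -> ConfigContainer:
    # Stand-in for the recipe function documented above.
    return ConfigContainer()

cfg = nemotron_3_super_pretrain_config()
# Override a default before passing the config to the trainer.
cfg.model.expert_model_parallel_size = 4
```

The point of the pattern is that the recipe returns a fully populated config, so callers mutate only the fields they need to change rather than building the configuration from scratch.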

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Default parallelism: TP=1, PP=1, EP=8, SP=True

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Super SFT.

bridge.recipes.nemotronh.nemotron_3_super.nemotron_3_super_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Nemotron 3 Super (120B-A12B LatentMoE).

Default parallelism: TP=1, PP=1, EP=1, SP=True

Parameters:

peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Super PEFT.
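The `peft_scheme` parameter accepts either a built-in scheme name or a custom `PEFT` instance. A minimal sketch of that dispatch, assuming a stand-in base class in place of `megatron.bridge.peft.base.PEFT` (the real class and any validation details may differ):

```python
class PEFT:
    """Hypothetical stand-in for megatron.bridge.peft.base.PEFT."""

def resolve_peft_scheme(peft_scheme):
    # A custom PEFT instance is used as-is.
    if isinstance(peft_scheme, PEFT):
        return peft_scheme
    # Built-in schemes are selected by name.
    if peft_scheme in ("lora", "dora"):
        return peft_scheme
    raise ValueError(f"Unknown PEFT scheme: {peft_scheme!r}")
```

For example, `resolve_peft_scheme("lora")` selects the built-in LoRA scheme, while passing an instance of a `PEFT` subclass lets callers plug in their own adapter method.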

bridge.recipes.nemotronh.nemotron_3_super.__all__#

['nemotron_3_super_pretrain_config', 'nemotron_3_super_sft_config', 'nemotron_3_super_peft_config']