bridge.recipes.nemotronh.nemotron_3_nano#

Module Contents#

Functions#

nemotron_3_nano_pretrain_config

Return a pre-training config for Nemotron 3 Nano (30B-A3B MoE).

nemotron_3_nano_sft_config

Return a full SFT config for Nemotron 3 Nano (30B-A3B MoE).

nemotron_3_nano_peft_config

Return a PEFT config for Nemotron 3 Nano (30B-A3B MoE).

Data#

API#

bridge.recipes.nemotronh.nemotron_3_nano.nemotron_3_nano_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Nemotron 3 Nano (30B-A3B MoE).

This is a MoE (Mixture of Experts) model with the following default parallelism:

  • TP=4, PP=1, EP=8, SP=True

  • DeepEP enabled for MoE token dispatch

Returns:

Pre-training configuration for Nemotron 3 Nano.

Return type:

ConfigContainer
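The defaults above follow a common recipe pattern: the function returns a fully populated config whose fields can be overridden before launching training. A minimal sketch of that pattern, using an illustrative stand-in dataclass (the field names below are assumptions based on Megatron conventions, not the real ConfigContainer API):

```python
from dataclasses import dataclass


# Illustrative stand-in for the parallelism portion of ConfigContainer;
# the actual field names in megatron.bridge may differ.
@dataclass
class ParallelismSketch:
    tensor_model_parallel_size: int = 4    # TP=4
    pipeline_model_parallel_size: int = 1  # PP=1
    expert_model_parallel_size: int = 8    # EP=8
    sequence_parallel: bool = True         # SP=True


def pretrain_parallelism_sketch() -> ParallelismSketch:
    """Mirror the documented Nemotron 3 Nano pretrain defaults."""
    return ParallelismSketch()


cfg = pretrain_parallelism_sketch()
# Override individual fields before training if your cluster layout
# differs from the recipe defaults.
cfg.tensor_model_parallel_size = 2
```

The same override-after-construction pattern applies to the SFT and PEFT recipes below, which default to a lighter TP=1 layout.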

bridge.recipes.nemotronh.nemotron_3_nano.nemotron_3_nano_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Nemotron 3 Nano (30B-A3B MoE).

Default parallelism: TP=1, PP=1, EP=8, SP=False

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Nano SFT.

bridge.recipes.nemotronh.nemotron_3_nano.nemotron_3_nano_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Nemotron 3 Nano (30B-A3B MoE).

Default parallelism: TP=1, PP=1, EP=8, SP=False

Parameters:

peft_scheme – PEFT scheme: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Nemotron 3 Nano PEFT.
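The peft_scheme parameter accepts either a scheme name or a custom PEFT instance. A minimal sketch of that dispatch, with a hypothetical helper that is not part of megatron.bridge:

```python
def normalize_peft_scheme(peft_scheme="lora"):
    """Validate a PEFT scheme the way the recipe's signature suggests.

    Accepts the string names "lora" or "dora", or passes through any
    non-string value on the assumption it is a custom PEFT instance.
    Hypothetical helper -- not the library's actual implementation.
    """
    if isinstance(peft_scheme, str):
        if peft_scheme not in ("lora", "dora"):
            raise ValueError(f"unknown PEFT scheme: {peft_scheme!r}")
        return peft_scheme
    return peft_scheme  # assumed to be a custom PEFT instance
```

A custom PEFT instance would bypass the string check entirely, letting callers plug in their own adapter configuration.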

bridge.recipes.nemotronh.nemotron_3_nano.__all__#

['nemotron_3_nano_pretrain_config', 'nemotron_3_nano_sft_config', 'nemotron_3_nano_peft_config']