bridge.recipes.nemotronh.nemotron_nano_v2#
Module Contents#
Classes#

| Class | Description |
|---|---|
| `NemotronNanoV2CommonKwargs` | Typed options accepted by Nemotron Nano v2 recipe helper functions. |
| `NemotronNanoV2FinetuneKwargs` | Typed options accepted by Nemotron Nano v2 finetuning recipe helper functions. |
Functions#

| Function | Description |
|---|---|
| `nemotron_nano_9b_v2_pretrain_config` | Return a pre-training config for Nemotron Nano 9B v2. |
| `nemotron_nano_12b_v2_pretrain_config` | Return a pre-training config for Nemotron Nano 12B v2. |
| `nemotron_nano_9b_v2_finetune_config` | Return a finetuning config for Nemotron Nano 9B v2. |
| `nemotron_nano_12b_v2_finetune_config` | Return a finetuning config for Nemotron Nano 12B v2. |
Data#

`__all__`
API#
- class bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2CommonKwargs#
Bases: `typing_extensions.TypedDict`

Typed options accepted by Nemotron Nano v2 recipe helper functions.
- model_provider: megatron.bridge.models.nemotronh.NemotronNanoModelProvider9Bv2 | megatron.bridge.models.nemotronh.NemotronNanoModelProvider12Bv2#
- tokenizer_model: str | None#
- dir: str | None#
- name: str#
- data_paths: list[str] | None#
- data_args_path: str | None#
- train_data_path: list[str] | None#
- valid_data_path: list[str] | None#
- test_data_path: list[str] | None#
- per_split_data_args_path: str | None#
- mock: bool#
- tensor_model_parallel_size: int#
- pipeline_model_parallel_size: int#
- pipeline_dtype: torch.dtype | None#
- virtual_pipeline_model_parallel_size: int | None#
- context_parallel_size: int#
- sequence_parallel: bool#
- train_iters: int#
- global_batch_size: int#
- micro_batch_size: int#
- seq_length: int#
- lr: float#
- min_lr: float#
- lr_warmup_iters: int#
- lr_decay_iters: int | None#
- use_null_tokenizer: bool#
- precision_config: megatron.bridge.training.mixed_precision.MixedPrecisionConfig | str | None#
- comm_overlap_config: megatron.bridge.training.comm_overlap.CommOverlapConfig | None#
- enable_default_comm_overlap: bool#
- class bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2FinetuneKwargs#
Bases: `bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2CommonKwargs`

Typed options accepted by Nemotron Nano v2 finetuning recipe helper functions.
- pretrained_checkpoint: str | None#
- peft: str | megatron.bridge.peft.base.PEFT | None#
- packed_sequence: bool#
- finetune_lr: float#
- wandb_project: str | None#
- wandb_entity: str | None#
- wandb_exp_name: str | None#
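A sketch of a LoRA finetuning call using these keys. The import path, the checkpoint path, and the `"lora"` string value for `peft` are assumptions for illustration (the annotation also admits a `PEFT` instance, or `None` for full SFT):

```python
from megatron.bridge.recipes.nemotronh.nemotron_nano_v2 import (
    nemotron_nano_9b_v2_finetune_config,
)

config = nemotron_nano_9b_v2_finetune_config(
    pretrained_checkpoint="/ckpts/nemotron_nano_9b_v2",  # hypothetical path
    peft="lora",                  # str | PEFT instance | None (full SFT)
    packed_sequence=True,         # pack short samples into full sequences
    finetune_lr=1e-4,             # matches the documented LoRA default
    wandb_project="nano-v2-sft",  # optional Weights & Biases tracking
)
```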
- bridge.recipes.nemotronh.nemotron_nano_v2.nemotron_nano_9b_v2_pretrain_config(**user_kwargs: typing_extensions.Unpack[bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2CommonKwargs])#
Return a pre-training config for Nemotron Nano 9B v2.

This recipe is designed for single-node training. Default parallelism: TP=2, PP=1, SP=True.

See `_nemotronh_common` for the full list of parameters.
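A quick smoke-test sketch: `mock=True` and `use_null_tokenizer=True` (both keys of `NemotronNanoV2CommonKwargs`) avoid any dependency on corpus or tokenizer files. The import path and the interplay with the remaining defaults are assumptions:

```python
from megatron.bridge.recipes.nemotronh.nemotron_nano_v2 import (
    nemotron_nano_9b_v2_pretrain_config,
)

# Synthetic data plus a null tokenizer keep the run self-contained.
config = nemotron_nano_9b_v2_pretrain_config(
    mock=True,
    use_null_tokenizer=True,
    train_iters=10,
    global_batch_size=8,
    micro_batch_size=1,
)
```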
- bridge.recipes.nemotronh.nemotron_nano_v2.nemotron_nano_12b_v2_pretrain_config(**user_kwargs: typing_extensions.Unpack[bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2CommonKwargs])#
Return a pre-training config for Nemotron Nano 12B v2.

This recipe is designed for single-node training. Default parallelism: TP=4, PP=1, SP=True.

Note: Uses FP8 precision by default. Communication overlap is disabled by default.

See `_nemotronh_common` for the full list of parameters.
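Because FP8 is the default here, a common override is a bf16 mixed-precision recipe plus the built-in communication-overlap settings. The preset name `"bf16_mixed"` is an assumption; `precision_config` also accepts a `MixedPrecisionConfig` instance per the annotation above:

```python
from megatron.bridge.recipes.nemotronh.nemotron_nano_v2 import (
    nemotron_nano_12b_v2_pretrain_config,
)

config = nemotron_nano_12b_v2_pretrain_config(
    precision_config="bf16_mixed",     # assumed preset name; FP8 is the default
    enable_default_comm_overlap=True,  # overlap is off by default for this recipe
)
```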
- bridge.recipes.nemotronh.nemotron_nano_v2.nemotron_nano_9b_v2_finetune_config(**user_kwargs: typing_extensions.Unpack[bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2FinetuneKwargs])#
Return a finetuning config for Nemotron Nano 9B v2.

Default configuration: 8 nodes, 64 GPUs.
- LoRA/DoRA: TP=2, PP=1, LR=1e-4
- Full SFT: TP=2, PP=1, LR=5e-6
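The learning-rate defaults above suggest the following pattern for switching between PEFT and full SFT; the import path and the `"lora"` string form of `peft` are assumptions:

```python
from megatron.bridge.recipes.nemotronh.nemotron_nano_v2 import (
    nemotron_nano_9b_v2_finetune_config,
)

# LoRA adapter training at the documented PEFT learning rate.
lora_config = nemotron_nano_9b_v2_finetune_config(peft="lora", finetune_lr=1e-4)

# Full-parameter SFT: disable PEFT and lower the LR accordingly.
sft_config = nemotron_nano_9b_v2_finetune_config(peft=None, finetune_lr=5e-6)
```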
- bridge.recipes.nemotronh.nemotron_nano_v2.nemotron_nano_12b_v2_finetune_config(**user_kwargs: typing_extensions.Unpack[bridge.recipes.nemotronh.nemotron_nano_v2.NemotronNanoV2FinetuneKwargs])#
Return a finetuning config for Nemotron Nano 12B v2.

Default configuration: 8 nodes, 64 GPUs.
- LoRA/DoRA: TP=4, PP=1, LR=1e-4
- Full SFT: TP=4, PP=1, LR=5e-6

Note: Uses FP8 precision by default. Communication overlap is disabled by default.
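A sketch wiring up the optional Weights & Biases keys for this recipe; the checkpoint path, project names, and the `"dora"` string form of `peft` are illustrative assumptions:

```python
from megatron.bridge.recipes.nemotronh.nemotron_nano_v2 import (
    nemotron_nano_12b_v2_finetune_config,
)

config = nemotron_nano_12b_v2_finetune_config(
    pretrained_checkpoint="/ckpts/nemotron_nano_12b_v2",  # hypothetical path
    peft="dora",                      # assumed string form, mirroring "lora"
    wandb_project="nano-12b-v2-sft",  # optional experiment tracking
    wandb_entity="my-team",
    wandb_exp_name="dora-tp4",
)
```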
- bridge.recipes.nemotronh.nemotron_nano_v2.__all__#

['nemotron_nano_9b_v2_pretrain_config', 'nemotron_nano_12b_v2_pretrain_config', 'nemotron_nano_9b_v2_finetune_config', 'nemotron_nano_12b_v2_finetune_config']