Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Nemotron
Nemotron is a Large Language Model (LLM) that can be integrated into a synthetic data generation pipeline to produce training data, assisting researchers and developers in building their own LLMs.
NeMo 2.0 Pretraining Recipes
We provide recipes for pretraining Nemotron models at the following sizes: 4B, 8B, 15B, 22B, and 340B, using NeMo 2.0 and NeMo-Run.
Each recipe configures a run.Partial for one of the nemo.collections.llm API functions introduced in NeMo 2.0.
The recipes are hosted in the nemotron3_4b, nemotron3_8b, nemotron4_15b, nemotron4_22b, and nemotron4_340b files.
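As a rough mental model, run.Partial resembles Python's functools.partial: it records a function together with its arguments so that they can be inspected or changed before anything runs. The sketch below uses only the standard library. The pretrain function and its arguments are hypothetical stand-ins, not the NeMo API, and run.Partial itself supports more than this analogy shows (for example, attribute-style overrides such as pretrain.data = ...).

```python
from functools import partial


def pretrain(model: str, data: str, num_nodes: int = 1) -> str:
    """Hypothetical stand-in for a training entry point (not the NeMo API)."""
    return f"training {model} on {data} using {num_nodes} node(s)"


# A "recipe" captures the function and its default arguments without calling it.
recipe = partial(pretrain, model="nemotron3_8b", data="MockDataModule", num_nodes=2)

# Captured arguments can be inspected before execution.
print(recipe.keywords["data"])

# Override one argument by layering a new partial on top of the recipe.
custom_recipe = partial(recipe, data="MyCustomDataModule")

# Nothing has run so far; calling the partial executes the function.
result = custom_recipe()
print(result)
```

The point of the pattern is that configuration and execution are separate steps: a recipe can be built, adjusted, and only then handed to whatever runs it.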
Note
The pretraining recipes use the MockDataModule for the data argument. You are expected to replace the MockDataModule with your own custom dataset.
We provide an example below on how to invoke the default recipe and override the data argument:
from nemo.collections import llm

pretrain = llm.nemotron3_8b.pretrain_recipe(
    name="nemotron3_8b_pretraining",
    ckpt_dir="/path/to/checkpoints",
    num_nodes=2,
    num_gpus_per_node=8,
)

# To override the data argument, uncomment the lines below.
# a_function_that_configures_your_custom_dataset is a placeholder for your own code.
# dataloader = a_function_that_configures_your_custom_dataset(
#     global_batch_size=global_batch_size,
#     micro_batch_size=micro_batch_size,
#     seq_length=pretrain.model.config.seq_length,
# )
# pretrain.data = dataloader
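When substituting your own dataset, the global_batch_size and micro_batch_size passed to it need to be consistent with the number of GPUs. In Megatron-style training, the global batch size is generally expected to be a multiple of the micro batch size times the data-parallel size. The helper below is a standard-library sketch of that check. The function name, the model_parallel_size parameter, and the example batch sizes are assumptions for illustration, not values taken from the recipe.

```python
def num_microbatches(
    global_batch_size: int,
    micro_batch_size: int,
    num_nodes: int,
    num_gpus_per_node: int,
    model_parallel_size: int = 1,
) -> int:
    """Return how many micro batches each data-parallel rank processes per step.

    model_parallel_size is the number of GPUs that jointly hold one model
    replica (an assumption here; check the parallelism settings of your recipe).
    """
    world_size = num_nodes * num_gpus_per_node
    if world_size % model_parallel_size != 0:
        raise ValueError("world size must be divisible by model_parallel_size")
    data_parallel_size = world_size // model_parallel_size

    samples_per_pass = micro_batch_size * data_parallel_size
    if global_batch_size % samples_per_pass != 0:
        raise ValueError(
            f"global_batch_size={global_batch_size} is not a multiple of "
            f"micro_batch_size * data_parallel_size = {samples_per_pass}"
        )
    return global_batch_size // samples_per_pass


# Node and GPU counts match the recipe call above; batch sizes are made up.
steps = num_microbatches(
    global_batch_size=512,
    micro_batch_size=1,
    num_nodes=2,
    num_gpus_per_node=8,
    model_parallel_size=2,
)
print(steps)  # 64
```

Running a check like this before launching can surface an inconsistent batch configuration early, instead of at job startup.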
Note
The configuration in the recipes is done using the NeMo-Run run.Config and run.Partial configuration objects.
Please review the NeMo-Run documentation to learn more about its configuration and execution system.
Once you have your final configuration ready, you can execute it on any of the NeMo-Run supported executors. The simplest is the local executor, which just runs the pretraining locally in a separate process. You can use it as follows:
import nemo_run as run
run.run(pretrain, executor=run.LocalExecutor())
Alternatively, you can run it directly in the same Python process:
run.run(pretrain, direct=True)
A comprehensive list of pretraining recipes that we currently support or plan to support soon is provided below for reference:
Recipe | Status |
---|---|
Nemotron3 4B | Yes |
Nemotron3 4B FP8 | N/A |
Nemotron3 8B | Yes |
Nemotron3 8B FP8 | N/A |
Nemotron4 15B | Yes |
Nemotron4 15B 16k | Yes |
Nemotron4 15B 64k | Yes |
Nemotron4 15B FP8 | N/A |
Nemotron4 22B | Yes |
Nemotron4 22B 16k | Yes |
Nemotron4 22B 64k | Yes |
Nemotron4 22B FP8 | N/A |
Nemotron4 340B | Yes |
Nemotron4 340B FP8 | N/A |