Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.

Large Language Models

NeMo 2.0 provides everything needed to train Large Language Models, from setting up the compute cluster and downloading data to selecting model hyperparameters. It uses NeMo-Run to make it easy to scale training to thousands of GPUs; a minimal launch sketch appears at the end of this section. The following LLMs are currently supported in NeMo 2.0:

Default configurations are provided for each model and are outlined in the model-specific documentation linked above. Every configuration can be modified to train on new datasets or to test new model hyperparameters.
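For instance, a recipe's default values can be overridden in place before launch. The sketch below is a minimal illustration, assuming the Llama 3 8B recipe module from nemo.collections.llm; the directory, experiment name, and override values are placeholders rather than recommended settings.

```python
import nemo_run as run
from nemo.collections import llm

# Build a pretraining recipe from the model's default configuration.
# The checkpoint directory, run name, and node counts are placeholders.
recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3_8b",
    name="llama3_8b_pretraining",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override defaults before launching, e.g. to shorten a trial run.
recipe.trainer.max_steps = 100
recipe.trainer.val_check_interval = 50
```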
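Launching the (possibly modified) recipe then goes through a NeMo-Run executor: a local executor for a single machine, or a cluster executor to scale the same recipe out to many nodes. A minimal local launch, assuming the recipe object from the previous sketch:

```python
# Run the recipe on the local machine via torchrun; swapping in a
# cluster executor (e.g. run.SlurmExecutor) scales the same recipe
# to multiple nodes without changing the recipe itself.
executor = run.LocalExecutor(ntasks_per_node=8, launcher="torchrun")
run.run(recipe, executor=executor, name="llama3_8b_pretraining")
```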