Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

Large Language Models

NeMo Framework provides everything needed to train Large Language Models, from setting up the compute cluster and downloading data to selecting model hyperparameters. NeMo 2.0 uses NeMo-Run to make it easy to scale LLMs to thousands of GPUs.
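For example, a pretraining run can be defined with a built-in recipe and launched through NeMo-Run. The following is a minimal sketch, assuming the `llm.llama3_8b` recipe module and a local executor; recipe names and arguments may differ slightly between releases, so consult the quickstart for your installed version.

```python
import nemo_run as run
from nemo.collections import llm

# Build a pretraining recipe from the built-in defaults.
# The llama3_8b module name is one example; pick the model you want to train.
recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3",   # where checkpoints are written
    name="llama3_pretraining",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Launch locally; swap in a cluster executor (e.g. run.SlurmExecutor)
# to scale the same recipe out to many nodes.
run.run(recipe, executor=run.LocalExecutor())
```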

The following LLMs are currently supported in NeMo 2.0:

Default configurations are provided for each model and are outlined in the model-specific documentation linked above. Every configuration can be modified to train on new datasets or to test new model hyperparameters.
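Concretely, a recipe's fields can be overridden in Python before launch. This sketch builds on the recipe from the previous example; the exact field paths (`trainer.max_steps`, `optim.config.lr`, `data.global_batch_size`) are assumptions that may vary by release, so inspect the recipe object in your version.

```python
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3",
    name="llama3_custom",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override defaults before launching (field paths assumed; verify locally).
recipe.trainer.max_steps = 1000          # shorten the run for experimentation
recipe.optim.config.lr = 3e-4            # try a different learning rate
recipe.data.global_batch_size = 512      # adjust batching for a new dataset
```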

Training long-context models, or extending the context length of pre-trained models, is also supported in NeMo:
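As a rough illustration, extending context length typically means raising the training sequence length and enabling context parallelism. The field names below (`seq_length`, `context_parallel_size`) are assumptions based on the recipe layout sketched above, not a confirmed API; check the long-context guides for the exact settings.

```python
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3_long",
    name="llama3_long_context",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Illustrative long-context settings (field names assumed; verify per release).
recipe.data.seq_length = 16384                     # train on longer sequences
recipe.model.config.seq_length = 16384             # keep the model config in sync
recipe.trainer.strategy.context_parallel_size = 2  # shard sequences across GPUs
```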

For information on deploying LLMs, see: