Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

Large Language Models

NeMo Framework provides everything needed to train Large Language Models, from setting up the compute cluster and downloading data to selecting model hyperparameters. NeMo 2.0 uses NeMo-Run to make it easy to scale LLMs to thousands of GPUs.
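For example, a pretraining run can be defined with a built-in recipe and launched through NeMo-Run. The following is a minimal sketch, assuming the `llm.llama3_8b` recipe module and a local executor; recipe names and arguments may differ slightly between releases, so consult the quickstart for your installed version.

```python
import nemo_run as run
from nemo.collections import llm

# Build a pretraining recipe from the built-in defaults.
# The llama3_8b module name is one example; pick the model you want to train.
recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3",   # where checkpoints are written
    name="llama3_pretraining",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Launch locally; swap in a cluster executor (e.g. run.SlurmExecutor)
# to scale the same recipe out to many nodes.
run.run(recipe, executor=run.LocalExecutor())
```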

The following LLMs are currently supported in NeMo 2.0:

Default configurations are provided for each model and are outlined in the model-specific documentation linked above. Every configuration can be modified to train on new datasets or to test new model hyperparameters.
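Concretely, a recipe's fields can be overridden in Python before launch. This sketch builds on the recipe from the previous example; the exact field paths (`trainer.max_steps`, `optim.config.lr`, `data.global_batch_size`) are assumptions that may vary by release, so inspect the recipe object in your version.

```python
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3",
    name="llama3_custom",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override defaults before launching (field paths assumed; verify locally).
recipe.trainer.max_steps = 1000          # shorten the run for experimentation
recipe.optim.config.lr = 3e-4            # try a different learning rate
recipe.data.global_batch_size = 512      # adjust batching for a new dataset
```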

Training long-context models, or extending the context length of pre-trained models, is also supported in NeMo:
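As a rough illustration, extending context length typically means raising the training sequence length and enabling context parallelism. The field names below (`seq_length`, `context_parallel_size`) are assumptions based on the recipe layout sketched above, not a confirmed API; check the long-context guides for the exact settings.

```python
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3_long",
    name="llama3_long_context",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Illustrative long-context settings (field names assumed; verify per release).
recipe.data.seq_length = 16384                     # train on longer sequences
recipe.model.config.seq_length = 16384             # keep the model config in sync
recipe.trainer.strategy.context_parallel_size = 2  # shard sequences across GPUs
```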

For information on deploying LLMs, see: