Important
NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.
Large Language Models
The NVIDIA NeMo™ Framework provides everything needed to train large language models, from setting up the compute cluster and downloading data to selecting model hyperparameters. The default configurations for each model and task are tested regularly, and every configuration can be modified to train on new datasets or to test new model hyperparameters.
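As an illustration, with the NeMo 2.0 Python API a prebuilt pretraining recipe can be loaded and its defaults overridden before launching a run. This is a minimal sketch: the recipe name, output paths, and attribute names below are assumptions and may differ between releases; consult the recipe catalog in your container for the exact names available.

```python
from nemo.collections import llm

# A minimal sketch, assuming the NeMo 2.0 recipe API; the recipe name,
# paths, and attribute names are illustrative and may differ by release.
recipe = llm.llama3_8b.pretrain_recipe(
    dir="/checkpoints/llama3_8b",   # assumed output directory for checkpoints
    name="llama3_8b_pretrain",
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override selected defaults, e.g. to test new hyperparameters.
recipe.trainer.max_steps = 1000
recipe.data.global_batch_size = 512
```

The modified recipe can then be launched, for example with the NeMo-Run utilities described in the NeMo 2.0 overview.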
Details that apply to all models can be found below.
Additionally, the NVIDIA NeMo™ Framework integrates seamlessly with several community large language models, providing a comprehensive suite of tools covering everything from training to deployment. For model-specific details, see the relevant section below.
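As one example of such an integration, the NeMo 2.0 API can convert a community checkpoint from the Hugging Face Hub into NeMo format. The model and config classes and the repository ID below are assumptions that may vary by release, and gated models require appropriate access.

```python
from nemo.collections import llm

# A hedged sketch of importing a community checkpoint into NeMo format.
# The model/config classes and the Hugging Face repository ID are assumptions.
llm.import_ckpt(
    model=llm.LlamaModel(llm.Llama3Config8B()),
    source="hf://meta-llama/Meta-Llama-3-8B",
)
```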
For each model, you will find comprehensive instructions on:
Data preparation
Training and Evaluation
Parameter-Efficient Fine-Tuning (PEFT)
Exporting the Model to TensorRT-LLM (a brief sketch follows this list)
Deployment
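As a hedged illustration of the export step listed above, the sketch below converts a trained .nemo checkpoint into a TensorRT-LLM engine. The import path, checkpoint path, model_type value, and argument names are assumptions and may differ between releases.

```python
from nemo.export.tensorrt_llm import TensorRTLLM

# A minimal sketch of exporting a trained checkpoint to TensorRT-LLM.
# Paths, the model_type value, and argument names are assumptions.
exporter = TensorRTLLM(model_dir="/tmp/trt_llm_engine")    # where the built engine is written
exporter.export(
    nemo_checkpoint_path="/checkpoints/llama3_8b.nemo",    # hypothetical trained checkpoint
    model_type="llama",
    n_gpus=1,
)

# Quick sanity check against the freshly built engine.
print(exporter.forward(["What is the capital of France?"]))
```

The resulting engine can then be served, for example with the framework's Triton-based deployment utilities, as covered in the deployment instructions for each model.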