Important
NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Please refer to the NeMo 2.0 overview for information on getting started.
SFT and PEFT
Customizing models enables you to adapt a general pre-trained LLM to a specific use case or domain. This process results in a fine-tuned model that benefits from the extensive pretraining data, while also yielding more accurate outputs for the specific downstream task. Model customization is achieved through supervised fine-tuning and falls into two popular categories:
Full-Parameter Fine-Tuning, which is referred to as Supervised Fine-Tuning (SFT) in NeMo
Parameter-Efficient Fine-Tuning (PEFT)
In SFT, all of the model parameters are updated to produce outputs that are adapted to the task.
PEFT, on the other hand, tunes a much smaller number of parameters, which are inserted into the base model at strategic locations. When fine-tuning with PEFT, the base model weights remain frozen, and only the adapter modules are trained. As a result, the number of trainable parameters is significantly reduced, to well under 1% of the total.
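To make the size of that reduction concrete, here is a minimal arithmetic sketch using LoRA, one of the PEFT methods NeMo supports. It assumes the standard LoRA formulation, where a frozen weight of shape `d_out x d_in` gets two trainable low-rank factors of shapes `d_out x rank` and `rank x d_in`. The `LinearShape` class, the helper names, and the layer sizes are hypothetical and chosen for illustration only; none of this is NeMo API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearShape:
    """Shape of one dense weight matrix (hypothetical layer description)."""
    d_in: int
    d_out: int


def base_param_count(layer: LinearShape) -> int:
    # The frozen base weight W has d_out x d_in entries.
    return layer.d_in * layer.d_out


def lora_param_count(layer: LinearShape, rank: int) -> int:
    # LoRA adds B (d_out x rank) and A (rank x d_in); only these are trained.
    return rank * (layer.d_in + layer.d_out)


def trainable_fraction(layers, rank: int) -> float:
    # Share of all parameters (frozen base + adapters) that receive updates.
    frozen = sum(base_param_count(layer) for layer in layers)
    trainable = sum(lora_param_count(layer, rank) for layer in layers)
    return trainable / (frozen + trainable)


# Assumed toy configuration: 32 transformer blocks, each with four
# 4096 x 4096 attention projections, adapted with rank 8.
layers = [LinearShape(4096, 4096)] * (32 * 4)
fraction = trainable_fraction(layers, rank=8)
print(f"trainable share: {fraction:.4%}")
```

Under these assumptions the trainable share comes out to roughly 0.39%. The sketch counts only the adapted attention projections; a real model also has MLP and embedding weights that stay frozen, so the actual share would be lower still.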
While SFT often yields the best possible results, PEFT methods can achieve nearly the same degree of accuracy while significantly reducing the computational cost. As language models continue to grow in size, PEFT is gaining popularity due to its lightweight requirements on training hardware.
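One reason for the lighter hardware requirements is that gradients and optimizer state are only needed for trainable parameters. The following back-of-the-envelope sketch assumes an Adam-style optimizer that stores one fp32 gradient and two fp32 moment estimates per trainable parameter (12 bytes in total). The 7-billion-parameter model size and the 0.4% trainable share are hypothetical values, and the estimate deliberately ignores model weights and activations, which both SFT and PEFT still have to hold in memory.

```python
def optimizer_state_gib(trainable_params: int, bytes_per_param: int = 12) -> float:
    """Rough memory for gradients plus Adam moments, in GiB.

    Assumption: 4 bytes of fp32 gradient + 8 bytes for two fp32 moments
    per trainable parameter. Weights and activations are not counted.
    """
    return trainable_params * bytes_per_param / 2**30


# Hypothetical model size, for illustration only.
total_params = 7_000_000_000

# SFT updates every parameter; PEFT is assumed to train 0.4% of that number.
sft_trainable = total_params
peft_trainable = total_params // 250

sft_gib = optimizer_state_gib(sft_trainable)
peft_gib = optimizer_state_gib(peft_trainable)

print(f"SFT  gradient + optimizer state: {sft_gib:.1f} GiB")
print(f"PEFT gradient + optimizer state: {peft_gib:.2f} GiB")
```

The absolute numbers depend heavily on precision settings and optimizer choice; the point of the sketch is only that this part of the memory budget scales with the number of trainable parameters.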
NeMo supports SFT and five PEFT methods, which can be used with various transformer-based models. A collection of conversion scripts is available to convert popular models from Hugging Face (HF) format to NeMo format.
| Model | SFT | LoRA | QLoRA | P-tuning | Adapters (Canonical) | IA3 |
|---|---|---|---|---|---|---|
| GPT 3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Nemotron | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Llama 1/2/3 & CodeLlama 2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ChatGLM 2/3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Falcon | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Starcoder 1/2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Mixtral | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Gemma 1/2 & CodeGemma | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Space Gemma | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| T5 | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
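If you want to check a model and method combination programmatically, the support matrix above can be encoded as a small lookup. This is only a convenience sketch that mirrors the table in this document; the names `MODELS`, `METHODS`, `UNSUPPORTED`, and `is_supported` are hypothetical and are not part of NeMo.

```python
# Column order of the support table in this document.
METHODS = ("SFT", "LoRA", "QLoRA", "P-tuning", "Adapters (Canonical)", "IA3")

# Row labels of the support table in this document.
MODELS = (
    "GPT 3",
    "Nemotron",
    "Llama 1/2/3 & CodeLlama 2",
    "ChatGLM 2/3",
    "Falcon",
    "Starcoder 1/2",
    "Mistral",
    "Mixtral",
    "Gemma 1/2 & CodeGemma",
    "Space Gemma",
    "T5",
)

# Only the unsupported cells are listed; every other cell is supported.
UNSUPPORTED = {
    "Falcon": {"Adapters (Canonical)", "IA3"},
    "Mixtral": {"IA3"},
    "Space Gemma": {"Adapters (Canonical)", "IA3"},
    "T5": {"QLoRA"},
}


def is_supported(model: str, method: str) -> bool:
    """Return True if the table marks this model/method pair as supported."""
    if model not in MODELS:
        raise KeyError(f"unknown model row: {model!r}")
    if method not in METHODS:
        raise KeyError(f"unknown method column: {method!r}")
    return method not in UNSUPPORTED.get(model, set())


print(is_supported("Mixtral", "LoRA"))
print(is_supported("T5", "QLoRA"))
```

Listing only the exceptions keeps the lookup short, since most cells in the table are supported. The table itself remains the source of truth if the two ever disagree.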
Learn more about SFT and PEFT in NeMo with the Developer Guide, which provides an overview of how they work in NeMo. Read more about the supported PEFT methods here.
For an end-to-end example of LoRA tuning, take a look at the Step-by-step LoRA Notebook. For more details on running QLoRA, please visit the NeMo QLoRA Guide. We also have many SFT and PEFT Examples for each model for you to play with.
The API guide can be found here.