
Parameter-Efficient Fine-Tuning (PEFT)

PEFT is a popular technique for efficiently fine-tuning large language models for various downstream tasks. When fine-tuning with PEFT, the base model weights are frozen and a few trainable adapter modules are injected into the model, resulting in a very small number (<< 1%) of trainable weights. With carefully chosen adapter modules and injection points, PEFT achieves performance comparable to full fine-tuning at a fraction of the computational and storage costs.
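
To make the mechanism concrete, the sketch below shows the idea in plain PyTorch: a LoRA-style low-rank adapter is injected around a frozen linear layer, so only the adapter weights receive gradients. This is an illustrative sketch only; it does not use NeMo's adapter classes, and the `LoRALinear` name, rank, and scaling factor are hypothetical choices for the example.

```python
# Minimal PEFT sketch (LoRA-style adapter) in plain PyTorch; illustrative only,
# not NeMo's implementation.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a small trainable low-rank adapter."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)          # freeze base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)   # trainable
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)  # trainable
        nn.init.zeros_(self.lora_b.weight)              # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen base projection plus the scaled low-rank update.
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


# Toy "base model": only the injected adapters are trainable.
model = nn.Sequential(
    LoRALinear(nn.Linear(1024, 4096)),
    nn.GELU(),
    LoRALinear(nn.Linear(4096, 1024)),
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.2%}")
```

In this toy model the adapters make up roughly 1% of all parameters; applied to a billion-parameter LLM, adapters of the same rank account for far less than 1%.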

NeMo supports four PEFT methods, which can be used with various transformer-based models; a minimal example of launching a PEFT run follows the lists below. The supported methods are:

LoRA

P-Tuning

Adapters (Canonical)

IA3

A collection of conversion scripts is available to convert the following popular models from Hugging Face (HF) format to NeMo format:

GPT 3

Nemotron

LLaMa 1/2

Falcon

Starcoder

Mistral

Mixtral

Gemma

T5

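As a rough illustration of how a LoRA run might be launched with NeMo 2.0 and NeMo Run, the sketch below follows the quickstart recipe pattern. The recipe module (`llm.llama3_8b`), the `finetune_recipe` arguments, and the `peft_scheme` value are assumptions that can differ between releases; treat this as an outline and consult the Developer Quick Start linked below for the authoritative workflow.

```python
# Hedged sketch: launching a LoRA PEFT run with NeMo 2.0 and NeMo Run.
# Recipe and argument names follow the NeMo 2.0 quickstart pattern and may
# differ in your installed release; paths here are placeholders.
import nemo_run as run
from nemo.collections import llm

if __name__ == "__main__":
    # Build a fine-tuning recipe for a Llama 3 8B base model with LoRA adapters.
    recipe = llm.llama3_8b.finetune_recipe(
        name="llama3_8b_lora",
        dir="/checkpoints/llama3_8b_lora",  # placeholder output directory
        num_nodes=1,
        num_gpus_per_node=1,
        peft_scheme="lora",                 # select LoRA as the PEFT method
    )
    # Execute locally; swap the executor to target Slurm or other clusters.
    run.run(recipe, executor=run.LocalExecutor())
```
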
Learn more about PEFT in NeMo with the Developer Quick Start, which provides an overview of how PEFT works in NeMo. Read about the supported PEFT methods here. For a practical example, take a look at the Step-by-step Guide.

The API guide can be found here.