# Migrate PEFT Training and Inference from NeMo 1.0 to NeMo 2.0
## NeMo 1.0 (Previous Release)
In NeMo 1.0, PEFT is enabled by setting the `peft_scheme` field in the YAML config file:
```yaml
model:
  peft:
    peft_scheme: "lora"         # PEFT method to use
    restore_from_path: null     # optional path to a previously trained PEFT checkpoint
```
In code, the only differences between PEFT and full-parameter fine-tuning are the `add_adapter` and `load_adapters` functions.
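For reference, the NeMo 1.0 code path looked roughly like the following minimal sketch. Here `restore_path`, `trainer`, and `adapter_checkpoint_path` are placeholders, and the exact class and argument names should be checked against the NeMo 1.0 release you are migrating from:

```python
from nemo.collections.nlp.models.language_modeling.megatron_gpt_sft_model import MegatronGPTSFTModel
from nemo.collections.nlp.parts.peft_config import LoraPEFTConfig

# Training: restore the base model, then inject LoRA adapters into it.
model = MegatronGPTSFTModel.restore_from(restore_path, trainer=trainer)
model.add_adapter(LoraPEFTConfig(model.cfg))

# Inference: load previously trained adapter weights into the base model.
model.load_adapters(adapter_checkpoint_path, LoraPEFTConfig(model.cfg))
```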
## NeMo 2.0 (New Release)
In NeMo 2.0, PEFT is enabled by passing the PEFT method object to both the trainer (as a callback) and the model (as a model transform):
```python
from nemo import lightning as nl
from nemo.collections import llm

lora = llm.peft.LoRA(
    # Add 'linear_fc1' and 'linear_fc2' to also adapt the MLP layers.
    target_modules=['linear_qkv', 'linear_proj'],
    dim=32,
)
trainer = nl.Trainer(..., callbacks=[lora])
model = llm.LlamaModel(..., model_transform=lora)
```
When using the `llm.finetune` API, PEFT is enabled by passing the `peft` argument. The base model path (`resume.import_path`) and adapter path (`resume.adapter_path`) can also be specified:
```python
from nemo.collections import llm

sft = llm.finetune(
    ...
    peft=llm.peft.LoRA(target_modules=['linear_qkv', 'linear_proj'], dim=32),
    ...
)
sft.resume.import_path = "hf://..."
sft.resume.adapter_path = "/path/to/checkpoints/last"
```
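Here `resume.import_path` points at the base model checkpoint to fine-tune (in this case a Hugging Face model reference), while `resume.adapter_path` points at a previously saved adapter checkpoint (required for inference, as described in the migration steps below).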
## Migration Steps
1. Remove the `peft` section from your YAML config file.
2. Create an instance of the `LoRA` class (or any other PEFT method class) with the same configuration values you used in the `peft` section of the YAML config.
3. Pass the `LoRA` instance to the `peft` argument of `llm.finetune`. For inference, also set `resume.adapter_path`, as shown below:
```python
from nemo.collections import llm

sft = llm.finetune(
    ...
    peft=llm.peft.LoRA(target_modules=['linear_qkv', 'linear_proj'], dim=32),
    ...
)
sft.resume.import_path = "hf://..."
sft.resume.adapter_path = "/path/to/checkpoints/last"
```
## Some Notes on Migration
In NeMo 1.0, the four LoRA targets were named ['attention_qkv', 'attention_dense', 'mlp_fc1', 'mlp_fc2']. These names differed from the actual names of the linear layers, which confused some users. In NeMo 2.0, the four LoRA targets are renamed to match the linear layers: ['linear_qkv', 'linear_proj', 'linear_fc1', 'linear_fc2'].
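The rename is one-to-one, so translating an existing NeMo 1.0 target list is mechanical. The lookup table below (a hypothetical helper, not part of the NeMo API) makes the correspondence explicit:

```python
# One-to-one rename of LoRA target modules between releases
# (keys are NeMo 1.0 names, values are NeMo 2.0 names).
NEMO1_TO_NEMO2_TARGETS = {
    'attention_qkv': 'linear_qkv',
    'attention_dense': 'linear_proj',
    'mlp_fc1': 'linear_fc1',
    'mlp_fc2': 'linear_fc2',
}

# Example: translate a NeMo 1.0 target list for llm.peft.LoRA(target_modules=...).
nemo2_targets = [NEMO1_TO_NEMO2_TARGETS[t] for t in ['attention_qkv', 'attention_dense']]
# nemo2_targets == ['linear_qkv', 'linear_proj']
```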