Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.

Parameter-Efficient Fine-Tuning (PEFT)

The NeMo Framework provides several curated configurations, each with a set of suggested hyperparameters designed for the NVIDIA DGX SuperPOD, in which each node is equipped with eight NVIDIA A100 80GB GPUs. The configurations for the curated models are in the conf/peft/neva directory. You can modify these parameters to adjust the hyperparameters for your own training runs and to tune model performance and training efficiency for your requirements.

Language Model	Vision Encoder	Multimodal Connector Type	PEFT Scheme	Tensor Model Parallel Size	Pipeline Model Parallel Size	Batch Size per GPU	Accumulated Global Batch Size	Precision	AMP Level	Total Training Samples Seen
LLaMA-2-7B-Chat (frozen)	CLIP-L-336px (frozen)	MLP Layers (trainable)	LoRA	4	1	4	128	BF16	O2	150K
LLaMA-2-13B-Chat (frozen)	CLIP-L-336px (frozen)	MLP Layers (trainable)	LoRA	8	1	4	128	BF16	O2	150K
LLaMA-2-70B-Chat (frozen)	CLIP-L-336px (frozen)	MLP Layers (trainable)	LoRA	8	1	1	128	BF16	O2	150K
LLaMA-3-8B-Chat (frozen)	CLIP-L-336px (frozen)	MLP Layers (trainable)	LoRA	8	1	4	128	BF16	O2	150K
LLaMA-3-70B-Chat (frozen)	CLIP-L-336px (frozen)	MLP Layers (trainable)	LoRA	8	1	1	128	BF16	O2	150K
Mistral-7B-Instruct-v0.1 (frozen)	CLIP-L-336px (frozen)	MLP Downsample (trainable)	LoRA	4	1	4	128	BF16	O2	150K
Mixtral-8x7B-Instruct-v0.1 (frozen)	CLIP-L-336px (frozen)	MLP Downsample (trainable)	LoRA	8	1	2	128	BF16	O2	150K
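As a rough illustration, the LLaMA-2-7B-Chat row might correspond to entries like the following in conf/peft/neva/llama2_7b_chat.yaml. The field names follow common NeMo and Megatron naming and are assumptions here, not copied from the file, so check them against the actual configuration before editing. The values are taken from the table.

```yaml
# Hypothetical excerpt: field names are assumed, values come from the LLaMA-2-7B-Chat row.
trainer:
  precision: bf16                    # precision
model:
  tensor_model_parallel_size: 4      # tensor model parallel size
  pipeline_model_parallel_size: 1    # pipeline model parallel size
  micro_batch_size: 4                # batch size per GPU
  global_batch_size: 128             # accumulated global batch size
  megatron_amp_O2: True              # AMP level O2
```

Assuming a single node with eight GPUs, a tensor parallel size of 4 and a pipeline parallel size of 1 leave a data-parallel size of 8 / (4 × 1) = 2. Each step then processes 4 × 2 = 8 samples, so reaching the global batch size of 128 takes 128 / 8 = 16 gradient accumulation steps.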

Enable Parameter-Efficient Fine-Tuning

To enable the PEFT stage with a NeVA model, update the configuration files as follows:

In the defaults section of conf/config.yaml, update the peft field to point to the NeVA configuration file you want. For example, to fine-tune a pretrained NeVA model with the LLaMA-2-7B-Chat configuration (llama2_7b_chat), change the peft field to neva/llama2_7b_chat.
```
defaults:
  - peft: neva/llama2_7b_chat
  ...
```
In the stages field of conf/config.yaml, make sure the peft stage is included. For example:
```
stages:
  - peft
  ...
```
Execute the launcher pipeline: python3 main.py.

Additional Guidelines for Parameter-Efficient Fine-Tuning

Before starting PEFT, make sure all necessary datasets and checkpoints are prepared.
To load a pretrained checkpoint for PEFT, set the restore_from_path field in the model section to the path of the pretrained checkpoint in .nemo format. By default, this field points to the .nemo checkpoint located in the training checkpoints folder.
PEFT checkpoints contain only the LoRA weights, not the entire model. Subsequent inference and evaluation require both sets of weights: the pretrained model weights and the LoRA weights.
If you are training with Vicuna v1.5 language model checkpoints, you can use the LLaMA-2-Chat configuration of the same model size, since the two are structurally identical. For instance, for the Vicuna v1.5 7B model, use the llama2_7b_chat configuration. You only need to set the following: peft.model.mm_cfg.llm.model_type=v1 and peft.model.data.conv_template=v1.
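Taken together, the checkpoint and Vicuna guidelines above could look like the following in the selected PEFT configuration file. This is a sketch only: the nesting is inferred from the dotted names peft.model.mm_cfg.llm.model_type and peft.model.data.conv_template, restore_from_path is placed in the model section as described above, and the checkpoint path is a placeholder.

```yaml
# Sketch of the PEFT config (for example conf/peft/neva/llama2_7b_chat.yaml)
# when fine-tuning from a Vicuna v1.5 7B based checkpoint.
# Only the fields discussed above are shown.
model:
  restore_from_path: /path/to/pretrained_checkpoint.nemo  # placeholder path
  mm_cfg:
    llm:
      model_type: v1        # Vicuna v1.5 setting
  data:
    conv_template: v1       # Vicuna v1.5 conversation template
```

The dotted form used in the text suggests that the same two settings can also be passed as overrides on the python3 main.py command line, assuming the launcher accepts Hydra-style overrides.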