## Fine-tuning
We provide a predefined fine-tuning configuration for the ViT B/16 model on ImageNet-1K, which can be found in the `conf/fine_tuning/imagenet1k.yaml` file. The following table highlights the key differences between ViT pretraining and fine-tuning:
| Aspect | ViT Pretraining | ViT Fine-tuning |
|---|---|---|
| Configuration Folder | `conf/training` | `conf/fine_tuning` |
| Training Samples Seen | 400M | 10M |
| Optimizer | Fused AdamW | SGD |
| Resolution | 224x224 | 384x384 |
| Classification Head | MLP with one hidden layer | MLP with single linear layer |
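To see where these differences live in the configuration, the excerpt below is a hedged sketch rather than the actual file contents: the `model.optim.name` nesting follows common NeMo config conventions and should be verified against `conf/fine_tuning/imagenet1k.yaml`.

```yaml
# Hedged sketch: the optimizer switch from the table above, expressed as a
# config field. Verify the exact names in conf/fine_tuning/imagenet1k.yaml.
model:
  optim:
    name: sgd   # pretraining uses fused AdamW; fine-tuning uses SGD
```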
To enable the fine-tuning stage with a ViT model, update the configuration files as follows:
1. In the `defaults` section of `conf/config.yaml`, update the `fine_tuning` field to point to the desired ViT configuration file. For example, to use the `vit/imagenet1k` configuration, change the `fine_tuning` field to `vit/imagenet1k`:

   ```yaml
   defaults:
     - fine_tuning: vit/imagenet1k
     ...
   ```
2. In the `stages` field of `conf/config.yaml`, make sure the `fine_tuning` stage is included. For example:

   ```yaml
   stages:
     - fine_tuning
     ...
   ```
3. Execute the launcher pipeline: `python3 main.py`
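Because the launcher's `main.py` is a Hydra application, the same selections can also be passed as command-line overrides instead of editing `conf/config.yaml`. The line below is a hedged sketch; override syntax may vary across launcher versions.

```bash
# Hedged sketch: select the ViT fine-tuning config and run only the
# fine_tuning stage via Hydra command-line overrides.
python3 main.py fine_tuning=vit/imagenet1k stages=[fine_tuning]
```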
**Remarks**: To load a pretrained checkpoint for fine-tuning, set the `restore_from_path` field in the `model` section to the path of the pretrained checkpoint in `.nemo` format. By default, this field points to the `.nemo` checkpoint located in the training checkpoints folder.
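For example, to fine-tune from a specific pretrained checkpoint, the `model` section might look like the following; the path is a hypothetical placeholder.

```yaml
# Hedged sketch: load a pretrained .nemo checkpoint for fine-tuning.
# The path below is a hypothetical placeholder; point it at your own checkpoint.
model:
  restore_from_path: /path/to/training/checkpoints/vit_b_16.nemo
```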