Important: NeMo 2.0 is an experimental feature and is currently released in the dev container only (`nvcr.io/nvidia/nemo:dev`). Refer to the NeMo 2.0 overview for information on getting started.
Fine-tuning
We provide a predefined fine-tuning configuration for the ViT B/16 model on ImageNet-1K in the `conf/fine_tuning/vit/imagenet1k.yaml` file. The following table highlights the key differences between ViT pretraining and fine-tuning:
| Aspect | ViT Pretraining | ViT Fine-tuning |
|---|---|---|
| Configuration Folder | `conf/training/vit` | `conf/fine_tuning/vit` |
| Training Samples Seen | 400M | 10M |
| Optimizer | Fused AdamW | SGD |
| Resolution | 224x224 | 384x384 |
| Classification Head | MLP with one hidden layer | MLP with single linear layer |
To enable the fine-tuning stage with a ViT model, update the configuration files as follows:
1. In the `defaults` section of `conf/config.yaml`, update the `fine_tuning` field to point to the desired ViT configuration file. For example, to use the `vit/imagenet1k` configuration, change the `fine_tuning` field to `vit/imagenet1k`:

   ```yaml
   defaults:
     - fine_tuning: vit/imagenet1k
     ...
   ```
2. In the `stages` field of `conf/config.yaml`, make sure the `fine_tuning` stage is included. For example:

   ```yaml
   stages:
     - fine_tuning
     ...
   ```
3. Execute the launcher pipeline: `python3 main.py`.
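
As an alternative to editing `conf/config.yaml` directly, the same selections can be passed on the command line. This is a minimal sketch assuming the launcher's standard Hydra override syntax; verify the exact group and field names against your `conf/config.yaml`:

```bash
# Select the ViT fine-tuning configuration and run only the fine_tuning stage
# (assumes standard Hydra override syntax for config groups and lists).
python3 main.py fine_tuning=vit/imagenet1k stages=[fine_tuning]
```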
Remarks: To load a pretrained checkpoint for fine-tuning, set the `restore_from_path` field in the `model` section to the path of the pretrained checkpoint in `.nemo` format. By default, this field links to the `.nemo` format checkpoint located in the training checkpoints folder.
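
For reference, a minimal sketch of how this might look in `conf/fine_tuning/vit/imagenet1k.yaml`; the checkpoint path below is a hypothetical example and should be replaced with the location of your own pretrained checkpoint:

```yaml
model:
  # Hypothetical example path; point this at your own pretrained
  # ViT checkpoint in .nemo format.
  restore_from_path: /path/to/checkpoints/vit_b_16_pretrained.nemo
```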