Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Drop Model Laeyrs#
To trim the model layers, use the following script:
python -m torch.distributed.launch --nproc_per_node=<tensor_model_parallel_size> * <pipeline_model_parallel_size> \
/NeMo/examples/nlp/language_modeling/megatron_gpt_drop_layers.py \
--path_to_nemo /path/to/model.nemo \
--path_to_save /path/to/save/trimmed_model.nemo \
--tensor_model_parallel_size <tensor_model_parallel_size> \
--pipeline_model_parallel_size <pipeline_model_parallel_size> \
--gpus_per_node <gpus_per_node> \
--drop_layers 1 2 3 4
Note: layer indices start from 1.
To save trimmed model in zarr
checkpoint format, add the following flag to the command above:
--zarr
Note: the zarr
checkpoint format is deprecated.
Validate Trimmed Model#
To validate the trimmed model, use the following script:
python /NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py \
--config-path=/path/to/folder/with/model/config \
--config-name=model_config.yaml \
trainer.limit_val_batches=<limit_val_batches> \
model.restore_from_path=/path/to/trimmed_model.nemo \
model.skip_train=True \
model.data.data_impl=mock \
model.data.data_prefix=[]
To use a specific dataset instead of a mock dataset, modify the model.data
parameters as follows:
model.data.data_impl=mmap \
model.data.data_prefix=["path/to/datafile1", "path/to/datafile2"]
Validate Original Model#
To validate the original model without specific layers, use the following script:
python /NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py \
--config-path=/path/to/folder/with/model/config \
--config-name=model_config.yaml \
trainer.limit_val_batches=<limit_val_batches> \
model.restore_from_path=/path/to/original_model.nemo \
model.skip_train=True \
model.data.data_impl=mock \
model.data.data_prefix=[] \
model.drop_layers=[1,2,3,4]