Important

NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to NeMo 2.0 overview for information on getting started.

Depth Pruning

Drop Model Layers

To trim the model layers, use the following script:

python -m torch.distributed.launch --nproc_per_node=<tensor_model_parallel_size> * <pipeline_model_parallel_size> \
  /NeMo/examples/nlp/language_modeling/megatron_gpt_drop_layers.py \
    --path_to_nemo /path/to/model.nemo \
    --path_to_save /path/to/save/trimmed_model.nemo \
    --tensor_model_parallel_size <tensor_model_parallel_size> \
    --pipeline_model_parallel_size <pipeline_model_parallel_size> \
    --gpus_per_node <gpus_per_node>  \
    --drop_layers 1 2 3 4

Note: layer indices start from 1.

To save trimmed model in zarr checkpoint format, add the following flag to the command above:

--zarr

Note: the zarr checkpoint format is deprecated.

Validate Trimmed Model

To validate the trimmed model, use the following script:

python /NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py \
  --config-path=/path/to/folder/with/model/config \
  --config-name=model_config.yaml \
  trainer.limit_val_batches=<limit_val_batches> \
  model.restore_from_path=/path/to/trimmed_model.nemo \
  model.skip_train=True \
  model.data.data_impl=mock \
  model.data.data_prefix=[]

To use a specific dataset instead of a mock dataset, modify the model.data parameters as follows:

model.data.data_impl=mmap \
model.data.data_prefix=["path/to/datafile1", "path/to/datafile2"]

Validate Original Model

To validate the original model without specific layers, use the following script:

python /NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py \
  --config-path=/path/to/folder/with/model/config \
  --config-name=model_config.yaml \
  trainer.limit_val_batches=<limit_val_batches> \
  model.restore_from_path=/path/to/original_model.nemo \
  model.skip_train=True \
  model.data.data_impl=mock \
  model.data.data_prefix=[] \
  model.drop_layers=[1,2,3,4]