NVIDIA TAO Toolkit v2.0
NVIDIA TAO Release tlt.20

Pruning the Model

Pruning, performed with the tlt-prune command, removes parameters from the model to reduce the model size without compromising the integrity of the model itself. Currently, tlt-prune does not support MaskRCNN models.

The tlt-prune command includes these parameters:


tlt-prune [-h] -pm <pretrained_model> -o <output_file> -k <key> [-n <normalizer>] [-eq <equalization_criterion>] [-pg <pruning_granularity>] [-pth <pruning threshold>] [-nf <min_num_filters>] [-el [<excluded_list>]]

  • -pm, --pretrained_model: Path to the pretrained model

  • -o, --output_file: Path to output checkpoints

  • -k, --key: Key to load a .tlt model

  • -h, --help: Show this help message and exit.

  • -n, --normalizer: max to normalize by dividing each norm by the maximum norm within a layer; L2 to normalize by dividing by the L2 norm of the vector comprising all kernel norms. (default: max)

  • -eq, --equalization_criterion: Criteria to equalize the stats of inputs to an element-wise op layer or depth-wise convolutional layer. This parameter is useful for resnets and mobilenets. Options are arithmetic_mean, geometric_mean, union, and intersection. (default: union)

  • -pg, --pruning_granularity: Number of filters to remove at a time. (default: 8)

  • -pth: Threshold to compare the normalized norm against. (default: 0.1)

  • -nf, --min_num_filters: Minimum number of filters to keep per layer (default:16)

  • -el, --excluded_layers: List of layers to exclude from pruning. Example: -el layer1 layer2 (default: [])

After pruning, the model needs to be retrained. See Re-training the Pruned Model for more details.

Here’s an example of using the tlt-prune command:


tlt-prune -pm /workspace/output/weights/resnet_003.tlt \
          -o /workspace/output/weights/resnet_003_pruned.tlt \
          -eq union \
          -pth 0.7 \
          -k $KEY

Once the model has been pruned, there might be a slight decrease in accuracy. This happens because some previously useful weights may have been removed. In order to regain the accuracy, NVIDIA recommends that you retrain this pruned model over the same dataset. To do this, use the tlt-train command as documented in Training the model, with an updated spec file that points to the newly pruned model as the pretrained model file.

Users are advised to turn off the regularizer in the training_config for detectnet_v2 to recover accuracy when retraining a pruned model. You can do this by setting the regularizer type to NO_REG. All other parameters in the spec file may be retained from the previous training.

For detectnet_v2, it is important to set the load_graph under model_config to true to import the pruned graph.
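Putting these retraining settings together, a minimal detectnet_v2 spec fragment might look like the following sketch. The file path is illustrative, and only the fields relevant to retraining a pruned model are shown; all other fields carry over unchanged from the previous training spec:

```
model_config {
  # Point at the newly pruned model and import its graph structure.
  pretrained_model_file: "/workspace/output/weights/resnet_003_pruned.tlt"
  load_graph: true
  ...
}
training_config {
  # Disable regularization to help recover accuracy after pruning.
  regularizer {
    type: NO_REG
  }
  ...
}
```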

© Copyright 2020, NVIDIA. Last updated on Nov 18, 2020.