.. _pruning_the_model:

Pruning the Model
=================

Pruning removes parameters from a model to reduce its size without compromising the
integrity of the model itself. Use the :code:`tlt-prune` command to prune a model. Currently,
:code:`tlt-prune` does not support MaskRCNN models.

The :code:`tlt-prune` command includes these parameters:

.. code::

        tlt-prune [-h] -pm <pretrained_model>
                       -o <output_file> -k <key>
                       [-n <normalizer>]
                       [-eq <equalization_criterion>]
                       [-pg <pruning_granularity>]
                       [-pth <pruning_threshold>]
                       [-nf <min_num_filters>]
                       [-el [<excluded_list>]]
               
Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-pm, --pretrained_model`: Path to the pretrained model.
* :code:`-o, --output_file`: Path to the output checkpoints.
* :code:`-k, --key`: Key to load a .tlt model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.
* :code:`-n, --normalizer`: ``max`` to normalize by dividing each norm by the maximum norm within
  a layer; ``L2`` to normalize by dividing by the L2 norm of the vector comprising all kernel norms.
  (default: ``max``)
* :code:`-eq, --equalization_criterion`: Criterion used to equalize the statistics of inputs to an
  element-wise operation layer or a depth-wise convolutional layer. This parameter is useful for
  ResNets and MobileNets. Options are :code:`arithmetic_mean`, :code:`geometric_mean`,
  :code:`union`, and :code:`intersection`. (default: :code:`union`)
* :code:`-pg, --pruning_granularity`: Number of filters to remove at a time. (default: 8)
* :code:`-pth`: Threshold to compare the normalized norm against. (default: 0.1)

  .. note:: NVIDIA recommends changing the threshold to keep the number of parameters in the
     model to within 10-20% of the original unpruned model.

* :code:`-nf, --min_num_filters`: Minimum number of filters to keep per layer. (default: 16)
* :code:`-el, --excluded_layers`: List of layers to exclude from pruning.
  Example: :code:`-el item1 item2` (default: [])
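
The way the normalizer, threshold, granularity, and minimum filter count interact can be
pictured with the following Python sketch. It is an illustration only, not the
:code:`tlt-prune` implementation: the function names :code:`normalize` and
:code:`filters_to_keep` are hypothetical, and the way the options are combined (remove
below-threshold filters in whole groups of the granularity, then enforce the minimum) is an
assumption based on the argument descriptions above.

.. code:: python

        import math


        def normalize(norms, normalizer="max"):
            """Scale the per-filter norms of one layer with the chosen normalizer."""
            if normalizer == "max":
                scale = max(norms)
            elif normalizer == "L2":
                scale = math.sqrt(sum(n * n for n in norms))
            else:
                raise ValueError("normalizer must be 'max' or 'L2'")
            if scale == 0:
                return [0.0 for _ in norms]
            return [n / scale for n in norms]


        def filters_to_keep(norms, normalizer="max", threshold=0.1,
                            granularity=8, min_num_filters=16):
            """Return the sorted indices of the filters this sketch would keep."""
            scores = normalize(norms, normalizer)
            # Filters whose normalized norm falls below the threshold are candidates.
            n_candidates = sum(1 for s in scores if s < threshold)
            # Assumption: filters are removed only in whole groups of `granularity`.
            n_remove = (n_candidates // granularity) * granularity
            # Assumption: a layer never drops below `min_num_filters` (or its own size).
            n_keep = max(len(norms) - n_remove, min(min_num_filters, len(norms)))
            # Keep the filters with the largest normalized norms.
            ranked = sorted(range(len(norms)), key=lambda i: scores[i], reverse=True)
            return sorted(ranked[:n_keep])

In this sketch, a layer with 32 filters of which 22 fall below the default threshold loses
16 filters (two groups of 8) and keeps 16. Raising the threshold makes more filters
candidates for removal, which is why the threshold is the main knob for controlling the
size of the pruned model.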

After pruning, the model needs to be retrained. See :ref:`Re-training the Pruned Model
<re-training_the_pruned_model>` for more details.

Using the Prune Command
^^^^^^^^^^^^^^^^^^^^^^^

Here's an example of using the :code:`tlt-prune` command:

.. code::

        tlt-prune -pm /workspace/output/weights/resnet_003.tlt \
                  -o /workspace/output/weights/resnet_003_pruned.tlt \
                  -eq union \
                  -pth 0.7 -k $KEY
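
Checking a pruned model against the 10-20% recommendation given for the threshold comes
down to a simple ratio. The sketch below assumes the recommendation means that the pruned
model should retain 10-20% of the original parameter count. The function names and the
example counts are hypothetical, and obtaining the parameter counts is outside the scope
of the sketch.

.. code:: python

        def retained_fraction(original_params, pruned_params):
            """Fraction of the original parameters that remain after pruning."""
            if original_params <= 0:
                raise ValueError("original_params must be positive")
            return pruned_params / original_params


        def within_recommended_range(original_params, pruned_params,
                                     low=0.10, high=0.20):
            """True if the pruned model keeps between `low` and `high` of the original."""
            return low <= retained_fraction(original_params, pruned_params) <= high


        # Hypothetical counts: 11M parameters before pruning, 1.65M after.
        print(retained_fraction(11_000_000, 1_650_000))
        print(within_recommended_range(11_000_000, 1_650_000))

If the fraction is too high, a larger :code:`-pth` value prunes more aggressively; if it
is too low, a smaller value keeps more of the model.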

.. _re-training_the_pruned_model:

Re-training the Pruned Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Once the model has been pruned, there might be a slight decrease in accuracy because some
previously useful weights may have been removed. To regain the accuracy, NVIDIA recommends
that you retrain the pruned model on the same dataset. To do this, use the :code:`tlt-train`
command as documented in :ref:`Training the model <training_the_model>`, with an updated spec
file that points to the newly pruned model as the pretrained model file.

When retraining a pruned detectnet model, turn off the regularizer in :code:`training_config`
to help recover the accuracy. To do this, set the regularizer type to :code:`NO_REG`, as
described :ref:`here <trainer>`. All other parameters in the spec file may be retained from
the previous training.

For detectnet_v2, it is important to set :code:`load_graph` under :code:`model_config` to
:code:`true` so that the pruned graph is imported.
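
Taken together, the retraining changes for a detectnet_v2 spec file might look like the
fragment below. This is a sketch, not a complete spec: the :code:`pretrained_model_file`
field name and the nesting of :code:`type` inside a :code:`regularizer` block are
assumptions about the spec layout, and the path is the output file from the pruning
example. All other fields from the previous training spec are omitted here and would be
kept unchanged.

.. code::

        model_config {
          # Assumed field name; point it at the newly pruned model.
          pretrained_model_file: "/workspace/output/weights/resnet_003_pruned.tlt"
          # Import the pruned graph rather than rebuilding the original one.
          load_graph: true
        }
        training_config {
          regularizer {
            # Turn off the regularizer while retraining the pruned model.
            type: NO_REG
          }
        }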