MaskRCNN
========

.. _maskrcnn:

MaskRCNN supports the following tasks:

* train
* evaluate
* inference
* export

These tasks may be invoked from the TLT launcher using the following convention on the command line:

.. code::

   tlt mask_rcnn <subtask> <args_per_subtask>

where :code:`args_per_subtask` are the command-line arguments required for a given subtask. Each of these subtasks is explained in detail below.

Creating a Configuration File
-----------------------------

.. _creating_a_configuration_file_maskrcnn:

Below is a sample MaskRCNN spec file. It has three major components: top-level experiment configs, :code:`data_config`, and :code:`maskrcnn_config`, each explained in detail below. The spec file is a protobuf text (prototxt) message, and each of its fields is either a basic data type or a nested message. The top-level structure of the spec file is summarized in the table that follows the sample.

Here's a sample of the MaskRCNN spec file:

.. code::

   seed: 123
   use_amp: False
   warmup_steps: 0
   checkpoint: "/workspace/tlt-experiments/maskrcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50/resnet50.hdf5"
   learning_rate_steps: "[60000, 80000, 100000]"
   learning_rate_decay_levels: "[0.1, 0.02, 0.002]"
   total_steps: 120000
   train_batch_size: 2
   eval_batch_size: 4
   num_steps_per_eval: 10000
   momentum: 0.9
   l2_weight_decay: 0.0001
   l1_weight_decay: 0.0
   warmup_learning_rate: 0.0001
   init_learning_rate: 0.02
   # pruned_model_path: "/workspace/tlt-experiments/maskrcnn/pruned_model/model.tlt"

   data_config{
       image_size: "(832, 1344)"
       augment_input_data: True
       eval_samples: 500
       training_file_pattern: "/workspace/tlt-experiments/data/train*.tfrecord"
       validation_file_pattern: "/workspace/tlt-experiments/data/val*.tfrecord"
       val_json_file: "/workspace/tlt-experiments/data/annotations/instances_val2017.json"

       # dataset specific parameters
       num_classes: 91
       skip_crowd_during_training: True
   }

   maskrcnn_config {
       nlayers: 50
       arch: "resnet"
       freeze_bn: True
       freeze_blocks: "[0,1]"
       gt_mask_size: 112

       # Region Proposal Network
       rpn_positive_overlap: 0.7
       rpn_negative_overlap: 0.3
       rpn_batch_size_per_im: 256
       rpn_fg_fraction: 0.5
       rpn_min_size: 0.

       # Proposal layer.
       batch_size_per_im: 512
       fg_fraction: 0.25
       fg_thresh: 0.5
       bg_thresh_hi: 0.5
       bg_thresh_lo: 0.

       # Faster-RCNN heads.
       fast_rcnn_mlp_head_dim: 1024
       bbox_reg_weights: "(10., 10., 5., 5.)"

       # Mask-RCNN heads.
       include_mask: True
       mrcnn_resolution: 28

       # training
       train_rpn_pre_nms_topn: 2000
       train_rpn_post_nms_topn: 1000
       train_rpn_nms_threshold: 0.7

       # evaluation
       test_detections_per_image: 100
       test_nms: 0.5
       test_rpn_pre_nms_topn: 1000
       test_rpn_post_nms_topn: 1000
       test_rpn_nms_thresh: 0.7

       # model architecture
       min_level: 2
       max_level: 6
       num_scales: 1
       aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
       anchor_scale: 8

       # localization loss
       rpn_box_loss_weight: 1.0
       fast_rcnn_box_loss_weight: 1.0
       mrcnn_weight_loss_mask: 1.0
   }

+----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | **Field** | **Description** | **Data Type and Constraints** | **Recommended/Typical Value** | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | seed | The random seed for the experiment | Unsigned int | 123 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | warmup_steps | The steps taken for learning rate to ramp up to the init_learning_rate | Unsigned int | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | warmup_learning_rate | The initial learning rate during the warmup phase | float | -- | 
+----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | learning_rate_steps | A list of steps at which the learning rate decays by the factor specified | string | -- | | | in learning_rate_decay_levels | | | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | learning_rate_decay_levels | A list of decay factors. The length should match the length of | string | -- | | | learning_rate_steps. | | | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | total_steps | The total number of training iterations | Unsigned int | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | train_batch_size | The batch size during training | Unsigned int | 4 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | eval_batch_size | The batch size during validation or evaluation | Unsigned int | 8 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | num_steps_per_eval | Save a checkpoint and run evaluation every N steps. 
| Unsigned int | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | momentum | Momentum of the SGD optimizer | float | 0.9 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | l1_weight_decay | L1 weight decay | float | 0.0001 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | l2_weight_decay | L2 weight decay | float | 0.0001 | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | use_amp | Specifies whether to use Automatic Mixed Precision training | boolean | False | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | checkpoint | The path to a pretrained model | string | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | maskrcnn_config | The architecture of the model | message | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | data_config | The input data configuration | message | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | skip_checkpoint_variables | If specified, the weights of the layers with matching regular expressions will | string 
| -- | | | not be loaded. This is especially helpful for transfer learning. | | | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+ | pruned_model_path | The path to a pruned MaskRCNN model | string | -- | +----------------------------+--------------------------------------------------------------------------------+-------------------------------+-------------------------------+

.. Note::
   When using :code:`skip_checkpoint_variables`, first find the model structure in the training log (part of the MaskRCNN+ResNet50 model structure is shown below). If, for example, you want to retrain all prediction heads, you can set :code:`skip_checkpoint_variables` to "head". TLT uses the Python :code:`re` library to check whether "head" matches any layer name; that is, it evaluates :code:`re.search($skip_checkpoint_variables, $layer_name)` for each layer.

.. code::

   [MaskRCNN] INFO : ================ TRAINABLE VARIABLES ==================
   [MaskRCNN] INFO : [#0001] conv1/kernel:0 => (7, 7, 3, 64)
   [MaskRCNN] INFO : [#0002] bn_conv1/gamma:0 => (64,)
   [MaskRCNN] INFO : [#0003] bn_conv1/beta:0 => (64,)
   [MaskRCNN] INFO : [#0004] block_1a_conv_1/kernel:0 => (1, 1, 64, 64)
   [MaskRCNN] INFO : [#0005] block_1a_bn_1/gamma:0 => (64,)
   [MaskRCNN] INFO : [#0006] block_1a_bn_1/beta:0 => (64,)
   [MaskRCNN] INFO : [#0007] block_1a_conv_2/kernel:0 => (3, 3, 64, 64)
   [MaskRCNN] INFO : [#0008] block_1a_bn_2/gamma:0 => (64,)
   [MaskRCNN] INFO : [#0009] block_1a_bn_2/beta:0 => (64,)
   [MaskRCNN] INFO : [#0010] block_1a_conv_3/kernel:0 => (1, 1, 64, 256)
   [MaskRCNN] INFO : [#0011] block_1a_bn_3/gamma:0 => (256,)
   [MaskRCNN] INFO : [#0012] block_1a_bn_3/beta:0 => (256,)
   [MaskRCNN] INFO : [#0110] block_3d_bn_3/gamma:0 => (1024,)
   [MaskRCNN] INFO : [#0111] block_3d_bn_3/beta:0 => (1024,)
   [MaskRCNN] INFO : [#0112] block_3e_conv_1/kernel:0 => (1, 1, 1024,
   [MaskRCNN] INFO : [#0144] block_4b_bn_1/beta:0 => (512,)
   … … … … ...
   [MaskRCNN] INFO : [#0174] post_hoc_d5/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0175] post_hoc_d5/bias:0 => (256,)
   [MaskRCNN] INFO : [#0176] rpn/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0177] rpn/bias:0 => (256,)
   [MaskRCNN] INFO : [#0178] rpn-class/kernel:0 => (1, 1, 256, 3)
   [MaskRCNN] INFO : [#0179] rpn-class/bias:0 => (3,)
   [MaskRCNN] INFO : [#0180] rpn-box/kernel:0 => (1, 1, 256, 12)
   [MaskRCNN] INFO : [#0181] rpn-box/bias:0 => (12,)
   [MaskRCNN] INFO : [#0182] fc6/kernel:0 => (12544, 1024)
   [MaskRCNN] INFO : [#0183] fc6/bias:0 => (1024,)
   [MaskRCNN] INFO : [#0184] fc7/kernel:0 => (1024, 1024)
   [MaskRCNN] INFO : [#0185] fc7/bias:0 => (1024,)
   [MaskRCNN] INFO : [#0186] class-predict/kernel:0 => (1024, 91)
   [MaskRCNN] INFO : [#0187] class-predict/bias:0 => (91,)
   [MaskRCNN] INFO : [#0188] box-predict/kernel:0 => (1024, 364)
   [MaskRCNN] INFO : [#0189] box-predict/bias:0 => (364,)
   [MaskRCNN] INFO : [#0190] mask-conv-l0/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0191] mask-conv-l0/bias:0 => (256,)
   [MaskRCNN] INFO : [#0192] mask-conv-l1/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0193] mask-conv-l1/bias:0 => (256,)
   [MaskRCNN] INFO : [#0194] mask-conv-l2/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0195] mask-conv-l2/bias:0 => (256,)
   [MaskRCNN] INFO : [#0196] mask-conv-l3/kernel:0 => (3, 3, 256, 256)
   [MaskRCNN] INFO : [#0197] mask-conv-l3/bias:0 => (256,)
   [MaskRCNN] INFO : [#0198] conv5-mask/kernel:0 => (2, 2, 256, 256)
   [MaskRCNN] INFO : [#0199] conv5-mask/bias:0 => (256,)
   [MaskRCNN] INFO : [#0200] mask_fcn_logits/kernel:0 => (1, 1, 256, 91)
   [MaskRCNN] INFO : [#0201] mask_fcn_logits/bias:0 => (91,)

MaskRCNN Config
^^^^^^^^^^^^^^^

The MaskRCNN configuration (:code:`maskrcnn_config`) defines the model structure. This model is used for training, evaluation, and inference. A detailed description is included in the table below. Currently, MaskRCNN only supports ResNet10/18/34/50/101 as its backbone.
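As a rough illustration of how the anchor-related fields of :code:`maskrcnn_config` interact, the following Python sketch computes anchor sizes per pyramid level from the sample spec values. It applies the rule stated in this section that the anchor scale at a level is :code:`anchor_scale * 2^level`, together with the octave scales :code:`2^(i/num_scales)` from the :code:`num_scales` description. The function name :code:`anchor_sizes` and the way each aspect-ratio pair is multiplied into the base size are assumptions made for illustration; this is not TLT's actual implementation.

```python
def anchor_sizes(anchor_scale, min_level, max_level, num_scales, aspect_ratios):
    """Return {level: [scaled size pair, ...]} under the assumptions stated above."""
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level  # feature-map stride at this pyramid level
        level_sizes = []
        for octave in range(num_scales):
            # Base anchor size: anchor_scale * 2^level * 2^(octave / num_scales)
            base = anchor_scale * stride * 2 ** (octave / num_scales)
            for ratio_a, ratio_b in aspect_ratios:
                # Assumption: each ratio pair scales the two sides of the anchor.
                level_sizes.append((base * ratio_a, base * ratio_b))
        sizes[level] = level_sizes
    return sizes


# Values taken from the sample spec file.
sample = anchor_sizes(
    anchor_scale=8,
    min_level=2,
    max_level=6,
    num_scales=1,
    aspect_ratios=[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)],
)
for level, level_sizes in sample.items():
    print(level, level_sizes[0])
```

With the sample values, the base anchor size runs from 32 at :code:`min_level` 2 to 512 at :code:`max_level` 6, with three anchors per location on each level.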
+---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | **Field** | **Description** | **Data Type and Constraints** | **Recommended/Typical Value** | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | nlayers | The number of layers in ResNet arch | Unsigned int | 50 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | arch | The backbone feature extractor name | string | resnet | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | freeze_bn | Whether to freeze all BatchNorm layers in the backbone | boolean | False | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | freeze_blocks | A list of conv blocks in the backbone to freeze | string | -- | | | | | | | | | ResNet: For the ResNet series, the block IDs | | | | | valid for freezing are any subset of | | | | | [0, 1, 2, 3] (inclusive) | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | gt_mask_size | The ground-truth mask size | Unsigned int | 112 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_positive_overlap | The lower-bound threshold to assign positive 
labels for anchors | float | 0.7 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_negative_overlap | The upper-bound threshold to assign negative labels for anchors | float | 0.3 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_batch_size_per_im | The number of sampled anchors per image in RPN | Unsigned int | 256 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_fg_fraction | The desired fraction of positive anchors in a batch | float | 0.5 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_min_size | The minimum proposal height and width | float | 0 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | batch_size_per_im | The RoI minibatch size per image | Unsigned int | 512 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | fg_fraction | The target fraction of RoI minibatch that is labeled as | float | 0.25 | | | foreground | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | fast_rcnn_mlp_head_dim | The Fast-RCNN classification head dimension | Unsigned int | 1024 | 
+---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | bbox_reg_weights | The bounding-box regression weights | string | “(10, 10, 5, 5)” | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | include_mask | Specifies whether to include a mask head | boolean | True | | | | | | | | | | (currently only True is supported) | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | mrcnn_resolution | The mask-head resolution | Unsigned int | 28 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | train_rpn_pre_nms_topn | The number of top-scoring RPN proposals to keep before applying | Unsigned int | 2000 | | | NMS (per FPN level) during training | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | train_rpn_post_nms_topn | The number of top-scoring RPN proposals to keep after applying NMS | Unsigned int | 1000 | | | (total number produced) during training | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | train_rpn_nms_threshold | The NMS IOU threshold in RPN during training | float | 0.7 | 
+---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | test_detections_per_image | The number of bounding box candidates after NMS | Unsigned int | 100 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | test_nms | The NMS IOU threshold during test | float | 0.5 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | test_rpn_pre_nms_topn | The number of top-scoring RPN proposals to keep before applying NMS | Unsigned int | 1000 | | | (per FPN level) during test | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | test_rpn_post_nms_topn | The number of top scoring RPN proposals to keep after applying NMS | Unsigned int | 1000 | | | (total number produced) during test | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | test_rpn_nms_threshold | The NMS IOU threshold in RPN during test | float | 0.7 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | min_level | The minimum level of the output feature pyramid | Unsigned int | 2 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | max_level | The maximum level 
of the output feature pyramid | Unsigned int | 6 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | num_scales | The number of anchor octave scales on each pyramid level (e.g. if | Unsigned int | 1 | | | set to 3, the anchor scales are [2^0, 2^(1/3), 2^(2/3)]) | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | aspect_ratios | A list of tuples representing the aspect ratios of anchors on each | string | "[(1.0, 1.0), | | | pyramid level | | (1.4, 0.7), | | | | | (0.7, 1.4)]" | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | anchor_scale | Scale of the base-anchor size to the feature-pyramid stride | Unsigned int | 8 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | rpn_box_loss_weight | The weight for adjusting RPN box loss in the total loss | float | 1.0 | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | fast_rcnn_box_loss_weight | The weight for adjusting FastRCNN box regression loss in the total | float | 1.0 | | | loss | | | +---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+ | mrcnn_weight_loss_mask | The weight for adjusting mask loss in the total loss | float | 1.0 | 
+---------------------------+---------------------------------------------------------------------+--------------------------------------------------+------------------------------------+

.. Note::
   The :code:`min_level`, :code:`max_level`, :code:`num_scales`, :code:`aspect_ratios`, and :code:`anchor_scale` fields determine anchor generation for MaskRCNN. :code:`anchor_scale` is the base anchor scale, while :code:`min_level` and :code:`max_level` set the range of the scales on different feature maps. For example, the actual anchor scale for the feature map at :code:`min_level` is :code:`anchor_scale * 2^min_level`, and the actual anchor scale for the feature map at :code:`max_level` is :code:`anchor_scale * 2^max_level`. Anchors of different :code:`aspect_ratios` are then generated based on the actual anchor scale.

Data Config
^^^^^^^^^^^

The data configuration (:code:`data_config`) specifies the input data source and format. This is used for training, evaluation, and inference. A detailed description is summarized in the table below.

+----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | **Field** | **Description** | **Data Type and Constraints** | **Recommended/Typical Value** | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | image_size | The image dimension as a tuple within quote marks. “(height, | string | “(832, 1344)” | | | width)” indicates the dimension of the resized and padded input. 
| | | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | augment_input_data | Specifies whether to augment the data | boolean | True | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | eval_samples | The number of samples for evaluation | Unsigned int | -- | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | training_file_pattern | The TFRecord path for training | string | -- | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | validation_file_pattern | The TFRecord path for validation | string | -- | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | val_json_file | The annotation file path for validation | string | -- | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | num_classes | The number of classes. 
If there are N categories in | Unsigned int | -- | | | the annotation, num_classes should be N+1 (background class) | | | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | skip_crowd_during_training | Specifies whether to skip crowd during training | boolean | True | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | prefetch_buffer_size | The prefetch buffer size used by tf.data.Dataset (default: AUTOTUNE) | Unsigned int | -- | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | shuffle_buffer_size | The shuffle buffer size used by tf.data.Dataset (default: 4096) | Unsigned int | 4096 | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+ | n_workers | The number of workers to parse and preprocess data (default: 16) | Unsigned int | 16 | +----------------------------+----------------------------------------------------------------------+-------------------------------+-------------------------------+

.. Note::
   If an out-of-memory error occurs during training, first try a smaller :code:`image_size` or :code:`train_batch_size`. If the error persists, try reducing the :code:`n_workers`, :code:`shuffle_buffer_size`, and :code:`prefetch_buffer_size` values. Lastly, if the original images have a very large resolution, resize the images offline and create new TFRecords to avoid loading large images into GPU memory.

Training the Model
------------------

Train the MaskRCNN model using this command:

.. code::

   tlt mask_rcnn train [-h] -e -d -k [--gpus ] [--gpu_index ] [--log_file ]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-d, --model_dir`: The path to the folder where the experiment output is written.
* :code:`-k, --key`: The encryption key to decrypt the model.
* :code:`-e, --experiment_spec_file`: The experiment specification file to set up the training experiment.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`--gpus num_gpus`: The number of GPUs to use and processes to launch for training. The default value is 1.
* :code:`--gpu_index`: The index of the (discrete) GPU for exporting the model if the machine has multiple GPUs installed. Note that export can only run on a single GPU.
* :code:`--log_file`: The path to the log file. The default path is :code:`stdout`.
* :code:`-h, --help`: Show this help message and exit.

Input Requirement
^^^^^^^^^^^^^^^^^

* **Input size**: C * W * H (where C = 3, W >= 128, H >= 128, and W and H are multiples of :code:`2^max_level`)
* **Image format**: JPG
* **Label format**: COCO detection

Sample Usage
^^^^^^^^^^^^

Here's an example of using the :code:`train` command on a MaskRCNN model:

.. code::

   tlt mask_rcnn train --gpus 2 -e /path/to/spec.txt -d /path/to/result -k $KEY

Evaluating the Model
--------------------

To run evaluation for a MaskRCNN model, use this command:

.. code::

   tlt mask_rcnn evaluate [-h] -e -m -k [--gpu_index ] [--log_file ]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: The experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.
* :code:`-m, --model`: The path to the model file to use for evaluation.
* :code:`-k, --key`: The key to load the model. This argument is not required if :code:`-m` is followed by a TensorRT engine.
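When evaluation is launched from a script, the required arguments above can be assembled programmatically. The following Python sketch only builds and prints the command line; it does not run it. The helper name :code:`build_evaluate_command`, the file paths, and the key value are hypothetical placeholders, and only the :code:`-e`, :code:`-m`, and :code:`-k` flags documented above are used.

```python
import shlex


def build_evaluate_command(spec_file, model_path, key=None):
    """Assemble a `tlt mask_rcnn evaluate` command from the required arguments."""
    cmd = ["tlt", "mask_rcnn", "evaluate", "-e", spec_file, "-m", model_path]
    if key is not None:
        # Per the argument list, -k may be omitted when -m is a TensorRT engine.
        cmd += ["-k", key]
    return cmd


# Hypothetical paths and key, for illustration only.
cmd = build_evaluate_command("/path/to/spec.txt", "/path/to/model.tlt", key="YOUR_KEY")
print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids quoting problems if it is later passed to :code:`subprocess.run`.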
Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`--gpu_index`: The index of the (discrete) GPU for exporting the model if the machine has multiple GPUs installed. Note that export can only run on a single GPU.
* :code:`--log_file`: The path to the log file. The default path is :code:`stdout`.
* :code:`-h, --help`: Show this help message and exit.

Pruning the Model
-----------------

.. _pruning_the_model_mrcnn:

Pruning removes parameters from the model to reduce the model size. Retraining is necessary to regain the performance of the unpruned model.

The :code:`prune` command includes these parameters:

.. code::

   tlt mask_rcnn prune [-h] -m -o -k [-n ] [-eq ] [-pg ] [-pth ] [-nf ] [-el [] [--gpu_index ] [--log_file ]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --pretrained_model`: The path to the pretrained model.
* :code:`-o, --output_dir`: The output directory that contains the pruned model, named :code:`model.tlt`.
* :code:`-k, --key`: The key to load a :code:`.tlt` model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.
* :code:`-n, --normalizer`: Specify ``max`` to normalize by dividing each norm by the maximum norm within a layer, or ``L2`` to normalize by dividing by the L2 norm of the vector comprising all kernel norms. (default: ``max``)
* :code:`-eq, --equalization_criterion`: The criterion used to equalize the stats of inputs to an element-wise op layer or depth-wise convolutional layer. This parameter is useful for ResNets and MobileNets. Options are :code:`arithmetic_mean`, :code:`geometric_mean`, :code:`union`, and :code:`intersection`. (default: :code:`union`)
* :code:`-pg, --pruning_granularity`: The number of filters to remove at a time. (default: 8)
* :code:`-pth`: The threshold to compare the normalized norm against. (default: 0.1)

  .. Note::
     NVIDIA recommends changing the threshold to keep the number of parameters in the model to within 10-20% of the original unpruned model.

* :code:`-nf, --min_num_filters`: The minimum number of filters to keep per layer. (default: 16)
* :code:`-el, --excluded_layers`: A list of layers to exclude from pruning. Example: :code:`-el item1 item2` (default: [])
* :code:`--gpu_index`: The index of the GPU to run evaluation (useful when the machine has multiple GPUs installed). Note that evaluation can only run on a single GPU.
* :code:`--log_file`: The path to the log file. Defaults to :code:`stdout`.

Here's an example of using the :code:`prune` command:

.. code::

   tlt mask_rcnn prune -m /workspace/model.step-100.tlt -o /workspace/output -eq union -pth 0.7 -k $KEY

After pruning, the model must be retrained before it can be used for inference or evaluation.

Re-training the Pruned Model
----------------------------

Once the model has been pruned, there might be a decrease in accuracy because some previously useful weights may have been removed. To regain accuracy, NVIDIA recommends that you retrain the pruned model over the same dataset. To do this, run the :code:`tlt mask_rcnn train` command with an updated spec file that points to the newly pruned model by setting :code:`pruned_model_path`.

Users are advised to turn off the regularizer during retraining. You may do this by setting the regularizer weights to 0 for both :code:`l1_weight_decay` and :code:`l2_weight_decay`. The other parameters may be retained in the spec file from the previous training.

:code:`train_batch_size` and :code:`eval_batch_size` must be kept unchanged.

Running Inference on the Model
------------------------------

The :code:`inference` tool for MaskRCNN networks can be used to visualize bboxes or generate frame-by-frame COCO-format labels on a directory of images. Here's an example of using this tool:

.. code::

   tlt mask_rcnn inference [-h] -i -o -e -m -k [-l