Mask RCNN with TAO Deploy
The gen_trt_engine action generates an optimized TensorRT engine for Mask RCNN from a
UFF or ONNX file previously produced by the Mask RCNN export action. For more information
about training a Mask RCNN model, refer to the
Mask RCNN training documentation.
Converting .uff File into TensorRT Engine
You can reuse the spec from the Mask RCNN export action as a starting point.
Note
Generating a TensorRT engine for a model trained with QAT enabled requires the tensor
scale factors defined in the calibration cache file. However, the current
version of QAT does not natively support DLA INT8 deployment on Jetson. To deploy such a model
on a Jetson with DLA INT8, force post-training quantization to generate the calibration
cache file.
Ask the agent to run the gen_trt_engine action against your spec. For example:
Build an INT8 TensorRT engine for Mask RCNN from the exported UFF at
``s3://my-bucket/mrcnn/mrcnn.uff`` using ``trt-spec.yaml``. Calibrate
against ``s3://my-bucket/mrcnn/cal-images/`` and write the engine to
``s3://my-bucket/mrcnn/int8.engine``. Run on local Docker.
Running Evaluation through TensorRT Engine
Evaluation uses the same batch size as the maximum batch size set during engine
generation. The label file is taken from data_config.val_json_file in the specification
file. Use the same specification file that you use for TAO evaluation. The following
is a sample specification file:
data_config {
  image_size: "(832, 1344)"
  augment_input_data: True
  eval_samples: 500
  training_file_pattern: "/workspace/tao-experiments/data/train*.tfrecord"
  validation_file_pattern: "/workspace/tao-experiments/data/val*.tfrecord"
  val_json_file: "/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json"
  # dataset specific parameters
  num_classes: 91
  skip_crowd_during_training: True
}
maskrcnn_config {
  nlayers: 50
  arch: "resnet"
  freeze_bn: True
  freeze_blocks: "[0,1]"
  gt_mask_size: 112
  # Region Proposal Network
  rpn_positive_overlap: 0.7
  rpn_negative_overlap: 0.3
  rpn_batch_size_per_im: 256
  rpn_fg_fraction: 0.5
  rpn_min_size: 0.
  # Proposal layer.
  batch_size_per_im: 512
  fg_fraction: 0.25
  fg_thresh: 0.5
  bg_thresh_hi: 0.5
  bg_thresh_lo: 0.
  # Faster-RCNN heads.
  fast_rcnn_mlp_head_dim: 1024
  bbox_reg_weights: "(10., 10., 5., 5.)"
  # Mask-RCNN heads.
  include_mask: True
  mrcnn_resolution: 28
  # training
  train_rpn_pre_nms_topn: 2000
  train_rpn_post_nms_topn: 1000
  train_rpn_nms_threshold: 0.7
  # evaluation
  test_detections_per_image: 100
  test_nms: 0.5
  test_rpn_pre_nms_topn: 1000
  test_rpn_post_nms_topn: 1000
  test_rpn_nms_thresh: 0.7
  # model architecture
  min_level: 2
  max_level: 6
  num_scales: 1
  aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
  anchor_scale: 8
  # localization loss
  rpn_box_loss_weight: 1.0
  fast_rcnn_box_loss_weight: 1.0
  mrcnn_weight_loss_mask: 1.0
}
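Before launching evaluation, it can help to confirm which annotation file and sample count the spec points at. The following is a minimal sketch, not part of TAO: the helper name read_scalar_fields is hypothetical, and it assumes a flat block like the data_config block in the sample, with one key: value pair per line, no nested braces, and no # characters inside quoted values.

```python
import re
from typing import Dict


def read_scalar_fields(spec_text: str, block_name: str) -> Dict[str, str]:
    """Return the `key: value` fields of one top-level block as strings.

    Hypothetical helper: handles only flat blocks without nested braces.
    """
    # Capture everything between "block_name {" and the first closing brace.
    pattern = rf"\b{re.escape(block_name)}\s*\{{(.*?)\}}"
    match = re.search(pattern, spec_text, re.DOTALL)
    if match is None:
        raise KeyError(f"block {block_name!r} not found")

    fields: Dict[str, str] = {}
    for line in match.group(1).splitlines():
        # Drop trailing comments (assumes no '#' inside quoted values).
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip().strip('"')
    return fields


# Trimmed copy of the sample data_config block shown above.
SAMPLE = """
data_config{
  image_size: "(832, 1344)"
  eval_samples: 500
  val_json_file: "/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json"
  # dataset specific parameters
  num_classes: 91
}
"""

if __name__ == "__main__":
    data_config = read_scalar_fields(SAMPLE, "data_config")
    print("label file:", data_config["val_json_file"])
    print("eval samples:", int(data_config["eval_samples"]))
```

All values come back as strings, so numeric fields such as eval_samples need an explicit int() conversion.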
Ask the agent to run the evaluate action against the engine you built. For example:
Evaluate the Mask RCNN TensorRT engine at
``s3://my-bucket/mrcnn/int8.engine`` against ``eval-spec.yaml``. Run on
local Docker.
Running Inference through TensorRT Engine
The batch size used for inference is the same as the maximum batch size used during engine generation.
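Because the batch size is fixed at engine generation time, an input set whose size is not a multiple of it leaves a short final batch. The sketch below illustrates one common way to handle that, by padding the last batch and tracking how many entries are real. This section does not describe how TAO Deploy treats the final batch, so the padding strategy and the fixed_size_batches helper are assumptions for illustration only.

```python
from typing import Iterator, List, Sequence, Tuple


def fixed_size_batches(
    items: Sequence[str], max_batch_size: int
) -> Iterator[Tuple[List[str], int]]:
    """Yield (batch, valid_count) pairs; every batch has max_batch_size items.

    Hypothetical helper: a short final batch is padded by repeating its
    last item, and valid_count says how many leading entries are real.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be >= 1")
    for start in range(0, len(items), max_batch_size):
        chunk = list(items[start:start + max_batch_size])
        valid = len(chunk)
        # Pad so the batch matches the size the engine was built for.
        chunk += [chunk[-1]] * (max_batch_size - valid)
        yield chunk, valid


if __name__ == "__main__":
    # Hypothetical file names; 10 images with an engine built for batch size 4.
    paths = [f"img_{i:03d}.jpg" for i in range(10)]
    for batch, valid in fixed_size_batches(paths, 4):
        print(valid, batch)
```

Results for the padded entries would be discarded by keeping only the first valid_count outputs of each batch.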
Ask the agent to run the inference action against the engine you built. For example:
Run Mask RCNN inference with the TensorRT engine at
``s3://my-bucket/mrcnn/int8.engine`` using ``infer-spec.yaml``. Run on
local Docker.
Annotated visualizations are written to images_annotated under the configured results
directory, and COCO-format predictions are written to labels.
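To inspect the predictions afterwards, a short script can load them and summarize detections per class. This is a minimal sketch under assumptions: it expects a file in the standard COCO results layout (a JSON list of records with image_id, category_id, bbox, and score), and the file name in the usage example is hypothetical because the exact file names written under labels are not specified here.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List, Union


def load_predictions(
    path: Union[str, Path], min_score: float = 0.0
) -> List[dict]:
    """Load COCO-style detection records, keeping those with score >= min_score."""
    records = json.loads(Path(path).read_text())
    if not isinstance(records, list):
        raise ValueError("expected a JSON list of detection records")
    return [r for r in records if r.get("score", 0.0) >= min_score]


def count_by_category(records: List[dict]) -> Dict[int, int]:
    """Count how many detections each category_id received."""
    return dict(Counter(r["category_id"] for r in records))


if __name__ == "__main__":
    # Hypothetical path: adjust to a file actually present under labels.
    predictions_file = Path("results/labels/predictions.json")
    if predictions_file.exists():
        kept = load_predictions(predictions_file, min_score=0.5)
        print(count_by_category(kept))
```

Filtering by score before counting keeps low-confidence detections from dominating the summary; the 0.5 threshold is only an example value.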