Mask RCNN with TAO Deploy

To generate an optimized TensorRT engine for Mask RCNN, the gen_trt_engine action takes a UFF or ONNX file previously produced by the Mask RCNN export action. For more information about training a Mask RCNN model, refer to the Mask RCNN training documentation.

Converting .uff File into TensorRT Engine

You can reuse the spec from the Mask RCNN export action as a starting point.

Note

When generating a TensorRT engine for a model trained with QAT enabled, the tensor scale factors defined by the calibration cache file are required. However, the current version of QAT does not natively support DLA INT8 deployment on Jetson. To deploy this model on a Jetson with DLA INT8, force post-training quantization to generate the calibration cache file.

Ask the agent to run the gen_trt_engine action against your spec. For example:

Build an INT8 TensorRT engine for Mask RCNN from the exported UFF at
``s3://my-bucket/mrcnn/mrcnn.uff`` using ``trt-spec.yaml``. Calibrate
against ``s3://my-bucket/mrcnn/cal-images/`` and write the engine to
``s3://my-bucket/mrcnn/int8.engine``. Run on local Docker.

Running Evaluation through TensorRT Engine

The batch size used for evaluation is the same as the maximum batch size used during engine generation. The label file is derived from data_config.val_json_file in the specification file. Use the same specification file that you use for TAO evaluation. The following is a sample specification file:

data_config {
    image_size: "(832, 1344)"
    augment_input_data: True
    eval_samples: 500
    training_file_pattern: "/workspace/tao-experiments/data/train*.tfrecord"
    validation_file_pattern: "/workspace/tao-experiments/data/val*.tfrecord"
    val_json_file: "/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json"

    # dataset specific parameters
    num_classes: 91
    skip_crowd_during_training: True
}
maskrcnn_config {
    nlayers: 50
    arch: "resnet"
    freeze_bn: True
    freeze_blocks: "[0,1]"
    gt_mask_size: 112

    # Region Proposal Network
    rpn_positive_overlap: 0.7
    rpn_negative_overlap: 0.3
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_min_size: 0.

    # Proposal layer.
    batch_size_per_im: 512
    fg_fraction: 0.25
    fg_thresh: 0.5
    bg_thresh_hi: 0.5
    bg_thresh_lo: 0.

    # Faster-RCNN heads.
    fast_rcnn_mlp_head_dim: 1024
    bbox_reg_weights: "(10., 10., 5., 5.)"

    # Mask-RCNN heads.
    include_mask: True
    mrcnn_resolution: 28

    # training
    train_rpn_pre_nms_topn: 2000
    train_rpn_post_nms_topn: 1000
    train_rpn_nms_threshold: 0.7

    # evaluation
    test_detections_per_image: 100
    test_nms: 0.5
    test_rpn_pre_nms_topn: 1000
    test_rpn_post_nms_topn: 1000
    test_rpn_nms_thresh: 0.7

    # model architecture
    min_level: 2
    max_level: 6
    num_scales: 1
    aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
    anchor_scale: 8

    # localization loss
    rpn_box_loss_weight: 1.0
    fast_rcnn_box_loss_weight: 1.0
    mrcnn_weight_loss_mask: 1.0
}
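
Because evaluation reads its labels from val_json_file, it can help to inspect that file before running the action. The following is a minimal sketch, assuming the file follows the standard COCO annotation layout with a top-level "categories" list. The helper name load_label_map is hypothetical and is not part of TAO Deploy.

```python
import json
from pathlib import Path


def load_label_map(val_json_file):
    """Return {category_id: category_name} from a COCO-style annotation file.

    Hypothetical helper for illustration only; it assumes the file has a
    top-level "categories" list whose entries carry "id" and "name" keys.
    """
    with Path(val_json_file).open("r", encoding="utf-8") as f:
        coco = json.load(f)
    # A missing "categories" key yields an empty map instead of raising.
    return {cat["id"]: cat["name"] for cat in coco.get("categories", [])}
```

For example, calling load_label_map with the val_json_file path from the sample specification returns the category IDs and names that the ground-truth annotations define, which you can compare against the classes you expect the engine to predict.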

Ask the agent to run the evaluate action against the engine you built. For example:

Evaluate the Mask RCNN TensorRT engine at
``s3://my-bucket/mrcnn/int8.engine`` against ``eval-spec.yaml``. Run on
local Docker.

Running Inference through TensorRT Engine

The batch size used for inference is the same as the maximum batch size used during engine generation.

Ask the agent to run the inference action against the engine you built. For example:

Run Mask RCNN inference with the TensorRT engine at
``s3://my-bucket/mrcnn/int8.engine`` using ``infer-spec.yaml``. Run on
your chosen backend.

Annotated visualizations are written to images_annotated and COCO-format predictions are written to labels, both under the configured results directory.
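
To post-process the predictions, you can load them with a few lines of Python. The following is a sketch under stated assumptions: it assumes that labels contains one or more .json files and that each file holds a JSON list of COCO result records with a "score" field. Neither the file naming nor the exact record layout is specified on this page, so verify both against your own output. The helper name load_predictions is hypothetical.

```python
import json
from pathlib import Path


def load_predictions(labels_dir, min_score=0.5):
    """Collect prediction records with score >= min_score.

    Hypothetical helper for illustration only. Assumes every *.json file in
    labels_dir is a JSON list of COCO-style result dictionaries.
    """
    kept = []
    for path in sorted(Path(labels_dir).glob("*.json")):
        with path.open("r", encoding="utf-8") as f:
            records = json.load(f)
        for rec in records:
            # Records without a "score" key are treated as score 0.0.
            if rec.get("score", 0.0) >= min_score:
                kept.append(rec)
    return kept
```

Raising min_score keeps only the more confident detections, which is often useful when reviewing results next to the annotated images.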