RetinaNet with TAO Deploy
The gen_trt_engine action generates an optimized TensorRT engine for RetinaNet from an
ONNX file previously produced by the RetinaNet export action. For more information about
training RetinaNet, refer to the RetinaNet training documentation.
Converting an .onnx File into TensorRT Engine
You can reuse the spec from the RetinaNet export action as a starting point.
Note
When generating a TensorRT engine for a model trained with QAT enabled, the tensor
scale factors defined by the calibration cache file are required. However, the current
version of QAT does not natively support DLA INT8 deployment on Jetson. To deploy such a
model on a Jetson with DLA INT8, force post-training quantization to generate the
calibration cache file.
Ask the agent to run the gen_trt_engine action against your spec. For example:
Build an INT8 TensorRT engine for RetinaNet from the exported ONNX at
``s3://my-bucket/retinanet/model.onnx`` using ``trt-spec.yaml``.
Calibrate against ``s3://my-bucket/retinanet/cal-images/`` and write the
engine to ``s3://my-bucket/retinanet/int8.engine``. Run on the local Docker daemon.
Running Evaluation through TensorRT Engine
Evaluation through the TensorRT engine uses the same specification file as TAO evaluation. The following is a sample specification file:
eval_config {
  batch_size: 8
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
}
augmentation_config {
  output_width: 1248
  output_height: 384
  output_channel: 3
}
dataset_config {
  validation_data_sources: {
    image_directory_path: "/workspace/tao-experiments/data/val/images"
    label_directory_path: "/workspace/tao-experiments/data/val/labels"
  }
  image_extension: "png"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "pedestrian"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  target_class_mapping {
    key: "person_sitting"
    value: "pedestrian"
  }
  validation_fold: 0
}
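The target_class_mapping entries in the sample spec fold several source label names into a smaller set of target classes (for example, "van" is evaluated as "car"). The following Python sketch illustrates that folding on a list of label names. It is only an illustration, not TAO's implementation: the helper names `map_label` and `count_targets` are hypothetical, and dropping labels that have no mapping entry is an assumption made here for simplicity.

```python
from collections import Counter

# Mapping copied from the target_class_mapping entries in the sample spec.
TARGET_CLASS_MAPPING = {
    "car": "car",
    "pedestrian": "pedestrian",
    "cyclist": "cyclist",
    "van": "car",
    "person_sitting": "pedestrian",
}


def map_label(source_class):
    """Return the target class for a source label, or None if it is unmapped."""
    return TARGET_CLASS_MAPPING.get(source_class)


def count_targets(source_labels):
    """Count target classes after mapping; unmapped labels are skipped (assumption)."""
    mapped = (map_label(label) for label in source_labels)
    return Counter(target for target in mapped if target is not None)


if __name__ == "__main__":
    labels = ["van", "car", "tram", "person_sitting"]
    print(count_targets(labels))
```

In this sketch, "tram" has no entry in the mapping and therefore does not contribute to any target class count.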
Ask the agent to run the evaluate action against the engine you built. For example:
Evaluate the RetinaNet TensorRT engine at
``s3://my-bucket/retinanet/int8.engine`` against ``eval-spec.yaml``. Run
on local Docker.
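The matching_iou_threshold field in the sample spec (0.5) sets how much a predicted box must overlap a ground-truth box to count as a match during evaluation. As a rough illustration of the underlying quantity, the sketch below computes intersection over union (IoU) for two axis-aligned boxes. The function names and the `(xmin, ymin, xmax, ymax)` box layout are assumptions for this example, as is the use of `>=` for the comparison; the exact matching logic inside the evaluator is not described in this document.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """Treat a prediction as matching when IoU reaches the threshold (assumption)."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (5, 0, 15, 10)
    print(iou(pred, gt), is_match(pred, gt))
```

Here the two boxes share half of their width, giving an intersection of 50 and a union of 150, so the pair falls below a 0.5 threshold.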
Running Inference through TensorRT Engine
Ask the agent to run the inference action against the engine you built. For example:
Run RetinaNet inference with the TensorRT engine at
``s3://my-bucket/retinanet/int8.engine`` using ``infer-spec.yaml``. Run
on your chosen backend.
Annotated visualizations are written to images_annotated under the configured results
directory, and KITTI-format predictions are written to labels.
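To post-process the KITTI-format predictions, a small parser can be enough. The sketch below assumes the standard KITTI label layout (class name, truncation, occlusion, alpha, four bounding-box coordinates, three dimensions, three location values, rotation) with an optional confidence score as a 16th field. Whether the files in labels follow exactly this column layout is an assumption to verify against your own output; the `KittiDetection` type and `parse_kitti_line` function are hypothetical names for this example.

```python
from typing import NamedTuple, Optional


class KittiDetection(NamedTuple):
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    score: Optional[float]


def parse_kitti_line(line):
    """Parse one KITTI label line into a KittiDetection.

    Assumes 15 whitespace-separated fields, plus an optional 16th score field.
    """
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(fields)}")
    # Fields 4..7 hold the 2D bounding box: left, top, right, bottom.
    xmin, ymin, xmax, ymax = (float(value) for value in fields[4:8])
    score = float(fields[15]) if len(fields) == 16 else None
    return KittiDetection(fields[0], xmin, ymin, xmax, ymax, score)


if __name__ == "__main__":
    sample = "car 0 0 0 100.0 120.0 200.0 220.0 0 0 0 0 0 0 0 0.87"
    print(parse_kitti_line(sample))
```

Only the class name, 2D box, and score are kept here, since the remaining KITTI fields describe 3D properties that a 2D detector typically leaves at zero.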