EfficientDet (TF2) with TAO Deploy
The TF2 EfficientDet ONNX model generated by export is taken as input to tao deploy to generate an optimized TensorRT engine. For more information about training TF2 EfficientDet, refer to the TF2 EfficientDet training documentation.
The same spec file can be used as with the tao model efficientdet_tf2 export command.
GenTrtEngine Config
The gen_trt_engine configuration contains the parameters for converting an .onnx model to a TensorRT engine, which can be used for deployment.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
onnx_file | The path to the exported .onnx model | string | |
trt_engine | The path where the generated engine will be stored | string | |
results_dir | The directory to save the output log. If not specified, the log is saved under the global $results_dir/gen_trt_engine | string | |
tensorrt | TensorRT config | Dict | |
The tensorrt configuration contains the specification of the TensorRT engine and the calibration requirements.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
data_type | The precision to be used for the TensorRT engine | string | FP32 |
min_batch_size | The minimum batch size used for the optimization profile shape | unsigned int | 1 |
opt_batch_size | The optimal batch size used for the optimization profile shape | unsigned int | 1 |
max_batch_size | The maximum batch size used for the optimization profile shape | unsigned int | 1 |
max_workspace_size | The maximum workspace size for the TensorRT engine, in GB | unsigned int | 2 |
calibration | Calibration config | Dict | |
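The three batch-size fields describe a TensorRT optimization profile, and TensorRT expects the minimum, optimal, and maximum shapes to be ordered (min <= opt <= max). The sketch below is one possible way to check a tensorrt config before building an engine. The function check_batch_profile is a hypothetical helper, not part of TAO Deploy, and the fallback value of 1 is an assumption taken from the Recommended/Typical Value column.

```python
def check_batch_profile(tensorrt_cfg):
    """Return a list of problems found in the batch-size settings (empty if none)."""
    # Assumed defaults of 1, mirroring the Recommended/Typical Value column.
    lo = tensorrt_cfg.get("min_batch_size", 1)
    opt = tensorrt_cfg.get("opt_batch_size", 1)
    hi = tensorrt_cfg.get("max_batch_size", 1)

    problems = []
    if lo < 1:
        problems.append("min_batch_size must be at least 1")
    if not lo <= opt <= hi:
        problems.append(f"expected min <= opt <= max, got {lo}, {opt}, {hi}")
    return problems


# A consistent profile yields no problems; a misordered one is reported.
print(check_batch_profile({"min_batch_size": 1, "opt_batch_size": 8, "max_batch_size": 16}))
print(check_batch_profile({"min_batch_size": 4, "opt_batch_size": 2, "max_batch_size": 16}))
```

Running a check like this before engine generation surfaces a misordered profile early instead of during the TensorRT build.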
The calibration configuration specifies the location of the calibration data and where to save the calibration cache file.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
cal_image_dir | The directory containing images to be used for calibration | string | False |
cal_cache_file | The path to the calibration cache file | string | False |
cal_batches | The number of batches to be iterated for calibration | unsigned int | 10 |
cal_batch_size | The batch size for each batch | unsigned int | 1 |
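Because calibration iterates cal_batches batches of cal_batch_size images each, the number of images consumed is presumably their product. The sketch below works through that arithmetic. The helper calibration_image_count is hypothetical, and the fallback values are assumptions taken from the Recommended/Typical Value column.

```python
def calibration_image_count(calibration_cfg):
    """Estimate how many images calibration reads: batches times batch size."""
    # Assumed defaults from the table: 10 batches of 1 image each.
    batches = calibration_cfg.get("cal_batches", 10)
    batch_size = calibration_cfg.get("cal_batch_size", 1)
    return batches * batch_size


# With cal_batches: 10 and cal_batch_size: 16, calibration reads 160 images.
print(calibration_image_count({"cal_batches": 10, "cal_batch_size": 16}))  # 160
```

Under this assumption, cal_image_dir should contain at least that many images.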
Below is a sample spec file for TF2 EfficientDet.
dataset:
  augmentation:
    rand_hflip: True
    random_crop_min_scale: 0.1
    random_crop_max_scale: 2
  loader:
    prefetch_size: 4
    shuffle_file: False
    shuffle_buffer: 10000
    cycle_length: 32
    block_length: 16
  max_instances_per_image: 100
  skip_crowd_during_training: True
  num_classes: 91
  train_tfrecords:
    - '/data/train-*'
  val_tfrecords:
    - '/data/val-*'
  val_json_file: '/data/annotations/instances_val2017.json'
train:
  optimizer:
    name: 'sgd'
    momentum: 0.9
  lr_schedule:
    name: 'cosine'
    warmup_epoch: 5
    warmup_init: 0.0001
    learning_rate: 0.2
  amp: True
  checkpoint: ''
  num_examples_per_epoch: 100
  moving_average_decay: 0.999
  batch_size: 20
  checkpoint_interval: 5
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
  clip_gradients_norm: 10.0
  image_preview: True
  qat: False
  random_seed: 42
  pruned_model_path: ''
  num_epochs: 20
model:
  name: 'efficientdet-d0'
  input_width: 512
  input_height: 512
  aspect_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
  anchor_scale: 4
  min_level: 3
  max_level: 7
  num_scales: 3
  freeze_bn: False
  freeze_blocks: []
evaluate:
  batch_size: 8
  num_samples: 500
  max_detections_per_image: 100
  label_map: "/data/coco_labels.yaml"
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  checkpoint: '/weights/efficientdet-d0_100.tlt'
export:
  batch_size: 1
  dynamic_batch_size: True
  min_score_thresh: 0.4
  checkpoint: '/weights/efficientdet-d0_100.tlt'
  onnx_file: "/output/efficientdet-d0.onnx"
gen_trt_engine:
  onnx_file: "/output/efficientdet-d0.onnx"
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  tensorrt:
    data_type: "fp32"
    max_workspace_size: 2 # in Gb
    calibration:
      cal_image_dir: "/data/raw-data/val2017"
      cal_cache_file: "EXPORTDIR/efficientdet-d0.cal"
      cal_batch_size: 16
      cal_batches: 10
inference:
  checkpoint: '/weights/efficientdet-d0_100.tlt'
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  image_dir: "/data/test_samples"
  dump_label: False
  batch_size: 1
  min_score_thresh: 0.4
  label_map: "/data/coco_labels.yaml"
results_dir: '/results'
Use the following command to run TF2 EfficientDet engine generation:
tao deploy efficientdet_tf2 gen_trt_engine -e /path/to/spec.yaml \
    gen_trt_engine.onnx_file=/path/to/onnx/file \
    gen_trt_engine.trt_engine=/path/to/engine/file \
    gen_trt_engine.tensorrt.data_type=<data_type>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation. This should be the same as the export specification file.
Optional Arguments
-h, --help: Show this help message and exit.
-k, --key: A user-specific encoding key to load a .etlt model.
-r, --results_dir: A global results directory where the experiment outputs and logs are written under the <task> subdirectory.
Sample Usage
Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:
tao deploy efficientdet_tf2 gen_trt_engine -e $DEFAULT_SPEC \
    gen_trt_engine.onnx_file=$ONNX_FILE \
    gen_trt_engine.trt_engine=$ENGINE_FILE \
    gen_trt_engine.tensorrt.data_type=fp16
Evaluation with the TensorRT engine uses the same spec file as TAO evaluation.
Use the following command to run TF2 EfficientDet engine evaluation:
tao deploy efficientdet_tf2 evaluate -e /path/to/spec.yaml
Required Arguments
-e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate specification file.
Optional Arguments
-h, --help: Show this help message and exit.
-k, --key: A user-specific encoding key to load a .etlt model.
-r, --results_dir: A global results directory where the experiment outputs and logs are written under the <task> subdirectory.
Sample Usage
Here’s an example of using the evaluate command to run evaluation with the TensorRT engine:
tao deploy efficientdet_tf2 evaluate -e $DEFAULT_SPEC \
    evaluate.trt_engine=$ENGINE_FILE \
    evaluate.results_dir=$RESULTS_DIR
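When these commands are driven from a script, it can help to assemble the argument list programmatically. The sketch below builds the same command shape shown above: the spec file after -e, followed by key=value overrides. The helper build_tao_deploy_cmd is hypothetical and not part of the TAO launcher, and the paths are placeholders borrowed from the sample spec.

```python
import shlex


def build_tao_deploy_cmd(task, spec_path, overrides=None):
    """Build the argv list for `tao deploy efficientdet_tf2 <task>`."""
    cmd = ["tao", "deploy", "efficientdet_tf2", task, "-e", spec_path]
    # Each override is passed as a single key=value token after the spec file.
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


cmd = build_tao_deploy_cmd(
    "evaluate",
    "/path/to/spec.yaml",
    {
        "evaluate.trt_engine": "/output/efficientdet-d0.fp32.engine",
        "evaluate.results_dir": "/results",
    },
)
# Print a shell-safe rendering; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems when a path contains spaces.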
Inference with the TensorRT engine uses the same spec file as TAO inference.
Use the following command to run TF2 EfficientDet engine inference:
tao deploy efficientdet_tf2 inference -e /path/to/spec.yaml \
inference.trt_engine=/path/to/engine/file \
inference.results_dir=/path/to/outputs
Required Arguments
-e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference specification file.
Optional Arguments
-h, --help: Show this help message and exit.
-k, --key: A user-specific encoding key to load a .etlt model.
-r, --results_dir: A global results directory where the experiment outputs and logs are written under the <task> subdirectory.
Sample Usage
Here’s an example of using the inference command to run inference with the TensorRT engine:
tao deploy efficientdet_tf2 inference -e $DEFAULT_SPEC \
    inference.trt_engine=$ENGINE_FILE \
    inference.results_dir=$RESULTS_DIR
The visualization will be stored under $RESULTS_DIR/images_annotated, and KITTI-format predictions will be stored under $RESULTS_DIR/labels.
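The label files can be post-processed with a small parser. The sketch below assumes the standard KITTI object-label column layout (class name first, 2D box in columns 5 to 8, and an optional trailing score) and class names without whitespace. The exact columns that TAO Deploy fills in are not stated here, so treat the field mapping as an assumption. The function parse_kitti_line and the sample line are hypothetical.

```python
def parse_kitti_line(line):
    """Parse one KITTI-format label line into a dict."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    return {
        "class_name": fields[0],
        # Fields 4-7 (zero-based) hold the 2D box: left, top, right, bottom in pixels.
        "bbox": tuple(float(v) for v in fields[4:8]),
        # A 16th field, when present, is taken as the detection score.
        "score": float(fields[15]) if len(fields) > 15 else None,
    }


# Hypothetical prediction line, for illustration only.
sample = "person 0 0 0 10.0 20.0 110.0 220.0 0 0 0 0 0 0 0 0.87"
print(parse_kitti_line(sample))
```

Each file under $RESULTS_DIR/labels can then be read line by line and passed through the parser.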