
EfficientDet (TF2) with TAO Deploy

The TF2 EfficientDet ONNX model generated by tao model efficientdet_tf2 export is taken as input to tao deploy to generate an optimized TensorRT engine. For more information about training a TF2 EfficientDet model, refer to the TF2 EfficientDet training documentation.

The same spec file can be used with the tao model efficientdet_tf2 export command.
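
For orientation, the two steps can be chained as shown below. This is a minimal sketch: the spec path is a placeholder, and both commands assume the spec file contains the export and gen_trt_engine sections described on this page.

# Export the trained checkpoint to ONNX
tao model efficientdet_tf2 export -e /path/to/spec.yaml

# Build a TensorRT engine from the exported ONNX model, reusing the same spec
tao deploy efficientdet_tf2 gen_trt_engine -e /path/to/spec.yaml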

GenTrtEngine Config

The gen_trt_engine configuration contains the parameters for converting an exported .onnx model to a TensorRT engine that can be used for deployment.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
| --- | --- | --- | --- |
| onnx_file | The path to the exported .onnx model | string | |
| trt_engine | The path where the generated engine will be stored | string | |
| results_dir | The directory to save the output log. If not specified, the log is saved under the global $results_dir/gen_trt_engine | string | |
| tensorrt | The TensorRT configuration | Dict | |

The tensorrt configuration contains the specification of the TensorRT engine and calibration requirements.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
| --- | --- | --- | --- |
| data_type | The precision to be used for the TensorRT engine | string | FP32 |
| min_batch_size | The minimum batch size used for the optimization profile shape | unsigned int | 1 |
| opt_batch_size | The optimal batch size used for the optimization profile shape | unsigned int | 1 |
| max_batch_size | The maximum batch size used for the optimization profile shape | unsigned int | 1 |
| max_workspace_size | The maximum workspace size for the TensorRT engine, in GB | unsigned int | 2 |
| calibration | The calibration configuration | Dict | |
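
For example, a dynamic-batch engine can be requested by setting the optimization profile fields in this section. The following is a minimal sketch; the batch sizes chosen here (1/4/8) are illustrative assumptions, not recommendations:

tensorrt:
  data_type: "fp32"
  min_batch_size: 1      # smallest batch size the profile accepts
  opt_batch_size: 4      # batch size TensorRT optimizes for (assumed value)
  max_batch_size: 8      # largest batch size the profile accepts (assumed value)
  max_workspace_size: 2  # in GB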

The calibration configuration specifies the location of the calibration data and where to save the calibration cache file.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
| --- | --- | --- | --- |
| cal_image_dir | The directory containing images to be used for calibration | string | |
| cal_cache_file | The path to the calibration cache file | string | |
| cal_batches | The number of batches to iterate over for calibration | unsigned int | 10 |
| cal_batch_size | The batch size for each calibration batch | unsigned int | 1 |
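
When data_type is set to int8, the calibration fields above come into play. Below is a minimal sketch of a gen_trt_engine section for INT8, built only from the fields documented above; all paths are placeholders:

gen_trt_engine:
  onnx_file: "/output/efficientdet-d0.onnx"
  trt_engine: "/output/efficientdet-d0.int8.engine"
  tensorrt:
    data_type: "int8"
    max_workspace_size: 2  # in GB
    calibration:
      cal_image_dir: "/data/raw-data/val2017"        # images used to compute INT8 scales (placeholder path)
      cal_cache_file: "/output/efficientdet-d0.cal"  # calibration cache is written here (placeholder path)
      cal_batch_size: 16
      cal_batches: 10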

Below is a sample spec file for TF2 EfficientDet.

dataset:
  augmentation:
    rand_hflip: True
    random_crop_min_scale: 0.1
    random_crop_max_scale: 2
  loader:
    prefetch_size: 4
    shuffle_file: False
    shuffle_buffer: 10000
    cycle_length: 32
    block_length: 16
  max_instances_per_image: 100
  skip_crowd_during_training: True
  num_classes: 91
  train_tfrecords:
    - '/data/train-*'
  val_tfrecords:
    - '/data/val-*'
  val_json_file: '/data/annotations/instances_val2017.json'
train:
  optimizer:
    name: 'sgd'
    momentum: 0.9
  lr_schedule:
    name: 'cosine'
    warmup_epoch: 5
    warmup_init: 0.0001
    learning_rate: 0.2
  amp: True
  checkpoint: ''
  num_examples_per_epoch: 100
  moving_average_decay: 0.999
  batch_size: 20
  checkpoint_interval: 5
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
  clip_gradients_norm: 10.0
  image_preview: True
  qat: False
  random_seed: 42
  pruned_model_path: ''
  num_epochs: 20
model:
  name: 'efficientdet-d0'
  input_width: 512
  input_height: 512
  aspect_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
  anchor_scale: 4
  min_level: 3
  max_level: 7
  num_scales: 3
  freeze_bn: False
  freeze_blocks: []
evaluate:
  batch_size: 8
  num_samples: 500
  max_detections_per_image: 100
  label_map: "/data/coco_labels.yaml"
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  checkpoint: '/weights/efficientdet-d0_100.tlt'
export:
  batch_size: 1
  dynamic_batch_size: True
  min_score_thresh: 0.4
  checkpoint: '/weights/efficientdet-d0_100.tlt'
  onnx_file: "/output/efficientdet-d0.onnx"
gen_trt_engine:
  onnx_file: "/output/efficientdet-d0.onnx"
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  tensorrt:
    data_type: "fp32"
    max_workspace_size: 2  # in GB
    calibration:
      cal_image_dir: "/data/raw-data/val2017"
      cal_cache_file: "EXPORTDIR/efficientdet-d0.cal"
      cal_batch_size: 16
      cal_batches: 10
inference:
  checkpoint: '/weights/efficientdet-d0_100.tlt'
  trt_engine: "/output/efficientdet-d0.fp32.engine"
  image_dir: "/data/test_samples"
  dump_label: False
  batch_size: 1
  min_score_thresh: 0.4
  label_map: "/data/coco_labels.yaml"
results_dir: '/results'

Use the following command to run TF2 EfficientDet engine generation:

tao deploy efficientdet_tf2 gen_trt_engine -e /path/to/spec.yaml \
    gen_trt_engine.onnx_file=/path/to/onnx/file \
    gen_trt_engine.trt_engine=/path/to/engine/file \
    gen_trt_engine.tensorrt.data_type=<data_type>


Required Arguments

  • -e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation. This should be the same as the export specification file.

Optional Arguments

  • -h, --help: Show this help message and exit.

  • -k, --key: A user-specific encoding key to load a .etlt model.

  • -r, --results_dir: A global results directory where the experiment outputs and logs are written, under the <task> subdirectory.

Sample Usage

Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:

tao deploy efficientdet_tf2 gen_trt_engine -e $DEFAULT_SPEC \
    gen_trt_engine.onnx_file=$ONNX_FILE \
    gen_trt_engine.trt_engine=$ENGINE_FILE \
    gen_trt_engine.tensorrt.data_type=fp16
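
Similarly, an INT8 engine can be requested by overriding the precision and calibration inputs on the command line. This sketch assumes the nested calibration keys can be overridden the same way as the keys above; $CAL_IMAGE_DIR and $CAL_CACHE_FILE are placeholder variables:

tao deploy efficientdet_tf2 gen_trt_engine -e $DEFAULT_SPEC \
    gen_trt_engine.onnx_file=$ONNX_FILE \
    gen_trt_engine.trt_engine=$ENGINE_FILE \
    gen_trt_engine.tensorrt.data_type=int8 \
    gen_trt_engine.tensorrt.calibration.cal_image_dir=$CAL_IMAGE_DIR \
    gen_trt_engine.tensorrt.calibration.cal_cache_file=$CAL_CACHE_FILE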


Running Evaluation through TensorRT Engine

You can reuse the TAO evaluation spec file for evaluation through a TensorRT engine.

Use the following command to run TF2 EfficientDet engine evaluation:


tao deploy efficientdet_tf2 evaluate -e /path/to/spec.yaml

Required Arguments

  • -e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate specification file.

Optional Arguments

  • -h, --help: Show this help message and exit.

  • -k, --key: A user-specific encoding key to load a .etlt model.

  • -r, --results_dir: A global results directory where the experiment outputs and logs are written, under the <task> subdirectory.

Sample Usage

Here’s an example of using the evaluate command to run evaluation with the TensorRT engine:


tao deploy efficientdet_tf2 evaluate -e $DEFAULT_SPEC \
    evaluate.trt_engine=$ENGINE_FILE \
    evaluate.results_dir=$RESULTS_DIR


Running Inference through TensorRT Engine

You can reuse the TAO inference spec file for inference through a TensorRT engine.

Use the following command to run TF2 EfficientDet engine inference:


tao deploy efficientdet_tf2 inference -e /path/to/spec.yaml \
    inference.trt_engine=/path/to/engine/file \
    inference.results_dir=/path/to/outputs

Required Arguments

  • -e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference specification file.

Optional Arguments

  • -h, --help: Show this help message and exit.

  • -k, --key: A user-specific encoding key to load a .etlt model.

  • -r, --results_dir: A global results directory where the experiment outputs and logs are written, under the <task> subdirectory.

Sample Usage

Here’s an example of using the inference command to run inference with the TensorRT engine:


tao deploy efficientdet_tf2 inference -e $DEFAULT_SPEC \
    inference.trt_engine=$ENGINE_FILE \
    inference.results_dir=$RESULTS_DIR

The visualizations are stored under $RESULTS_DIR/images_annotated, and KITTI-format predictions are stored under $RESULTS_DIR/labels.
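
For reference, each file under $RESULTS_DIR/labels holds one detection per line in KITTI label format (class name, truncation, occlusion, alpha, the bounding box as xmin/ymin/xmax/ymax, the unused 3D fields, and a trailing confidence score). The line below is illustrative only; the class name, coordinates, and score are made-up values:

car 0.00 0 0.00 634.22 172.51 701.78 250.65 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.89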
