Date: 2024-08-26

The same spec file can be used with the tao model efficientdet_tf2 export command.

The gen_trt_engine configuration contains the parameters for converting an exported .onnx model into a TensorRT engine, which can then be used for deployment.

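+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| Field              | Description                                                | Data Type and Constraints | Recommended/Typical Value |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| onnx_file          | The path to the exported .onnx model                       | string                    |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| trt_engine         | The path where the generated engine will be stored         | string                    |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| results_dir        | The directory where the output log is saved. If not        | string                    |                           |
|                    | specified, the log is saved under the global               |                           |                           |
|                    | $results_dir/gen_trt_engine                                |                           |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| tensorrt           | TensorRT config                                            | Dict                      |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+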

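The tensorrt configuration specifies the TensorRT engine settings and the calibration requirements.

+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| Field              | Description                                                | Data Type and Constraints | Recommended/Typical Value |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| data_type          | The precision to be used for the TensorRT engine           | string                    | FP32                      |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| min_batch_size     | The minimum batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| opt_batch_size     | The optimal batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| max_batch_size     | The maximum batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| max_workspace_size | The maximum workspace size for the TensorRT engine (in GB) | unsigned int              | 2                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| calibration        | Calibration config                                         | Dict                      |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+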
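The three batch-size fields describe a single optimization profile, so they only make sense when min <= opt <= max. The following is a minimal Python sketch of a pre-flight check on the tensorrt section. The function name `check_tensorrt_config`, the set of accepted precision strings, and the rule that INT8 needs a calibration section are assumptions made for illustration; none of this is part of TAO Deploy itself.

```python
# Hypothetical pre-flight check for a `tensorrt` config dict; not a TAO Deploy API.

# Assumption: these are the precision strings the tool accepts (case-insensitive).
ASSUMED_PRECISIONS = {"fp32", "fp16", "int8"}


def check_tensorrt_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in a tensorrt config dict."""
    problems = []

    data_type = str(cfg.get("data_type", "fp32")).lower()
    if data_type not in ASSUMED_PRECISIONS:
        problems.append(f"unknown data_type: {data_type!r}")

    # Defaults mirror the typical values listed in the table above.
    min_bs = cfg.get("min_batch_size", 1)
    opt_bs = cfg.get("opt_batch_size", 1)
    max_bs = cfg.get("max_batch_size", 1)
    if not (1 <= min_bs <= opt_bs <= max_bs):
        problems.append(
            "batch sizes must satisfy 1 <= min <= opt <= max, "
            f"got {min_bs}, {opt_bs}, {max_bs}"
        )

    if cfg.get("max_workspace_size", 2) <= 0:
        problems.append("max_workspace_size must be positive")

    # Assumption: an INT8 engine needs calibration settings to be present.
    if data_type == "int8" and not cfg.get("calibration"):
        problems.append("int8 requested but no calibration section given")

    return problems
```

An empty list means the section passed these checks; it does not guarantee that engine generation will succeed.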

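The calibration configuration specifies the location of the calibration data and where to save the calibration cache file.

+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| Field              | Description                                                | Data Type and Constraints | Recommended/Typical Value |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_image_dir      | The directory containing images to be used for calibration | string                    | False                     |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_cache_file     | The path to calibration cache file                         | string                    | False                     |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_batches        | The number of batches to be iterated for calibration       | unsigned int              | 10                        |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_batch_size     | The batch size for each batch                              | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+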
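Since calibration iterates over cal_batches batches of cal_batch_size images each, the calibration directory presumably needs at least cal_batches × cal_batch_size images. The sketch below checks that before a long engine build is started. The helper names and the list of image extensions are assumptions for illustration, not TAO Deploy behavior.

```python
from pathlib import Path

# Assumption: files with these extensions count as calibration images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_needed(cal_batches: int, cal_batch_size: int) -> int:
    """Number of images consumed if every calibration batch is full."""
    return cal_batches * cal_batch_size


def count_images(cal_image_dir: str) -> int:
    """Count image files directly inside cal_image_dir (non-recursive)."""
    return sum(
        1
        for p in Path(cal_image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def has_enough_images(cal_image_dir: str, cal_batches: int, cal_batch_size: int) -> bool:
    """True if the directory holds enough images for the requested calibration run."""
    return count_images(cal_image_dir) >= images_needed(cal_batches, cal_batch_size)
```

For example, with cal_batches: 10 and cal_batch_size: 16, `images_needed(10, 16)` gives 160 images.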

Below is a sample spec file for TF2 EfficientDet.

    dataset:
      augmentation:
        rand_hflip: True
        random_crop_min_scale: 0.1
        random_crop_max_scale: 2
      loader:
        prefetch_size: 4
        shuffle_file: False
        shuffle_buffer: 10000
        cycle_length: 32
        block_length: 16
      max_instances_per_image: 100
      skip_crowd_during_training: True
      num_classes: 91
      train_tfrecords:
        - '/data/train-*'
      val_tfrecords:
        - '/data/val-*'
      val_json_file: '/data/annotations/instances_val2017.json'
    train:
      optimizer:
        name: 'sgd'
        momentum: 0.9
      lr_schedule:
        name: 'cosine'
        warmup_epoch: 5
        warmup_init: 0.0001
        learning_rate: 0.2
      amp: True
      checkpoint: ''
      num_examples_per_epoch: 100
      moving_average_decay: 0.999
      batch_size: 20
      checkpoint_interval: 5
      l2_weight_decay: 0.00004
      l1_weight_decay: 0.0
      clip_gradients_norm: 10.0
      image_preview: True
      qat: False
      random_seed: 42
      pruned_model_path: ''
      num_epochs: 20
    model:
      name: 'efficientdet-d0'
      input_width: 512
      input_height: 512
      aspect_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
      anchor_scale: 4
      min_level: 3
      max_level: 7
      num_scales: 3
      freeze_bn: False
      freeze_blocks: []
    evaluate:
      batch_size: 8
      num_samples: 500
      max_detections_per_image: 100
      label_map: "/data/coco_labels.yaml"
      trt_engine: "/output/efficientdet-d0.fp32.engine"
      checkpoint: '/weights/efficientdet-d0_100.tlt'
    export:
      batch_size: 1
      dynamic_batch_size: True
      min_score_thresh: 0.4
      checkpoint: '/weights/efficientdet-d0_100.tlt'
      onnx_file: "/output/efficientdet-d0.onnx"
    gen_trt_engine:
      onnx_file: "/output/efficientdet-d0.onnx"
      trt_engine: "/output/efficientdet-d0.fp32.engine"
      tensorrt:
        data_type: "fp32"
        max_workspace_size: 2  # in GB
        calibration:
          cal_image_dir: "/data/raw-data/val2017"
          cal_cache_file: "EXPORTDIR/efficientdet-d0.cal"
          cal_batch_size: 16
          cal_batches: 10
    inference:
      checkpoint: '/weights/efficientdet-d0_100.tlt'
      trt_engine: "/output/efficientdet-d0.fp32.engine"
      image_dir: "/data/test_samples"
      dump_label: False
      batch_size: 1
      min_score_thresh: 0.4
      label_map: "/data/coco_labels.yaml"
    results_dir: '/results'

Use the following command to run TF2 EfficientDet engine generation:

    tao deploy efficientdet_tf2 gen_trt_engine -e /path/to/spec.yaml \
        gen_trt_engine.onnx_file=/path/to/onnx/file \
        gen_trt_engine.trt_engine=/path/to/engine/file \
        gen_trt_engine.tensorrt.data_type=<data_type>





-e, --experiment_spec : The experiment spec file to set up the TensorRT engine generation. This should be the same as the export specification file.

-h, --help : Show this help message and exit.

-k, --key : A user-specific encoding key to load a .etlt model.

-r, --results_dir : The global results directory. Experiment outputs and logs are written to a <task> subdirectory under it.
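When engines are built from a script, it can be easier to assemble the command as an argument list than to concatenate strings. The sketch below only builds and prints the command; it does not run it. The function name `build_gen_trt_engine_cmd` is hypothetical, and the override keys are assumed to mirror the gen_trt_engine section of the spec file (onnx_file, trt_engine, tensorrt.data_type).

```python
import shlex


def build_gen_trt_engine_cmd(spec, onnx_file, trt_engine, data_type="fp32", results_dir=None):
    """Build the argument list for a gen_trt_engine run (hypothetical helper)."""
    cmd = ["tao", "deploy", "efficientdet_tf2", "gen_trt_engine", "-e", spec]
    if results_dir is not None:
        # -r sets the global results directory described above.
        cmd += ["-r", results_dir]
    # Assumption: overrides use the same keys as the gen_trt_engine spec section.
    cmd += [
        f"gen_trt_engine.onnx_file={onnx_file}",
        f"gen_trt_engine.trt_engine={trt_engine}",
        f"gen_trt_engine.tensorrt.data_type={data_type}",
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_gen_trt_engine_cmd(
        "/path/to/spec.yaml",
        "/path/to/onnx/file",
        "/path/to/engine/file",
        data_type="fp16",
    )
    # shlex.join quotes arguments safely for display or logging.
    print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.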

Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:

    tao deploy efficientdet_tf2 gen_trt_engine -e $DEFAULT_SPEC \
        gen_trt_engine.onnx_file=$ETLT_FILE \
        gen_trt_engine.trt_engine=$ENGINE_FILE \
        gen_trt_engine.tensorrt.data_type=fp16



