
OCDNet with TAO Deploy

To generate an optimized TensorRT engine, tao-deploy takes as input the OCDNet .onnx file generated by tao export. For more information about training an OCDNet model, refer to the OCDNet training documentation.

gen_trt_engine

The gen_trt_engine parameter in the experiment specification file provides options to generate a TensorRT engine from the .onnx file. The following is a sample config:

gen_trt_engine:
  width: 1280
  height: 736
  img_mode: BGR
  onnx_file: '/results/export/model_best.onnx'
  trt_engine: /results/export/model_int8.engine
  tensorrt:
    data_type: int8
    workspace_size: 20480
    min_batch_size: 1
    opt_batch_size: 1
    max_batch_size: 1
    layers_precision: [
      "/backbone/patch_embed/stem/stem.0/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/conv_dw/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/norm/ReduceMean:fp32"
    ]
    calibration:
      cal_image_dir: /data/ocdnet_vit/train/img
      cal_cache_file: /results/export/cal.bin
      cal_batch_size: 8
      cal_num_batches: 2

Parameter    Datatype      Default  Description                                          Supported Values
-----------  ------------  -------  ---------------------------------------------------  ----------------
results_dir  string        –        The path to the results directory                    –
onnx_file    string        –        The path to the .onnx model                          –
trt_engine   string        –        The absolute path to the generated TensorRT engine   –
width        unsigned int  –        The input width                                      >0
height       unsigned int  –        The input height                                     >0
img_mode     string        BGR      The input image mode                                 BGR, RGB, GRAY
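
If you only need an FP32 or FP16 engine, no calibration is required and the calibration section can be left out. The following is a minimal sketch of such a spec (the paths are placeholders; parameters not shown keep their defaults):

gen_trt_engine:
  width: 1280
  height: 736
  img_mode: BGR
  onnx_file: /results/export/model_best.onnx
  trt_engine: /results/export/model_fp16.engine
  tensorrt:
    data_type: fp16
    workspace_size: 20480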

tensorrt

The tensorrt parameter configures TensorRT engine generation.

Parameter         Datatype      Default  Description                                                        Supported Values
----------------  ------------  -------  -----------------------------------------------------------------  -------------------------
data_type         string        fp32     The precision to use for the TensorRT engine                        fp32/fp16/int8
workspace_size    unsigned int  1024     The maximum workspace size for the TensorRT engine                  >1024
min_batch_size    unsigned int  1        The minimum batch size for the optimization profile shape           >0
opt_batch_size    unsigned int  1        The optimal batch size for the optimization profile shape           >0
max_batch_size    unsigned int  1        The maximum batch size for the optimization profile shape           >0
layers_precision  list          –        A list of per-layer precision overrides, each given as              precision: fp32/fp16/int8
                                         layerName:precision
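
Each layers_precision entry pins one layer, identified by its ONNX node name, to a precision that overrides data_type. This is useful for keeping numerically sensitive layers at higher precision in an INT8 engine. A minimal sketch, reusing a layer name from the sample config above (the exact names depend on your exported model):

tensorrt:
  data_type: int8
  layers_precision: [
    "/backbone/patch_embed/stem/stem.0/Conv:fp32"
  ]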

calibration

The calibration parameter configures PTQ INT8 calibration during TensorRT engine generation.

Parameter        Datatype      Default  Description                                                Supported Values
---------------  ------------  -------  ---------------------------------------------------------  ----------------
cal_image_dir    string list   –        A list of paths that contain images used for calibration   –
cal_cache_file   string        –        The path where the calibration cache file is written       –
cal_batch_size   unsigned int  1        The batch size per batch during calibration                >0
cal_num_batches  unsigned int  1        The number of batches to calibrate                         >0
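
Calibration reads cal_num_batches batches of cal_batch_size images each, so cal_image_dir should contain at least cal_batch_size × cal_num_batches images. A minimal sketch of the block, nested under tensorrt as in the sample config above:

tensorrt:
  data_type: int8
  calibration:
    cal_image_dir: /data/ocdnet_vit/train/img
    cal_cache_file: /results/export/cal.bin
    cal_batch_size: 8
    cal_num_batches: 2   # 8 x 2 = 16 images are consumed for calibration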

Use the following command to run OCDNet engine generation:

tao deploy ocdnet gen_trt_engine -e /path/to/spec.yaml \
    gen_trt_engine.onnx_file=/path/to/onnx/file \
    gen_trt_engine.trt_engine=/path/to/engine/file \
    gen_trt_engine.tensorrt.data_type=<data_type>

Required Arguments

  • -e, --experiment_spec_file: The path to the experiment spec file

  • gen_trt_engine.onnx_file: The .onnx model to be converted. This argument can be omitted if it is defined in the spec file.

  • gen_trt_engine.trt_engine: The path where the generated engine will be stored. This argument can be omitted if it is defined in the spec file.

  • gen_trt_engine.tensorrt.data_type: The precision to use for the exported engine. This argument can be omitted if it is defined in the spec file.

Optional Arguments

  • -r, --results_dir: The directory where the status log JSON file will be dumped

Sample Usage

Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:

tao deploy ocdnet gen_trt_engine -e $SPECS_DIR/gen_trt_engine.yaml \
    gen_trt_engine.onnx_file=$RESULTS_DIR/export/model_best.onnx \
    gen_trt_engine.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
    gen_trt_engine.tensorrt.data_type=fp16

Here’s an example of using the gen_trt_engine command to generate an INT8 TensorRT engine:

tao deploy ocdnet gen_trt_engine -e $SPECS_DIR/gen_trt_engine.yaml \
    gen_trt_engine.onnx_file=$RESULTS_DIR/export/model_best.onnx \
    gen_trt_engine.tensorrt.min_batch_size=1 \
    gen_trt_engine.tensorrt.opt_batch_size=3 \
    gen_trt_engine.tensorrt.max_batch_size=3 \
    gen_trt_engine.tensorrt.data_type=int8 \
    gen_trt_engine.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine
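
As an optional sanity check (not part of the TAO CLI), you can deserialize the generated engine with the TensorRT Python API. The following is a minimal sketch, assuming TensorRT 8.5 or later with the tensorrt Python bindings installed:

import tensorrt as trt

ENGINE_PATH = "/results/export/ocdnet_model.engine"  # path from the spec above

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print the I/O tensor names, modes, and shapes to confirm the expected
# input resolution (e.g. 3 x 736 x 1280) and batch profile.
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(name, engine.get_tensor_mode(name), engine.get_tensor_shape(name))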


evaluate

The TAO evaluation spec file can be reused for this step. The following is a sample spec file:

model:
  load_pruned_graph: False
  pruned_graph_path: /results/prune/pruned_0.1.pth
evaluate:
  results_dir: /results/evaluate
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  gpu_id: 0
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
  metric:
    type: QuadMetric
    args:
      is_output_polygon: false
dataset:
  validate_dataset:
    data_path: ['/data/ocdnet/test']
    args:
      pre_processes:
        - type: Resize2D
          args:
            short_size:
              - 1280
              - 736
            resize_text_polys: true
      img_mode: BGR
      filter_keys: []
      ignore_tags: ['*', '###']
    loader:
      batch_size: 1
      shuffle: false
      pin_memory: false
      num_workers: 4

Use the following command to run OCDNet engine evaluation:

tao deploy ocdnet evaluate -e /path/to/spec.yaml \
    -r /path/to/results \
    evaluate.trt_engine=/path/to/engine/file

Required Arguments

  • -e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate specification file.

  • evaluate.trt_engine: The engine file to run evaluation. This argument can be omitted if it is defined in the spec file.

Optional Arguments

  • -r, --results_dir: The directory where the status log JSON file and evaluation results will be dumped.

Sample Usage

Here’s an example of using the evaluate command to run evaluation with the TensorRT engine:

tao deploy ocdnet evaluate -e $SPECS_DIR/evaluate.yaml \
    -r $RESULTS_DIR \
    evaluate.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine


inference

You can reuse the TAO inference spec file for this step. The following is a sample spec file:

inference:
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  input_folder: /data/ocdnet/test/img
  width: 1280
  height: 736
  img_mode: BGR
  polygon: false
  results_dir: /results/inference
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
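
The polygon flag controls the output geometry: false produces four-point bounding boxes, while true produces full polygons, which is typically slower but fits curved text more tightly. A minimal sketch of the override, assuming the rest of the spec is unchanged:

inference:
  polygon: true   # true: polygon outputs; false: four-point bounding boxes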

Use the following command to run OCDNet engine inference:

tao deploy ocdnet inference -e /path/to/spec.yaml \
    inference.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
    inference.input_folder=$DATA_DIR/test/img \
    inference.results_dir=$RESULTS_DIR/inference

Required Arguments

  • -e, --experiment_spec_file: The path to the experiment spec file

  • inference.trt_engine: The engine file to run inference. This argument can be omitted if it is defined in the spec file.

  • inference.input_folder: The path to the input folder for inference. This argument can be omitted if it is defined in the spec file.

  • inference.results_dir: The directory where the status log JSON file and inference results will be dumped. This argument can be omitted if it is defined in the spec file.

Sample Usage

Here’s an example of using the inference command to run inference with the TensorRT engine:

tao deploy ocdnet inference -e $SPECS_DIR/inference.yaml \
    inference.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
    inference.input_folder=$DATA_DIR/test/img \
    inference.results_dir=$RESULTS_DIR/inference

