OCDNet with TAO Deploy

The gen_trt_engine action generates an optimized TensorRT engine for OCDNet from an ONNX file previously produced by the OCDNet export action. For more information about training an OCDNet model, refer to the OCDNet training documentation.

Converting an ONNX File into a TensorRT Engine

gen_trt_engine

The gen_trt_engine parameter in the experiment specification file provides options to generate a TensorRT engine from the ONNX file. The following is a sample config:

gen_trt_engine:
  width: 1280
  height: 736
  img_mode: BGR
  onnx_file: '/results/export/model_best.onnx'
  trt_engine: /results/export/model_int8.engine
  tensorrt:
    data_type: int8
    workspace_size: 20480
    min_batch_size: 1
    opt_batch_size: 1
    max_batch_size: 1
    layers_precision: [
      "/backbone/patch_embed/stem/stem.0/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/conv_dw/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/norm/ReduceMean:fp32"
    ]
    calibration:
      cal_image_dir: /data/ocdnet_vit/train/img
      cal_cache_file: /results/export/cal.bin
      cal_batch_size: 8
      cal_num_batches: 2

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| results_dir | string | | The path to the results directory | |
| onnx_file | string | | The path to the .onnx model | |
| trt_engine | string | | The absolute path to the generated TensorRT engine | |
| width | unsigned int | | The input width | >0 |
| height | unsigned int | | The input height | >0 |
| img_mode | string | BGR | The input image mode | BGR, RGB, GRAY |
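
The constraints listed above can be checked before a run is launched. The following is a minimal sketch, not part of TAO Deploy: `check_gen_trt_engine` and `SUPPORTED_IMG_MODES` are hypothetical names, and the spec is assumed to have already been loaded into a plain dictionary.

```python
# Hypothetical helper, not part of TAO Deploy: checks the top-level
# gen_trt_engine fields against the constraints documented above.
SUPPORTED_IMG_MODES = {"BGR", "RGB", "GRAY"}


def check_gen_trt_engine(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in ("width", "height"):
        value = cfg.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be an integer > 0, got {value!r}")
    # BGR is the documented default when img_mode is omitted.
    img_mode = cfg.get("img_mode", "BGR")
    if img_mode not in SUPPORTED_IMG_MODES:
        problems.append(f"img_mode must be one of BGR, RGB, GRAY, got {img_mode!r}")
    return problems


# Values copied from the sample config above.
sample = {"width": 1280, "height": 736, "img_mode": "BGR"}
print(check_gen_trt_engine(sample))  # prints []
```

Returning a list of messages rather than raising on the first failure lets all problems in a spec be reported at once.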

tensorrt

The tensorrt parameter configures how the TensorRT engine is generated.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to use for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size for the optimization profile shape | >0 |
| layers_precision | List | | A list to specify layer precision | layerName:[precision], where precision is fp32/fp16/int8 |
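
Two of the tensorrt settings above are easy to get wrong by hand: the `layerName:precision` strings in layers_precision and the ordering of the three batch sizes. The sketch below illustrates one way to check both. The function names are hypothetical and not part of TAO Deploy. The ordering rule (min <= opt <= max) is what TensorRT requires of an optimization profile.

```python
VALID_PRECISIONS = {"fp32", "fp16", "int8"}


def parse_layers_precision(entries):
    """Map each layer name to its requested precision (hypothetical helper)."""
    parsed = {}
    for entry in entries:
        # Split on the last colon so the layer name is kept exactly as written.
        name, sep, precision = entry.rpartition(":")
        if not sep or not name:
            raise ValueError(f"expected 'layerName:precision', got {entry!r}")
        if precision not in VALID_PRECISIONS:
            raise ValueError(f"unsupported precision {precision!r} in {entry!r}")
        parsed[name] = precision
    return parsed


def batch_profile_is_ordered(min_batch_size, opt_batch_size, max_batch_size):
    """True when 0 < min <= opt <= max, as an optimization profile requires."""
    return 0 < min_batch_size <= opt_batch_size <= max_batch_size


# Entries copied from the sample config above.
layers = parse_layers_precision([
    "/backbone/patch_embed/stem/stem.0/Conv:fp32",
    "/backbone/patch_embed/stages.0/blocks/blocks.0/norm/ReduceMean:fp32",
])
print(layers["/backbone/patch_embed/stem/stem.0/Conv"])  # prints fp32
print(batch_profile_is_ordered(1, 1, 1))  # prints True
```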

calibration

The calibration parameter configures post-training quantization (PTQ) INT8 calibration for TensorRT engine generation.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| cal_image_dir | string list | | A list of paths that contain images used for calibration | |
| cal_cache_file | string | | The path to the calibration cache file to be dumped | |
| cal_batch_size | unsigned int | 1 | The batch size per batch during calibration | >0 |
| cal_num_batches | unsigned int | 1 | The number of batches to calibrate | >0 |
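
As a rough guide, calibration reads cal_batch_size images for each of cal_num_batches batches, so the calibration directory should hold at least their product. The sketch below works through that arithmetic for the sample values. It assumes the calibrator consumes images this way. `images_needed`, `count_images`, and the list of image suffixes are illustrative choices, not TAO Deploy names.

```python
from pathlib import Path

# Assumption: which file extensions count as calibration images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_needed(cal_batch_size, cal_num_batches):
    """Number of images consumed when every calibration batch is full."""
    return cal_batch_size * cal_num_batches


def count_images(directory):
    """Count image files directly inside directory (non-recursive)."""
    return sum(
        1
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Sample config above: cal_batch_size 8, cal_num_batches 2.
needed = images_needed(cal_batch_size=8, cal_num_batches=2)
print(needed)  # prints 16
```

Comparing `count_images(cal_image_dir)` against `needed` before launching the action is a cheap way to catch an undersized calibration set.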

Ask the agent to run the gen_trt_engine action against your spec. For example:

Build an INT8 TensorRT engine for OCDNet from the exported ONNX at
``s3://my-bucket/ocdnet/model_best.onnx`` using ``trt-spec.yaml``.
Calibrate against ``s3://my-bucket/ocdnet/cal-images/`` and write the
engine to ``s3://my-bucket/ocdnet/ocdnet_model.engine``. Run on local Docker.

Running Evaluation through TensorRT Engine

You can reuse the TAO evaluation specification file for this step. The following is a sample specification file:

model:
  load_pruned_graph: False
  pruned_graph_path: /results/prune/pruned_0.1.pth
evaluate:
  results_dir: /results/evaluate
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  gpu_id: 0
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
  metric:
    type: QuadMetric
    args:
      is_output_polygon: false
dataset:
  validate_dataset:
    data_path: ['/data/ocdnet/test']
    args:
      pre_processes:
        - type: Resize2D
          args:
            short_size:
              - 1280
              - 736
            resize_text_polys: true
      img_mode: BGR
      filter_keys: []
      ignore_tags: ['*', '###']
    loader:
      batch_size: 1
      shuffle: false
      pin_memory: false
      num_workers: 4
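
An engine built with a fixed batch range and input size only accepts inputs inside that range, so it can help to compare the evaluation spec with the gen_trt_engine spec before running. The following sketch is illustrative only: `eval_spec_problems` is a hypothetical helper, both specs are assumed to be loaded as dictionaries, and `short_size` is assumed to be ordered as [width, height], which the sample values suggest but the text does not state.

```python
def eval_spec_problems(eval_spec, gen_cfg):
    """Return mismatches between an evaluation spec and the engine settings."""
    problems = []
    trt = gen_cfg["tensorrt"]
    dataset = eval_spec["dataset"]["validate_dataset"]

    # The loader batch size must fall inside the engine's profile range.
    batch_size = dataset["loader"]["batch_size"]
    if not trt["min_batch_size"] <= batch_size <= trt["max_batch_size"]:
        problems.append(f"batch_size {batch_size} is outside the engine profile")

    # Assumption: Resize2D short_size is ordered [width, height].
    expected = [gen_cfg["width"], gen_cfg["height"]]
    for step in dataset["args"]["pre_processes"]:
        if step["type"] == "Resize2D" and list(step["args"]["short_size"]) != expected:
            problems.append(f"Resize2D short_size differs from engine input {expected}")
    return problems


# Trimmed copies of the sample specs above.
gen_cfg = {
    "width": 1280,
    "height": 736,
    "tensorrt": {"min_batch_size": 1, "max_batch_size": 1},
}
eval_spec = {
    "dataset": {
        "validate_dataset": {
            "args": {
                "pre_processes": [
                    {"type": "Resize2D", "args": {"short_size": [1280, 736]}}
                ]
            },
            "loader": {"batch_size": 1},
        }
    }
}
print(eval_spec_problems(eval_spec, gen_cfg))  # prints []
```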

Ask the agent to run the evaluate action against the engine you built. For example:

Evaluate the OCDNet TensorRT engine at
``s3://my-bucket/ocdnet/ocdnet_model.engine`` against ``eval-spec.yaml``.
Run on local Docker.

Running Inference through TensorRT Engine

You can reuse the TAO inference specification file for this step. The following is a sample specification file:

inference:
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  input_folder: /data/ocdnet/test/img
  width: 1280
  height: 736
  img_mode: BGR
  polygon: false
  results_dir: /results/inference
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
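
The width, height, and img_mode values in the inference spec repeat settings from gen_trt_engine, so the two files can drift apart. A small comparison like the one below can flag that. It is a sketch under the assumption that both specs are available as dictionaries. `mismatched_keys` and `SHARED_KEYS` are hypothetical names.

```python
# Keys that appear in both the gen_trt_engine and inference sample specs.
SHARED_KEYS = ("width", "height", "img_mode")


def mismatched_keys(gen_cfg, infer_cfg):
    """Return {key: (engine_value, inference_value)} for keys that differ."""
    return {
        key: (gen_cfg.get(key), infer_cfg.get(key))
        for key in SHARED_KEYS
        if gen_cfg.get(key) != infer_cfg.get(key)
    }


# Values copied from the two sample specs above.
gen_cfg = {"width": 1280, "height": 736, "img_mode": "BGR"}
infer_cfg = {"width": 1280, "height": 736, "img_mode": "BGR"}
print(mismatched_keys(gen_cfg, infer_cfg))  # prints {}
```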

Ask the agent to run the inference action against the engine you built. For example:

Run OCDNet inference with the TensorRT engine at
``s3://my-bucket/ocdnet/ocdnet_model.engine`` against the test images at
``s3://my-bucket/ocdnet/test/img/`` using ``infer-spec.yaml``. Run on
local Docker.