OCDNet with TAO Deploy#
To generate an optimized TensorRT engine for OCDNet, the gen_trt_engine action takes an
ONNX file previously produced by the OCDNet export action. For more information about
training an OCDNet model, refer to the OCDNet training documentation.
Converting an ONNX File into a TensorRT Engine#
gen_trt_engine#
The gen_trt_engine parameter in the experiment specification file provides options to
generate a TensorRT engine from the ONNX file. The following is a sample config:
gen_trt_engine:
  width: 1280
  height: 736
  img_mode: BGR
  onnx_file: '/results/export/model_best.onnx'
  trt_engine: /results/export/model_int8.engine
  tensorrt:
    data_type: int8
    workspace_size: 20480
    min_batch_size: 1
    opt_batch_size: 1
    max_batch_size: 1
    layers_precision: [
      "/backbone/patch_embed/stem/stem.0/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/conv_dw/Conv:fp32",
      "/backbone/patch_embed/stages.0/blocks/blocks.0/norm/ReduceMean:fp32"
    ]
    calibration:
      cal_image_dir: /data/ocdnet_vit/train/img
      cal_cache_file: /results/export/cal.bin
      cal_batch_size: 8
      cal_num_batches: 2
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| results_dir | string | – | The path to the results directory | – |
| onnx_file | string | – | The path to the ONNX file | – |
| trt_engine | string | – | The absolute path to the generated TensorRT engine | – |
| width | unsigned int | – | The input width | >0 |
| height | unsigned int | – | The input height | >0 |
| img_mode | string | BGR | The input image mode | BGR,RGB,GRAY |
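The constraints in the table can be checked before an engine build is launched. The following is a minimal sketch, not part of TAO Deploy: `check_gen_trt_engine` is a hypothetical helper that works on a plain dictionary holding the `gen_trt_engine` section, and it assumes that `onnx_file` and `trt_engine` must always be set.

```python
# Hypothetical pre-flight check; not part of TAO Deploy.
SUPPORTED_IMG_MODES = {"BGR", "RGB", "GRAY"}


def check_gen_trt_engine(cfg):
    """Return a list of problems found in a gen_trt_engine mapping."""
    problems = []
    for key in ("width", "height"):
        value = cfg.get(key)
        # The table lists both as unsigned integers greater than zero.
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append(f"{key} must be an integer > 0, got {value!r}")
    # img_mode defaults to BGR when it is omitted.
    img_mode = cfg.get("img_mode", "BGR")
    if img_mode not in SUPPORTED_IMG_MODES:
        problems.append(f"img_mode must be BGR, RGB, or GRAY, got {img_mode!r}")
    # Assumption: both paths are required because the table lists no default.
    for key in ("onnx_file", "trt_engine"):
        if not cfg.get(key):
            problems.append(f"{key} is not set")
    return problems


sample = {
    "width": 1280,
    "height": 736,
    "img_mode": "BGR",
    "onnx_file": "/results/export/model_best.onnx",
    "trt_engine": "/results/export/model_int8.engine",
}
problems = check_gen_trt_engine(sample)
print(problems)
```

An empty list means the section satisfies the documented value ranges. The check does not confirm that the files exist.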
tensorrt#
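The tensorrt parameter configures how the TensorRT engine is built: its precision, workspace size, batch-size optimization profile, and per-layer precision overrides.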
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to use for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size for the optimization profile shape | >0 |
| layers_precision | List | – | A list to specify layer precision | layerName:[precision], where precision is fp32/fp16/int8 |
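To illustrate the `layerName:precision` format and the batch-size profile, here is a small sketch. Both functions are hypothetical helpers rather than TAO Deploy APIs. The ordering rule (minimum ≤ optimal ≤ maximum) reflects how TensorRT optimization profiles are normally defined; the table itself only requires each value to be greater than zero.

```python
# Hypothetical helpers; not part of TAO Deploy.
SUPPORTED_PRECISIONS = {"fp32", "fp16", "int8"}


def parse_layers_precision(entries):
    """Map each layer name to its precision from 'layerName:precision' strings."""
    overrides = {}
    for entry in entries:
        # Split on the last colon so a layer name may itself contain ':'.
        name, sep, precision = entry.rpartition(":")
        if not sep or not name:
            raise ValueError(f"expected 'layerName:precision', got {entry!r}")
        if precision not in SUPPORTED_PRECISIONS:
            raise ValueError(f"unsupported precision {precision!r} in {entry!r}")
        overrides[name] = precision
    return overrides


def check_batch_profile(min_bs, opt_bs, max_bs):
    """Return True when 0 < min <= opt <= max."""
    return 0 < min_bs <= opt_bs <= max_bs


entries = [
    "/backbone/patch_embed/stem/stem.0/Conv:fp32",
    "/backbone/patch_embed/stages.0/blocks/blocks.0/conv_dw/Conv:fp32",
    "/backbone/patch_embed/stages.0/blocks/blocks.0/norm/ReduceMean:fp32",
]
overrides = parse_layers_precision(entries)
for layer, precision in overrides.items():
    print(f"{layer} -> {precision}")
print(check_batch_profile(1, 1, 1))
```

In the sample config, the three listed layers stay in fp32 while the rest of the engine is built with `data_type: int8`.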
calibration#
The calibration parameter configures post-training quantization (PTQ) INT8 calibration during TensorRT engine generation.
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| cal_image_dir | string list | | A list of paths that contain images used for calibration | |
| cal_cache_file | string | | The path to the calibration cache file to be dumped | |
| cal_batch_size | unsigned int | 1 | The batch size per batch during calibration | >0 |
| cal_num_batches | unsigned int | 1 | The number of batches to calibrate | >0 |
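The two batch settings together determine how many images calibration reads. The sketch below assumes calibration consumes `cal_batch_size * cal_num_batches` images, which follows from the parameter descriptions but is not stated explicitly. The helper names and the list of image suffixes are hypothetical.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust it to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def calibration_images_needed(cal_batch_size, cal_num_batches):
    """Number of images read if every calibration batch is full."""
    if cal_batch_size <= 0 or cal_num_batches <= 0:
        raise ValueError("cal_batch_size and cal_num_batches must be > 0")
    return cal_batch_size * cal_num_batches


def count_images(directory):
    """Count files in a directory whose suffix looks like an image."""
    return sum(
        1
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Values from the sample config: cal_batch_size 8, cal_num_batches 2.
needed = calibration_images_needed(8, 2)
print(needed)  # prints 16
```

Comparing `needed` with `count_images(...)` for each calibration directory is a quick way to confirm that the directory holds enough images before starting a long build.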
Ask the agent to run the gen_trt_engine action against your spec. For example:
Build an INT8 TensorRT engine for OCDNet from the exported ONNX at
``s3://my-bucket/ocdnet/model_best.onnx`` using ``trt-spec.yaml``.
Calibrate against ``s3://my-bucket/ocdnet/cal-images/`` and write the
engine to ``s3://my-bucket/ocdnet/ocdnet_model.engine``. Run on local Docker.
Running Evaluation through TensorRT Engine#
You can reuse the TAO evaluation specification file for this step. The following is a sample specification file:
model:
  load_pruned_graph: False
  pruned_graph_path: /results/prune/pruned_0.1.pth
evaluate:
  results_dir: /results/evaluate
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  gpu_id: 0
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
  metric:
    type: QuadMetric
    args:
      is_output_polygon: false
dataset:
  validate_dataset:
    data_path: ['/data/ocdnet/test']
    args:
      pre_processes:
        - type: Resize2D
          args:
            short_size:
              - 1280
              - 736
            resize_text_polys: true
      img_mode: BGR
      filter_keys: []
      ignore_tags: ['*', '###']
    loader:
      batch_size: 1
      shuffle: false
      pin_memory: false
      num_workers: 4
Ask the agent to run the evaluate action against the engine you built. For example:
Evaluate the OCDNet TensorRT engine at
``s3://my-bucket/ocdnet/ocdnet_model.engine`` against ``eval-spec.yaml``.
Run on local Docker.
Running Inference through TensorRT Engine#
You can reuse the TAO inference specification file for this step. The following is a sample specification file:
inference:
  checkpoint: /results/train/model_best.pth
  trt_engine: /results/export/ocdnet_model.engine
  input_folder: /data/ocdnet/test/img
  width: 1280
  height: 736
  img_mode: BGR
  polygon: false
  results_dir: /results/inference
  post_processing:
    type: SegDetectorRepresenter
    args:
      thresh: 0.3
      box_thresh: 0.55
      max_candidates: 1000
      unclip_ratio: 1.5
Ask the agent to run the inference action against the engine you built. For example:
Run OCDNet inference with the TensorRT engine at
``s3://my-bucket/ocdnet/ocdnet_model.engine`` against the test images at
``s3://my-bucket/ocdnet/test/img/`` using ``infer-spec.yaml``. Run on
your chosen backend.
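The sample specifications use the same width, height, and img_mode for engine generation and for inference. The sketch below compares those three keys between two sections, on the assumption that they should agree because the engine is built for a fixed input size. `input_mismatches` is a hypothetical helper that works on plain dictionaries.

```python
# Hypothetical consistency check; not part of TAO Deploy.
SHARED_KEYS = ("width", "height", "img_mode")


def input_mismatches(gen_cfg, infer_cfg):
    """Return {key: (engine_value, inference_value)} for keys that differ."""
    return {
        key: (gen_cfg.get(key), infer_cfg.get(key))
        for key in SHARED_KEYS
        if gen_cfg.get(key) != infer_cfg.get(key)
    }


# Values taken from the sample gen_trt_engine and inference sections.
gen_cfg = {"width": 1280, "height": 736, "img_mode": "BGR"}
infer_cfg = {"width": 1280, "height": 736, "img_mode": "BGR"}

mismatches = input_mismatches(gen_cfg, infer_cfg)
print(mismatches)  # prints {}
```

A non-empty result points to the settings worth reviewing before running inference against the engine.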