RT-DETR with TAO Deploy#
To generate an optimized TensorRT engine, an RT-DETR ONNX file, which is first generated using tao model rtdetr export, is taken as input to tao deploy rtdetr gen_trt_engine. For more information about training an RT-DETR model, refer to the RT-DETR training documentation.
Note
Throughout this documentation, you will see references to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections.
For instructions on creating a dataset using the remote client, see the Creating a dataset section in the Remote Client documentation.
For instructions on creating an experiment using the remote client, see the Creating an experiment section in the Remote Client documentation.
The spec format is YAML for TAO Launcher and JSON for FTMS Client.
File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher and not for FTMS Client.
Converting RT-DETR .onnx File into TensorRT Engine#
To convert the .onnx file, you can reuse the spec file from the tao model rtdetr export command.
gen_trt_engine#
The gen_trt_engine parameter defines TensorRT engine generation.
Use the following command to get an experiment spec file for RT-DETR:
SPECS=$(tao-client rtdetr get-spec --action gen_trt_engine --job_type experiment --id $EXPERIMENT_ID)
gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_channel: 3
  input_width: 640
  input_height: 640
  tensorrt:
    data_type: int8
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 10
    max_batch_size: 10
    calibration:
      cal_image_dir:
        - /path/to/cal/images
      cal_cache_file: /path/to/cal.bin
      cal_batch_size: 10
      cal_batches: 1000
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| onnx_file | string | | The path to the ONNX model to be converted | |
| trt_engine | string | | The path where the generated TensorRT engine is stored | |
| input_channel | unsigned int | 3 | The input channel size. Only a value of 3 is supported. | 3 |
| input_width | unsigned int | 960 | The input width | >0 |
| input_height | unsigned int | 544 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model (-1 keeps the batch dimension dynamic) | >=-1 |
tensorrt#
The tensorrt parameter defines the TensorRT engine generation.
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
calibration#
The calibration parameter defines the TensorRT engine generation with PTQ INT8 calibration.
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| cal_image_dir | string list | | The list of paths that contain images used for calibration | |
| cal_cache_file | string | | The path to the calibration cache file to be dumped | |
| cal_batch_size | unsigned int | 1 | The number of images per batch during calibration | >0 |
| cal_batches | unsigned int | 1 | The number of batches to calibrate | >0 |
Note
For RT-DETR, INT8 calibration is supported only for the ResNet series of backbones.
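INT8 calibration needs a representative image set: with the sample spec above (cal_batch_size: 10, cal_batches: 1000), the calibrator reads up to 10,000 images from cal_image_dir. The following is a minimal sketch for populating that directory, assuming your training images live under /data/raw-data/train2017 (adjust both paths to your dataset):
# Sketch: randomly sample 10,000 training images for INT8 calibration.
# The source directory is an assumption; point it at your own training set.
mkdir -p /path/to/cal/images
find /data/raw-data/train2017 -name '*.jpg' | shuf -n 10000 \
    | xargs -I{} cp {} /path/to/cal/images/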
Use the following command to run RT-DETR engine generation:
GTE_JOB_ID=$(tao-client rtdetr experiment-run-action --action gen_trt_engine --id $EXPERIMENT_ID --parent_job_id $EXPORT_JOB_ID --specs "$SPECS")
See also
The export job ID is the job ID of the tao-client rtdetr experiment-run-action --action export command.
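If you want to change a value (for example, the precision) before submitting the FTMS job, you can edit the returned spec in place. The following is a minimal sketch, assuming jq is installed and that the JSON spec mirrors the gen_trt_engine YAML layout shown above:
# Sketch: switch the engine precision in the FTMS spec before submission.
# The .tensorrt.data_type key path is an assumption based on the YAML above.
SPECS=$(echo "$SPECS" | jq '.tensorrt.data_type = "fp16"')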
tao deploy rtdetr gen_trt_engine -e /path/to/spec.yaml \
    results_dir=/path/to/results \
    gen_trt_engine.onnx_file=/path/to/onnx/file \
    gen_trt_engine.trt_engine=/path/to/engine/file \
    gen_trt_engine.tensorrt.data_type=<data_type>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation
Optional Arguments
results_dir: The directory where the JSON status-log file will be dumped
gen_trt_engine.onnx_file: The .onnx model to be converted
gen_trt_engine.trt_engine: The path where the generated engine will be stored
gen_trt_engine.tensorrt.data_type: The precision to be exported
Sample Usage
Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:
tao deploy rtdetr gen_trt_engine -e $DEFAULT_SPEC \
gen_trt_engine.onnx_file=$ONNX_FILE \
gen_trt_engine.trt_engine=$ENGINE_FILE \
gen_trt_engine.tensorrt.data_type=FP16
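Before wiring the engine into evaluation or inference, you can sanity-check it with TensorRT's trtexec tool, if it is available in your environment. A minimal sketch:
# Sketch: deserialize the engine, print its I/O tensors, and benchmark it.
trtexec --loadEngine=$ENGINE_FILE --verbose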
Running Evaluation through TensorRT Engine#
You can reuse the TAO evaluation spec file for evaluation through a TensorRT engine. The following is a sample spec file:
evaluate:
  trt_engine: /path/to/engine/file
  conf_threshold: 0.0
  input_width: 640
  input_height: 640
dataset:
  test_data_sources:
    image_dir: /data/raw-data/val2017/
    json_file: /data/raw-data/annotations/instances_val2017.json
  num_classes: 80
  batch_size: 8
Use the following command to run RT-DETR engine evaluation:
EVAL_JOB_ID=$(tao-client rtdetr experiment-run-action --action evaluate --id $EXPERIMENT_ID --parent_job_id $GTE_JOB_ID --specs "$SPECS")
tao deploy rtdetr evaluate -e /path/to/spec.yaml \
    results_dir=/path/to/results \
    evaluate.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate specification file.
Optional Arguments
results_dir: The directory where the JSON status-log file and evaluation results will be dumped
evaluate.trt_engine: The engine file to run evaluation
Sample Usage
Here’s an example of using the evaluate command to run evaluation with the TensorRT engine:
tao deploy rtdetr evaluate -e $DEFAULT_SPEC \
results_dir=$RESULTS_DIR \
evaluate.trt_engine=$ENGINE_FILE
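As an optional parity check, you can run the same evaluation spec against the original PyTorch checkpoint and compare its mAP with the TensorRT results. A minimal sketch, assuming the TAO training environment and the .pth checkpoint are still available:
# Sketch: evaluate the PyTorch checkpoint for comparison with the engine.
# $PTH_MODEL is a hypothetical variable pointing at your trained checkpoint.
tao model rtdetr evaluate -e $DEFAULT_SPEC \
    results_dir=$RESULTS_DIR/pytorch_eval \
    evaluate.checkpoint=$PTH_MODEL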
Running Inference through TensorRT Engine#
You can reuse the TAO inference spec file for inference through a TensorRT engine. The following is a sample spec file:
inference:
  conf_threshold: 0.5
  input_width: 640
  input_height: 640
  trt_engine: /path/to/engine/file
  color_map:
    person: green
    car: red
    cat: blue
dataset:
  infer_data_sources:
    image_dir: ["/data/raw-data/val2017/"]
    classmap: /path/to/coco/annotations/coco_classmap.txt
  num_classes: 80
  batch_size: 8
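Unlike evaluation, inference reads class names from the classmap file rather than from a COCO JSON. A minimal sketch of creating such a file, assuming the common TAO convention of one class name per line in label-index order (verify this against your dataset before relying on it):
# Sketch: create a classmap file, assuming one class name per line
# in label-index order (80 lines total for COCO; only 4 shown here).
cat > /path/to/coco/annotations/coco_classmap.txt <<'EOF'
person
bicycle
car
motorcycle
EOF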
Use the following command to run RT-DETR engine inference:
INFERENCE_JOB_ID=$(tao-client rtdetr experiment-run-action --action inference --id $EXPERIMENT_ID --parent_job_id $GTE_JOB_ID --specs "$SPECS")
tao deploy rtdetr inference -e /path/to/spec.yaml \
    -r /path/to/results \
    inference.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference specification file.
Optional Arguments
results_dir: The directory where the JSON status-log file and inference results will be dumped
inference.trt_engine: The engine file to run inference
Sample Usage
Here’s an example of using the inference command to run inference with the TensorRT engine:
tao deploy rtdetr inference -e $DEFAULT_SPEC \
    results_dir=$RESULTS_DIR \
    inference.trt_engine=$ENGINE_FILE
The visualization will be stored under $RESULTS_DIR/images_annotated, and KITTI format predictions will be stored under $RESULTS_DIR/labels.
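A quick way to spot-check the run is to list a few of the produced files:
# Sketch: confirm that annotated images and KITTI labels were produced.
ls "$RESULTS_DIR/images_annotated" | head -n 5
ls "$RESULTS_DIR/labels" | head -n 5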