Date: 2024-08-26
NVIDIA TAO Toolkit v5.3.0
To generate an optimized TensorRT engine, first export a Deformable DETR model to an ONNX file using tao model deformable_detr export, then pass that file as input to tao deploy deformable_detr gen_trt_engine. For more information about training a Deformable DETR model, refer to the Deformable DETR training documentation.

Converting .onnx File into TensorRT Engine

To convert the .onnx file, you can reuse the spec file from the tao model deformable_detr export command.

gen_trt_engine

The gen_trt_engine parameter defines TensorRT engine generation.

gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_channel: 3
  input_width: 960
  input_height: 544
  tensorrt:
    data_type: int8
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 10
    max_batch_size: 10
    calibration:
      cal_image_dir:
        - /path/to/cal/images
      cal_cache_file: /path/to/cal.bin
      cal_batch_size: 10
      cal_batches: 1000

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| onnx_file | string | | The path to the .onnx model to be converted | |
| trt_engine | string | | The path where the generated TensorRT engine will be stored | |
| input_channel | unsigned int | 3 | The input channel size. Only a value of 3 is supported. | 3 |
| input_width | unsigned int | 960 | The input width | >0 |
| input_height | unsigned int | 544 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model | >=-1 |

tensorrt

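The tensorrt parameter sets the TensorRT-specific options for engine generation: the precision, the workspace size, and the batch sizes of the optimization profile.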

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
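
The three batch sizes describe a TensorRT optimization profile, which requires the minimum, optimal, and maximum values to be ordered as min <= opt <= max. The following is a minimal sketch of a pre-flight check for the tensorrt section, assuming the spec has already been loaded into a Python dict. The function name check_tensorrt_section is hypothetical and is not part of TAO.

```python
# Hypothetical helper (not part of TAO): sanity-check the "tensorrt" section
# of a gen_trt_engine spec before launching engine generation.
SUPPORTED_DATA_TYPES = ("fp32", "fp16", "int8")


def check_tensorrt_section(tensorrt_cfg):
    """Return a list of problems found in the section (empty if none)."""
    problems = []

    # Defaults mirror the parameter table above.
    data_type = tensorrt_cfg.get("data_type", "fp32")
    min_b = tensorrt_cfg.get("min_batch_size", 1)
    opt_b = tensorrt_cfg.get("opt_batch_size", 1)
    max_b = tensorrt_cfg.get("max_batch_size", 1)

    if data_type not in SUPPORTED_DATA_TYPES:
        problems.append(f"unsupported data_type: {data_type!r}")

    for name, value in (
        ("min_batch_size", min_b),
        ("opt_batch_size", opt_b),
        ("max_batch_size", max_b),
    ):
        if value <= 0:
            problems.append(f"{name} must be > 0, got {value}")

    # An optimization profile needs min <= opt <= max.
    if not (min_b <= opt_b <= max_b):
        problems.append(
            f"expected min <= opt <= max, got {min_b}, {opt_b}, {max_b}"
        )
    return problems


# Values taken from the sample spec above.
sample = {
    "data_type": "int8",
    "workspace_size": 1024,
    "min_batch_size": 1,
    "opt_batch_size": 10,
    "max_batch_size": 10,
}
print(check_tensorrt_section(sample))  # prints []
```

An empty list means the section passed these checks. It does not guarantee that engine generation will succeed.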

calibration

The calibration parameter configures post-training quantization (PTQ) INT8 calibration for TensorRT engine generation.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| cal_image_dir | string list | | The list of paths that contain images used for calibration | |
| cal_cache_file | string | | The path to calibration cache file to be dumped | |
| cal_batch_size | unsigned int | 1 | The batch size per batch during calibration | >0 |
| cal_batches | unsigned int | 1 | The number of batches to calibrate | >0 |
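
Read together, cal_batch_size and cal_batches suggest that calibration consumes up to cal_batch_size x cal_batches images, which is 10 x 1000 = 10000 for the sample spec above. The following is a rough sketch for comparing that number with what is available in the calibration directories. Both function names, the list of image extensions, and the non-recursive directory scan are assumptions made for illustration, not TAO behavior.

```python
from pathlib import Path

# Assumption: files with these extensions count as calibration images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_required(cal_batch_size, cal_batches):
    """Number of images consumed if every calibration batch is full."""
    return cal_batch_size * cal_batches


def count_calibration_images(cal_image_dirs):
    """Count image files directly inside each directory (non-recursive)."""
    total = 0
    for directory in cal_image_dirs:
        total += sum(
            1
            for path in Path(directory).iterdir()
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return total


# Values taken from the sample spec above.
print(images_required(cal_batch_size=10, cal_batches=1000))  # prints 10000
```

If count_calibration_images returns fewer images than images_required, consider lowering cal_batches or adding more images before starting an INT8 build.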

Use the following command to run Deformable DETR engine generation:

tao deploy deformable_detr gen_trt_engine -e /path/to/spec.yaml \
           -r /path/to/results \
           gen_trt_engine.onnx_file=/path/to/onnx/file \
           gen_trt_engine.trt_engine=/path/to/engine/file \
           gen_trt_engine.tensorrt.data_type=<data_type>

Required Arguments

  • -e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation

Optional Arguments

  • -r, --results_dir: The directory where the JSON status-log file will be dumped

  • gen_trt_engine.onnx_file: The .onnx model to be converted

  • gen_trt_engine.trt_engine: The path where the generated engine will be stored

  • gen_trt_engine.tensorrt.data_type: The precision of the generated TensorRT engine (fp32, fp16, or int8)
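
When engine generation is driven from a script, the same command can be assembled as an argument list. The following is a minimal sketch that uses only the flags and overrides listed above. The helper name build_gen_trt_engine_cmd is hypothetical, and the snippet only prints the command rather than running it.

```python
import shlex


def build_gen_trt_engine_cmd(spec_path, results_dir=None, overrides=None):
    """Assemble the gen_trt_engine command line as a list of arguments."""
    cmd = ["tao", "deploy", "deformable_detr", "gen_trt_engine", "-e", spec_path]
    if results_dir is not None:
        cmd += ["-r", results_dir]
    # Each override is passed as key=value, as in the command above.
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


cmd = build_gen_trt_engine_cmd(
    "/path/to/spec.yaml",
    results_dir="/path/to/results",
    overrides={
        "gen_trt_engine.onnx_file": "/path/to/onnx/file",
        "gen_trt_engine.trt_engine": "/path/to/engine/file",
        "gen_trt_engine.tensorrt.data_type": "fp16",
    },
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) in an environment where the tao launcher is installed.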

Sample Usage

Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:

tao deploy deformable_detr gen_trt_engine -e $DEFAULT_SPEC \
           gen_trt_engine.onnx_file=$ONNX_FILE \
           gen_trt_engine.trt_engine=$ENGINE_FILE \
           gen_trt_engine.tensorrt.data_type=fp16


Running Evaluation through TensorRT Engine

You can reuse the TAO evaluation spec file for evaluation through a TensorRT engine. The following is a sample spec file:

evaluate:
  trt_engine: /path/to/engine/file
  conf_threshold: 0.0
  input_width: 960
  input_height: 544
dataset:
  test_data_sources:
    image_dir: /data/raw-data/val2017/
    json_file: /data/raw-data/annotations/instances_val2017.json
  num_classes: 91
  batch_size: 8

Use the following command to run Deformable DETR engine evaluation:

tao deploy deformable_detr evaluate -e /path/to/spec.yaml \
           -r /path/to/results \
           evaluate.trt_engine=/path/to/engine/file

Required Arguments

  • -e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate specification file.

Optional Arguments

  • -r, --results_dir: The directory where the JSON status-log file and evaluation results will be dumped

  • evaluate.trt_engine: The engine file to run evaluation

Sample Usage

Here’s an example of using the evaluate command to run evaluation with the TensorRT engine:

tao deploy deformable_detr evaluate -e $DEFAULT_SPEC \
           -r $RESULTS_DIR \
           evaluate.trt_engine=$ENGINE_FILE


Running Inference through TensorRT Engine

You can reuse the TAO inference spec file for inference through a TensorRT engine. The following is a sample spec file:

inference:
  conf_threshold: 0.5
  input_width: 960
  input_height: 544
  trt_engine: /path/to/engine/file
  color_map:
    person: green
    car: red
    cat: blue
dataset:
  infer_data_sources:
    image_dir: /data/raw-data/val2017/
    classmap: /path/to/coco/annotations/coco_classmap.txt
  num_classes: 91
  batch_size: 8

Use the following command to run Deformable DETR engine inference:

tao deploy deformable_detr inference -e /path/to/spec.yaml \
           -r /path/to/results \
           inference.trt_engine=/path/to/engine/file

Required Arguments

  • -e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference specification file.

Optional Arguments

  • -r, --results_dir: The directory where the JSON status-log file and inference results will be dumped.

  • inference.trt_engine: The engine file to run inference

Sample Usage

Here’s an example of using the inference command to run inference with the TensorRT engine:

tao deploy deformable_detr inference -e $DEFAULT_SPEC \
           -r $RESULTS_DIR \
           inference.trt_engine=$ENGINE_FILE

The visualization will be stored under $RESULTS_DIR/images_annotated, and KITTI format predictions will be stored under $RESULTS_DIR/labels.
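
To post-process the predictions, the label files can be read line by line. The following is a minimal sketch of a parser. It assumes the standard KITTI field order (class name, truncation, occlusion, alpha, four bounding-box coordinates, then 3D dimensions, location, and rotation) and assumes that a confidence score, when present, is appended as a 16th field. The Detection type and parse_kitti_line function are hypothetical names, and class names that contain spaces are not handled.

```python
from typing import NamedTuple, Optional


class Detection(NamedTuple):
    label: str
    left: float
    top: float
    right: float
    bottom: float
    score: Optional[float]  # None when the line has no 16th field


def parse_kitti_line(line):
    """Parse one KITTI-format label line into a Detection."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    # Fields 4..7 hold the 2D box: left, top, right, bottom (in pixels).
    left, top, right, bottom = (float(v) for v in fields[4:8])
    # Assumption: the confidence score, if written, is the 16th field.
    score = float(fields[15]) if len(fields) > 15 else None
    return Detection(fields[0], left, top, right, bottom, score)


# Made-up example line, for illustration only.
example = (
    "person 0.00 0 0.00 100.0 50.0 200.0 300.0 "
    "0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.87"
)
print(parse_kitti_line(example))
```

Check a generated file under $RESULTS_DIR/labels before relying on the score field, since the exact number of columns written is an assumption here.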
