MLRecogNet with TAO Deploy#
To generate an optimized TensorRT engine for MLRecogNet, the gen_trt_engine action takes an
ONNX file previously produced by the MLRecogNet export action. MLRecogNet supports FP32,
FP16, and INT8 data types.
For more information about training an MLRecogNet model, refer to the MLRecogNet training documentation.
Each task is explained in detail in the following sections.
Converting ONNX File into TensorRT Engine#
Here is an example spec, $TRT_GEN_SPEC, for generating a TensorRT engine from the exported MLRecogNet ONNX model.
trt_config#
The trt_config parameter provides options related to TensorRT generation.
results_dir: /path/to/results/dir
dataset:
  val_dataset:
    reference: /path/to/reference/set
    query: /path/to/query/set
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
model:
  input_channel: 3
  input_width: 224
  input_height: 224
gen_trt_engine:
  gpu_id: 0
  onnx_file: /path/to/exported/onnx/file
  trt_engine: /path/to/trt/engine/to/generate
  tensorrt:
    data_type: int8
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 10
    max_batch_size: 10
    calibration:
      cal_cache_file: /path/to/calibration/cache/file/to/generate
      cal_batch_size: 16
      cal_batches: 100
      cal_image_dir:
        - /path/to/calibration/image/folder
Parameter | Datatype | Default | Description | Supported Values
data_type | string | FP32 | The precision to be used for the TensorRT engine | FP32/FP16/INT8
workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024
min_batch_size | unsigned int | 1 | The minimum batch size for optimization profile shape | >0
opt_batch_size | unsigned int | 1 | The optimal batch size for optimization profile shape | >0
max_batch_size | unsigned int | 1 | The maximum batch size for optimization profile shape | >0
calibration | dict config | None | The configuration for the INT8 calibration |
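The constraints on the tensorrt options can be checked before launching a build. The following is a minimal Python sketch, assuming the tensorrt block of the spec has already been loaded into a plain dict. The function name check_tensorrt_config is hypothetical and not part of TAO. The min <= opt <= max ordering is the usual TensorRT optimization-profile requirement rather than something stated in the table, and treating data_type as case-insensitive and requiring a calibration block for INT8 are assumptions.

```python
def check_tensorrt_config(cfg):
    """Return a list of problems in a `tensorrt` config dict (empty list if none found)."""
    problems = []

    # Assumption: precision is case-insensitive; the sample spec writes it as `int8`.
    data_type = str(cfg.get("data_type", "FP32")).upper()
    if data_type not in {"FP32", "FP16", "INT8"}:
        problems.append(f"unsupported data_type: {data_type}")

    # Defaults follow the documented values: every batch size defaults to 1.
    min_bs = cfg.get("min_batch_size", 1)
    opt_bs = cfg.get("opt_batch_size", 1)
    max_bs = cfg.get("max_batch_size", 1)
    if min(min_bs, opt_bs, max_bs) <= 0:
        problems.append("batch sizes must be greater than 0")
    if not (min_bs <= opt_bs <= max_bs):
        problems.append("expected min_batch_size <= opt_batch_size <= max_batch_size")

    # Assumption: an INT8 build needs the calibration block to be present.
    if data_type == "INT8" and not cfg.get("calibration"):
        problems.append("INT8 requested without a calibration block")

    return problems


# Values taken from the sample spec in this section.
sample = {
    "data_type": "int8",
    "workspace_size": 1024,
    "min_batch_size": 1,
    "opt_batch_size": 10,
    "max_batch_size": 10,
    "calibration": {"cal_batch_size": 16, "cal_batches": 100},
}
print(check_tensorrt_config(sample))  # []
```

Returning a list of messages rather than raising on the first failure lets all problems in a spec be reported in one pass.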
Calibration Config#
Parameter | Datatype | Default | Description | Supported Values
cal_cache_file | string | None | The path to the calibration cache file. If there is no calibration cache file at this path, a cache file is generated based on the other calibration parameters |
cal_batch_size | unsigned int | 1 | The batch size of the calibration dataset | >0
cal_batches | unsigned int | 1 | The number of batches used for calibration. In total, cal_batch_size * cal_batches images are used for calibration | >0
cal_image_dir | string | None | The directory containing the calibration images |
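The number of images consumed by calibration follows from the two batch parameters. A minimal Python sketch, assuming every calibration batch is full, so the total is cal_batch_size multiplied by cal_batches. The helper name calibration_image_count is hypothetical and not part of TAO.

```python
def calibration_image_count(cal_batch_size, cal_batches):
    """Total images read during INT8 calibration, assuming every batch is full."""
    if cal_batch_size <= 0 or cal_batches <= 0:
        raise ValueError("cal_batch_size and cal_batches must both be greater than 0")
    return cal_batch_size * cal_batches


# Values from the sample spec: 16 images per batch, 100 batches.
print(calibration_image_count(16, 100))  # 1600
```

With the sample values, the calibration image folder would need to supply 1600 images.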
Ask the agent to run the gen_trt_engine action against your spec. For example:
Build an FP16 TensorRT engine for MLRecogNet from the exported ONNX at
``s3://my-bucket/mlrecog/model.onnx`` using ``trt-spec.yaml``. Write the
engine to ``s3://my-bucket/mlrecog/model.engine``. Run on the local Docker daemon.
A successful run writes a status.json with a SUCCESS message to the configured results
directory.
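The success check can also be scripted. The following Python sketch only looks for the text SUCCESS inside any status.json found under the results directory. The exact location and schema of status.json are not described here, so the recursive search and the plain substring match are assumptions, and the function name run_succeeded is hypothetical.

```python
from pathlib import Path


def run_succeeded(results_dir):
    """Return True if any status.json under results_dir contains the text SUCCESS."""
    root = Path(results_dir)
    if not root.is_dir():
        return False
    # Assumption: status.json may sit directly in results_dir or in a subfolder,
    # so search recursively instead of relying on one fixed location.
    for status_file in root.rglob("status.json"):
        text = status_file.read_text(encoding="utf-8", errors="replace")
        if "SUCCESS" in text:
            return True
    return False


if __name__ == "__main__":
    # Placeholder path from the sample spec; replace it with your own results_dir.
    print(run_succeeded("/path/to/results/dir"))
```

A substring match avoids committing to a particular JSON layout; tighten it once the actual structure of the file is known.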
Running Evaluation through TensorRT Engine#
Use the same specification file as the TAO evaluation specification file. The following is a sample specification file:
evaluate:
  trt_engine: /path/to/generated/trt_engine
  batch_size: 8
  topk: 5
dataset:
  val_dataset:
    reference: /path/to/reference/set
    query: /path/to/query/set
Ask the agent to run the evaluate action against the engine you built. For example:
Evaluate the MLRecogNet TensorRT engine at
``s3://my-bucket/mlrecog/model.engine`` against ``eval-spec.yaml``. Run
on local Docker.
A successful run writes Top-K accuracy, a confusion matrix, and a classification report to the configured results directory.
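For intuition about the topk setting, the sketch below shows the general idea of Top-K accuracy in a retrieval setup: a query counts as correct when any of its k most similar reference items carries the query's label. This is an illustration of the metric in plain Python, not necessarily how TAO computes it. The function name, the similarity matrix, and the labels are all hypothetical.

```python
def top_k_accuracy(similarities, reference_labels, query_labels, k=5):
    """Fraction of queries whose true label appears among the k most similar references.

    similarities[q][r] is the similarity between query q and reference r
    (higher means more similar).
    """
    if not query_labels:
        raise ValueError("at least one query is required")
    hits = 0
    for row, true_label in zip(similarities, query_labels):
        # Indices of the k references with the highest similarity to this query.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if any(reference_labels[i] == true_label for i in ranked):
            hits += 1
    return hits / len(query_labels)


# Toy data: three reference items and three queries.
reference_labels = ["cat", "dog", "bird"]
query_labels = ["cat", "dog", "bird"]
similarities = [
    [0.9, 0.2, 0.1],  # query 0 is closest to the "cat" reference
    [0.3, 0.4, 0.8],  # query 1 ranks "bird" first and "dog" second
    [0.1, 0.7, 0.6],  # query 2 ranks "dog" first and "bird" second
]
print(top_k_accuracy(similarities, reference_labels, query_labels, k=2))  # 1.0
```

With k=1 only the first query is matched correctly, which shows how a larger topk relaxes the criterion.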
Running Inference through TensorRT Engine#
Use the same specification file as the TAO inference specification file. The following is a sample specification file:
results_dir: "/path/to/output_dir"
model:
  input_channels: 3
  input_width: 224
  input_height: 224
inference:
  trt_engine: "/path/to/generated/trt_engine"
  batch_size: 10
  inference_input_type: classification_folder
  topk: 5
dataset:
  val_dataset:
    reference: "/path/to/reference/set"
    query: ""
Ask the agent to run the inference action against the engine you built. For example:
Run MLRecogNet inference with the TensorRT engine at
``s3://my-bucket/mlrecog/model.engine`` using ``infer-spec.yaml``. Run on
your chosen backend.
JSON-format results are written to trt_inference under the configured results directory.