MAE with TAO Deploy#
To generate an optimized TensorRT engine, a MAE .onnx file, which is first generated using tao model mae export, is taken as input by tao deploy mae gen_trt_engine. For more information about training a MAE model, refer to the MAE training documentation.
Each task is explained in detail in the following sections.
Note
Throughout this documentation, you will see references to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections.
For instructions on creating a dataset using the remote client, see the Creating a dataset section in the Remote Client documentation.
For instructions on creating an experiment using the remote client, see the Creating an experiment section in the Remote Client documentation.
The spec format is YAML for TAO Launcher and JSON for FTMS Client.
File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher and not for FTMS Client.
Converting MAE .onnx File into TensorRT Engine#
To convert the .onnx file, you can reuse the spec file from the Exporting the Model section.
gen_trt_engine#
The gen_trt_engine parameter defines the TensorRT engine generation settings.
SPECS=$(tao-client mae get-spec --action gen_trt_engine --id $EXPERIMENT_ID)
gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_channel: 3
  input_width: 224
  input_height: 224
  tensorrt:
    data_type: fp16
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 16
    max_batch_size: 16
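Before submitting a job, it can help to sanity-check a spec against the constraints documented for each parameter (input_channel must be 3, data_type must be fp32/fp16/int8, and the profile batch sizes must be ordered). The following is a minimal, hypothetical pre-flight check, not part of TAO itself:

```python
# Hypothetical pre-flight check for a gen_trt_engine spec (not part of TAO).
# It mirrors the constraints documented for each parameter in this section.

def validate_gen_trt_engine_spec(spec: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the spec looks sane."""
    errors = []
    g = spec.get("gen_trt_engine", {})
    trt = g.get("tensorrt", {})

    if g.get("input_channel", 3) != 3:
        errors.append("input_channel: only the value 3 is supported")
    for key in ("input_width", "input_height"):
        if g.get(key, 224) <= 0:
            errors.append(f"{key}: must be > 0")
    if trt.get("data_type", "fp32") not in ("fp32", "fp16", "int8"):
        errors.append("tensorrt.data_type: must be fp32, fp16, or int8")

    mn = trt.get("min_batch_size", 1)
    opt = trt.get("opt_batch_size", 1)
    mx = trt.get("max_batch_size", 1)
    if not (0 < mn <= opt <= mx):
        errors.append("batch sizes must satisfy 0 < min <= opt <= max")
    return errors


spec = {
    "gen_trt_engine": {
        "onnx_file": "/path/to/onnx_file",
        "trt_engine": "/path/to/trt_engine",
        "input_channel": 3,
        "input_width": 224,
        "input_height": 224,
        "tensorrt": {
            "data_type": "fp16",
            "workspace_size": 1024,
            "min_batch_size": 1,
            "opt_batch_size": 16,
            "max_batch_size": 16,
        },
    }
}
print(validate_gen_trt_engine_spec(spec))  # an empty list means no problems were found
```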
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| onnx_file | string | | The path to the .onnx model to be converted | |
| trt_engine | string | | The path where the generated TensorRT engine is stored | |
| input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 224 | The input width | >0 |
| input_height | unsigned int | 224 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model | >=-1 |
| verbose | bool | False | Enables verbosity for the TensorRT log | |
tensorrt#
The tensorrt parameter defines the TensorRT engine generation settings.
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
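The three batch sizes combine with the fixed input dimensions into the minimum, optimal, and maximum shapes of a dynamic-batch optimization profile. The actual profile is built by TensorRT inside tao deploy; this sketch only mirrors the arithmetic so the mapping is explicit:

```python
# Illustrative only: how min/opt/max batch sizes combine with the input
# dimensions into the three NCHW shapes of a dynamic-batch optimization
# profile. The real profile is created by TensorRT inside `tao deploy`.

def profile_shapes(input_channel, input_height, input_width,
                   min_batch_size, opt_batch_size, max_batch_size):
    """Return the (min, opt, max) NCHW shapes for a dynamic-batch profile."""
    chw = (input_channel, input_height, input_width)
    return {
        "min": (min_batch_size, *chw),
        "opt": (opt_batch_size, *chw),
        "max": (max_batch_size, *chw),
    }

shapes = profile_shapes(3, 224, 224, min_batch_size=1, opt_batch_size=16, max_batch_size=16)
print(shapes["min"])  # (1, 3, 224, 224)
print(shapes["max"])  # (16, 3, 224, 224)
```

Any batch size the engine serves at runtime must fall between the min and max shapes of this profile.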
Use the following command to run MAE engine generation:
GEN_TRT_ENGINE_JOB_ID=$(tao-client mae experiment-run-action --action gen_trt_engine --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $EXPORT_JOB_ID)
Note
$EXPORT_JOB_ID is the job ID of the Exporting the model section.
tao deploy mae gen_trt_engine -e /path/to/spec.yaml \
results_dir=/path/to/results \
gen_trt_engine.onnx_file=/path/to/onnx/file \
gen_trt_engine.trt_engine=/path/to/engine/file \
gen_trt_engine.tensorrt.data_type=<data_type>
Required Arguments

- -e, --experiment_spec: The experiment spec file to set up TensorRT engine generation

Optional Arguments

- results_dir: The directory where the JSON status-log file will be dumped
- gen_trt_engine.onnx_file: The .onnx model to be converted
- gen_trt_engine.trt_engine: The path where the generated engine will be stored
- gen_trt_engine.tensorrt.data_type: The precision to be exported
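The optional arguments above are dotted key=value overrides that land in the nested spec, overriding whatever the spec file sets. This toy parser (not TAO's actual implementation, which uses a Hydra-style configuration system) shows how one such override updates the nested structure:

```python
# Toy illustration (not TAO's actual parser) of how a dotted
# `key=value` command-line override updates the nested spec.

def apply_override(spec: dict, override: str) -> dict:
    """Apply one `dotted.key=value` override to a nested dict in place."""
    dotted, value = override.split("=", 1)
    keys = dotted.split(".")
    node = spec
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return spec

spec = {"gen_trt_engine": {"tensorrt": {"data_type": "fp32"}}}
apply_override(spec, "gen_trt_engine.tensorrt.data_type=fp16")
print(spec["gen_trt_engine"]["tensorrt"]["data_type"])  # fp16
```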
Sample Usage
Here’s an example of using the gen_trt_engine
command to generate an FP16 TensorRT engine:
tao deploy mae gen_trt_engine -e $DEFAULT_SPEC \
    gen_trt_engine.onnx_file=$ONNX_FILE \
    gen_trt_engine.trt_engine=$ENGINE_FILE \
    gen_trt_engine.tensorrt.data_type=fp16