MAE with TAO Deploy#

To generate an optimized TensorRT engine, the tao deploy mae gen_trt_engine command takes as input a MAE .onnx file, which is first generated using tao model mae export. For more information about training a MAE model, refer to the MAE training documentation.

Each task is explained in detail in the following sections.

Note

  • Throughout this documentation, you will see references to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections.

    • For instructions on creating a dataset using the remote client, see the Creating a dataset section in the Remote Client documentation.

    • For instructions on creating an experiment using the remote client, see the Creating an experiment section in the Remote Client documentation.

  • The spec format is YAML for TAO Launcher and JSON for FTMS Client.

  • File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher and not for FTMS Client.

Converting MAE .onnx File into TensorRT Engine#

To convert the .onnx file, you can reuse the spec file from the Exporting the Model section.

gen_trt_engine#

The gen_trt_engine parameter defines TensorRT engine generation.

SPECS=$(tao-client mae get-spec --action gen_trt_engine --id $EXPERIMENT_ID)
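For reference, the returned spec is a JSON document whose fields mirror the parameters described in the tables below. The following is an illustrative sketch only; the exact field names and default values come from the spec returned by the get-spec call, not from this example:

```json
{
  "input_channel": 3,
  "input_width": 224,
  "input_height": 224,
  "batch_size": -1,
  "verbose": false,
  "tensorrt": {
    "data_type": "fp32",
    "workspace_size": 1024,
    "min_batch_size": 1,
    "opt_batch_size": 1,
    "max_batch_size": 1
  }
}
```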

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| onnx_file | string | | The path to the exported .onnx model | |
| trt_engine | string | | The path where the generated TensorRT engine is saved | |
| input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 224 | The input width | >0 |
| input_height | unsigned int | 224 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model | >=-1 |
| verbose | bool | False | Enables verbosity for the TensorRT log | |
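When running with the TAO Launcher, the same parameters are supplied through the YAML spec file. A minimal sketch, with placeholder paths:

```yaml
gen_trt_engine:
  onnx_file: /workspace/export/mae_model.onnx      # placeholder path to the exported model
  trt_engine: /workspace/export/mae_model.engine   # placeholder path for the generated engine
  input_channel: 3
  input_width: 224
  input_height: 224
  batch_size: -1        # -1 keeps the batch dimension as exported in the ONNX model
  verbose: false
```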

tensorrt#

The tensorrt parameter defines the TensorRT engine build settings, such as precision and the batch-size optimization profile.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16/int8 |
| workspace_size | unsigned int | 1024 | The maximum workspace size (in MB) for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
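In the YAML spec, the tensorrt block nests under gen_trt_engine. For example, to build an fp16 engine with a dynamic batch profile (the values below are illustrative, not recommended settings):

```yaml
gen_trt_engine:
  tensorrt:
    data_type: fp16     # build the engine in fp16 precision
    workspace_size: 1024
    min_batch_size: 1   # optimization profile: min/opt/max batch sizes
    opt_batch_size: 4
    max_batch_size: 8
```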

Use the following command to run MAE (PyTorch) engine generation:

GEN_TRT_ENGINE_JOB_ID=$(tao-client mae experiment-run-action --action gen_trt_engine --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $EXPORT_JOB_ID)

Note

$EXPORT_JOB_ID is the job ID of the export job from the Exporting the Model section.