Classification (TF2) with TAO Deploy#

The gen_trt_engine action generates an optimized TensorRT engine for TF2 Classification from an ONNX file produced by the TF2 Classification export action. For more information about training a TF2 Classification model, refer to the TF2 Classification training documentation.

Converting ONNX File into TensorRT Engine#

You can reuse the spec from the TF2 Classification export action as a starting point.

GenTrtEngine Config#

The gen_trt_engine configuration contains the parameters for converting an .onnx model to a TensorRT engine, which can be used for deployment.

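+-------------+-------------------------------------------------------------+---------------------------+---------------------------+
| Field       | Description                                                 | Data Type and Constraints | Recommended/Typical Value |
+=============+=============================================================+===========================+===========================+
| onnx_file   | The path to the exported .onnx model                        | string                    |                           |
+-------------+-------------------------------------------------------------+---------------------------+---------------------------+
| trt_engine  | The path where the generated engine will be stored          | string                    |                           |
+-------------+-------------------------------------------------------------+---------------------------+---------------------------+
| results_dir | Directory to save the output log. If not specified, the log | string                    |                           |
|             | is saved under the global $results_dir/gen_trt_engine       |                           |                           |
+-------------+-------------------------------------------------------------+---------------------------+---------------------------+
| tensorrt    | TensorRT config                                             | Dict                      |                           |
+-------------+-------------------------------------------------------------+---------------------------+---------------------------+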

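The tensorrt configuration contains the specification of the TensorRT engine and calibration requirements.

+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| Field              | Description                                                | Data Type and Constraints | Recommended/Typical Value |
+====================+============================================================+===========================+===========================+
| data_type          | The precision to be used for the TensorRT engine           | string                    | FP32                      |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| min_batch_size     | The minimum batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| opt_batch_size     | The optimal batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| max_batch_size     | The maximum batch size used for optimization profile shape | unsigned int              | 1                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| max_workspace_size | The maximum workspace size for the TensorRT engine         | unsigned int              | 2                         |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+
| calibration        | Calibration config                                         | Dict                      |                           |
+--------------------+------------------------------------------------------------+---------------------------+---------------------------+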
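
TensorRT optimization profiles expect the minimum, optimal, and maximum shapes to be ordered, so the three batch-size fields should satisfy min_batch_size <= opt_batch_size <= max_batch_size. The following is a minimal Python sketch of that check, not part of TAO Deploy: the BatchProfile class and its problems method are hypothetical names, and the defaults of 1 simply mirror the typical values in the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BatchProfile:
    """Batch sizes of an optimization profile (hypothetical helper, not a TAO API)."""

    # Assumption: defaults mirror the "Recommended/Typical Value" column.
    min_batch_size: int = 1
    opt_batch_size: int = 1
    max_batch_size: int = 1

    def problems(self) -> List[str]:
        """Return a list of ordering problems; an empty list means the profile is consistent."""
        issues = []
        if self.min_batch_size < 1:
            issues.append("min_batch_size must be at least 1")
        if self.opt_batch_size < self.min_batch_size:
            issues.append("opt_batch_size must be >= min_batch_size")
        if self.max_batch_size < self.opt_batch_size:
            issues.append("max_batch_size must be >= opt_batch_size")
        return issues


# Only max_batch_size is raised; min and opt keep the assumed default of 1.
profile = BatchProfile(max_batch_size=16)
print(profile.problems())
```

A profile that raises only max_batch_size, as the sample specification file in this section does, passes the check as long as the other two fields stay at or below it.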

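The calibration configuration specifies the location of the calibration data and where to save the calibration cache file.

+----------------+------------------------------------------------------------+---------------------------+---------------------------+
| Field          | Description                                                | Data Type and Constraints | Recommended/Typical Value |
+================+============================================================+===========================+===========================+
| cal_image_dir  | The directory containing images to be used for calibration | string                    |                           |
+----------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_cache_file | The path to calibration cache file                         | string                    |                           |
+----------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_batches    | The number of batches to be iterated for calibration       | unsigned int              | 10                        |
+----------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_batch_size | The batch size for each batch                              | unsigned int              | 1                         |
+----------------+------------------------------------------------------------+---------------------------+---------------------------+
| cal_data_file  | The path to calibration data file                          | string                    |                           |
+----------------+------------------------------------------------------------+---------------------------+---------------------------+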
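
Calibration iterates over cal_batches batches of cal_batch_size images each, so cal_image_dir should hold at least cal_batches * cal_batch_size images. A rough Python sketch of that arithmetic follows. The helper names, the list of image extensions, and the recursive directory scan are assumptions made for illustration; TAO Deploy may enumerate calibration images differently.

```python
from pathlib import Path

# Assumption: these extensions cover the calibration images; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_needed(cal_batches: int, cal_batch_size: int) -> int:
    """Number of images consumed when every calibration batch is full."""
    return cal_batches * cal_batch_size


def count_images(cal_image_dir: str) -> int:
    """Count image files under cal_image_dir, including class subdirectories (assumed layout)."""
    root = Path(cal_image_dir)
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def has_enough_images(cal_image_dir: str, cal_batches: int, cal_batch_size: int) -> bool:
    """True when the directory holds enough images for the requested calibration run."""
    return count_images(cal_image_dir) >= images_needed(cal_batches, cal_batch_size)


# Typical values from the table: 10 batches of 1 image each.
print(images_needed(10, 1))
```

Running a check like has_enough_images before building an INT8 engine can catch an undersized calibration set early, before the engine build starts.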

The following is a sample specification file for TF2 classification:

results_dir: '/results'
dataset:
  num_classes: 20
  train_dataset_path: "/workspace/tao-experiments/data/split/train"
  val_dataset_path: "/workspace/tao-experiments/data/split/val"
  preprocess_mode: 'torch'
  augmentation:
    enable_color_augmentation: True
    enable_center_crop: True
train:
  qat: False
  checkpoint: ''
  batch_size_per_gpu: 64
  num_epochs: 80
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.05
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
model:
  backbone: 'efficientnet-b0'
  input_width: 256
  input_height: 256
  input_channels: 3
  input_image_depth: 8
evaluate:
  dataset_path: '/workspace/tao-experiments/data/split/test'
  checkpoint: ''
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  top_k: 3
  batch_size: 256
  n_workers: 8
inference:
  checkpoint: ''
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  image_dir: '/workspace/tao-experiments/data/split/test/aeroplane'
  classmap: '/results/train/classmap.json'
export:
  checkpoint: ''
  onnx_file: '/results/efficientnet-b0.onnx'
gen_trt_engine:
  onnx_file: '/results/efficientnet-b0.onnx'
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  tensorrt:
    data_type: "fp32"
    max_workspace_size: 4
    max_batch_size: 16
    calibration:
      cal_image_dir: '/workspace/tao-experiments/data/split/test'
      cal_data_file: '/results/calib.tensorfile'
      cal_cache_file: '/results/cal.bin'
      cal_batches: 10
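
The sample links several actions through shared paths: export.onnx_file feeds gen_trt_engine.onnx_file, and gen_trt_engine.trt_engine is reused by evaluate.trt_engine and inference.trt_engine. As an illustration only, the following Python sketch checks those links on a spec that has already been loaded into a dictionary. The function name engine_path_mismatches is hypothetical, and TAO Deploy does not require this check.

```python
from typing import Dict, List


def engine_path_mismatches(spec: Dict) -> List[str]:
    """Return descriptions of paths that do not line up across actions."""
    problems = []
    gen = spec["gen_trt_engine"]

    # The ONNX file written by export should be the one gen_trt_engine reads.
    exported = spec.get("export", {}).get("onnx_file")
    if exported is not None and exported != gen["onnx_file"]:
        problems.append(
            f"export.onnx_file ({exported}) differs from "
            f"gen_trt_engine.onnx_file ({gen['onnx_file']})"
        )

    # The engine written by gen_trt_engine should be the one evaluate/inference load.
    for action in ("evaluate", "inference"):
        used = spec.get(action, {}).get("trt_engine")
        if used is not None and used != gen["trt_engine"]:
            problems.append(
                f"{action}.trt_engine ({used}) differs from "
                f"gen_trt_engine.trt_engine ({gen['trt_engine']})"
            )
    return problems


# Only the path fields from the sample specification are reproduced here.
sample = {
    "export": {"onnx_file": "/results/efficientnet-b0.onnx"},
    "gen_trt_engine": {
        "onnx_file": "/results/efficientnet-b0.onnx",
        "trt_engine": "/results/efficientnet-b0.fp32.engine",
    },
    "evaluate": {"trt_engine": "/results/efficientnet-b0.fp32.engine"},
    "inference": {"trt_engine": "/results/efficientnet-b0.fp32.engine"},
}
print(engine_path_mismatches(sample))
```

A check like this is mainly useful after renaming the engine file, for example when switching precision, because the evaluate and inference sections are easy to forget.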

Ask the agent to run the gen_trt_engine action against your spec. For example:

Build an FP16 TensorRT engine for TF2 Classification from the exported
ONNX at ``s3://my-bucket/cls-tf2/model.onnx`` using ``trt-spec.yaml``.
Write the engine to ``s3://my-bucket/cls-tf2/model.engine``. Run on the
local Docker backend.

Running Evaluation through TensorRT Engine#

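Evaluation through the TensorRT engine uses the same specification file as TAO evaluation. Set evaluate.trt_engine to the path of the generated engine, as in the sample specification file above.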

Ask the agent to run the evaluate action against the engine you built. For example:

Evaluate the TF2 Classification TensorRT engine at
``s3://my-bucket/cls-tf2/model.engine`` against ``eval-spec.yaml``. Run on
local Docker.

Running Inference through TensorRT Engine#

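Inference through the TensorRT engine uses the same specification file as TAO inference. Set inference.trt_engine to the path of the generated engine, as in the sample specification file above.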

Ask the agent to run the inference action against the engine you built. For example:

Run TF2 Classification inference with the TensorRT engine at
``s3://my-bucket/cls-tf2/model.engine`` using ``infer-spec.yaml``. Run on
your chosen backend.

CSV predictions are written to result.csv under the configured results directory.
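
For a quick overview of the predictions, result.csv can be tallied with the Python standard library. This is a sketch under assumptions that this page does not state: it assumes the file has no header row and that the predicted label sits in the second column (label_column=1). The function name tally_predictions is hypothetical. Inspect the file and adjust the column index before relying on the counts.

```python
import csv
from collections import Counter
from pathlib import Path


def tally_predictions(results_dir: str, label_column: int = 1) -> Counter:
    """Count how often each value appears in one column of result.csv.

    Assumption: no header row, and label_column holds the predicted label.
    """
    counts = Counter()
    with open(Path(results_dir) / "result.csv", newline="") as handle:
        for row in csv.reader(handle):
            # Skip blank or short rows rather than failing on them.
            if len(row) > label_column:
                counts[row[label_column]] += 1
    return counts
```

Calling tally_predictions("/results/inference") would then return a Counter keyed by label; the directory shown here is only a placeholder for the configured results directory.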