Classification (TF2) with TAO Deploy

The gen_trt_engine action generates an optimized TensorRT engine for TF2 Classification from an ONNX file produced by the TF2 Classification export action. For more information about training a TF2 Classification model, refer to the TF2 Classification training documentation.

Converting ONNX File into TensorRT Engine

You can reuse the spec from the TF2 Classification export action as a starting point.

GenTrtEngine Config

The gen_trt_engine configuration contains the parameters for converting an .onnx model to a TensorRT engine, which can be used for deployment.

Field	Description	Data Type and Constraints	Recommended/Typical Value
onnx_file	The path to the exported .onnx model	string
trt_engine	The path where the generated engine will be stored	string
results_dir	The directory where the output log is saved. If not specified, the log is saved under the global $results_dir/gen_trt_engine	string
tensorrt	TensorRT config	Dict

The following is a sample specification file for TF2 classification:

results_dir: '/results'
dataset:
  num_classes: 20
  train_dataset_path: "/workspace/tao-experiments/data/split/train"
  val_dataset_path: "/workspace/tao-experiments/data/split/val"
  preprocess_mode: 'torch'
  augmentation:
    enable_color_augmentation: True
    enable_center_crop: True
train:
  qat: False
  checkpoint: ''
  batch_size_per_gpu: 64
  num_epochs: 80
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.05
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
model:
  backbone: 'efficientnet-b0'
  input_width: 256
  input_height: 256
  input_channels: 3
  input_image_depth: 8
evaluate:
  dataset_path: '/workspace/tao-experiments/data/split/test'
  checkpoint: ''
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  top_k: 3
  batch_size: 256
  n_workers: 8
inference:
  checkpoint: ''
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  image_dir: '/workspace/tao-experiments/data/split/test/aeroplane'
  classmap: '/results/train/classmap.json'
export:
  checkpoint: ''
  onnx_file: '/results/efficientnet-b0.onnx'
gen_trt_engine:
  onnx_file: '/results/efficientnet-b0.onnx'
  trt_engine: '/results/efficientnet-b0.fp32.engine'
  tensorrt:
    data_type: "fp32"
    max_workspace_size: 4
    max_batch_size: 16
    calibration:
      cal_image_dir: '/workspace/tao-experiments/data/split/test'
      cal_data_file: '/results/calib.tensorfile'
      cal_cache_file: '/results/cal.bin'
      cal_batches: 10
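
Before handing a spec to the action, it can help to sanity-check the gen_trt_engine section against the field table above. The following is a minimal sketch, not part of TAO Deploy: `check_gen_trt_engine` is a hypothetical helper, the spec is assumed to be already loaded into a Python dict, and treating onnx_file and trt_engine as required is an assumption, because the table does not state which fields are mandatory.

```python
def check_gen_trt_engine(spec: dict) -> list:
    """Return a list of human-readable problems found in spec['gen_trt_engine'].

    Hypothetical helper for illustration only; the checks mirror the data
    types listed in the field table and are not TAO Deploy's own validation.
    """
    section = spec.get("gen_trt_engine")
    if not isinstance(section, dict):
        return ["missing gen_trt_engine section"]

    problems = []

    # Assumption: both paths are required and must be non-empty strings.
    for field in ("onnx_file", "trt_engine"):
        value = section.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} must be a non-empty string")

    # The table describes onnx_file as the path to the exported .onnx model.
    onnx_file = section.get("onnx_file")
    if isinstance(onnx_file, str) and onnx_file and not onnx_file.endswith(".onnx"):
        problems.append("onnx_file should end with .onnx")

    # The table lists the tensorrt field as a Dict.
    if not isinstance(section.get("tensorrt", {}), dict):
        problems.append("tensorrt must be a dict")

    return problems
```

An empty list means the section passed these basic checks. It does not guarantee that the engine build will succeed.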

Ask the agent to run the gen_trt_engine action against your spec. For example:

Build an FP16 TensorRT engine for TF2 Classification from the exported
ONNX at ``s3://my-bucket/cls-tf2/model.onnx`` using ``trt-spec.yaml``.
Write the engine to ``s3://my-bucket/cls-tf2/model.engine``. Run on the
local Docker backend.
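
The prompt above asks for FP16, while the sample specification sets data_type to "fp32". If you prefer the spec itself to state the precision, one option is to derive a variant of the spec before running the action. The sketch below assumes the spec is already loaded into a Python dict and that data_type accepts "fp16" in the same way it accepts "fp32"; `with_precision` is a hypothetical helper name, not a TAO API.

```python
import copy


def with_precision(spec: dict, data_type: str, engine_path: str) -> dict:
    """Return a copy of spec whose gen_trt_engine section targets the given
    precision and engine path. The input spec is left unmodified.

    Hypothetical helper for illustration; not part of TAO Deploy.
    """
    updated = copy.deepcopy(spec)
    section = updated.setdefault("gen_trt_engine", {})
    section["trt_engine"] = engine_path
    # Other tensorrt settings (workspace size, batch size, ...) are preserved.
    section.setdefault("tensorrt", {})["data_type"] = data_type
    return updated
```

Working on a deep copy keeps the original FP32 spec available for comparison runs.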

Running Evaluation through TensorRT Engine

Reuse the TAO evaluation specification file, with evaluate.trt_engine set to the generated engine, as in the sample specification above.

Ask the agent to run the evaluate action against the engine you built. For example:

Evaluate the TF2 Classification TensorRT engine at
``s3://my-bucket/cls-tf2/model.engine`` against ``eval-spec.yaml``. Run on
local Docker.

Running Inference through TensorRT Engine

Reuse the TAO inference specification file, with inference.trt_engine set to the generated engine, as in the sample specification above.

Ask the agent to run the inference action against the engine you built. For example:

Run TF2 Classification inference with the TensorRT engine at
``s3://my-bucket/cls-tf2/model.engine`` using ``infer-spec.yaml``. Run on
your chosen backend.

CSV predictions are written to result.csv under the configured results directory.
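
To get a quick overview of the predictions, you can tally the predicted labels in result.csv. The sketch below uses only the Python standard library. The column layout of result.csv is not described here, so the default `label_column=1` (label in the second column) is an assumption that you should adjust after inspecting your own file; `count_predictions` is a hypothetical helper name.

```python
import csv
from collections import Counter


def count_predictions(csv_file, label_column=1):
    """Tally the values found in label_column across the rows of an open CSV file.

    Assumption: each row holds one prediction and the predicted label sits at
    index label_column. Rows that are too short are skipped.
    """
    counts = Counter()
    for row in csv.reader(csv_file):
        if len(row) > label_column:
            counts[row[label_column]] += 1
    return counts
```

For example, open the file with `open(path, newline="")` and pass the file object to `count_predictions`, where `path` points at result.csv under your results directory.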