Migrating from TAO Toolkit 3.x to TAO Toolkit 4.0

Container Mapping

TAO 4.0 consolidates all containers under one name with different tags. If you are using TAO directly at the container level, you need to change the name and tag to use the latest version. If you are using the TAO launcher CLI, then the containers will be upgraded automatically when you upgrade the launcher.

The old containers are no longer listed on NGC, but you can still pull them; they might be deprecated in the future. TAO 4.0 adds two new containers (tags 4.0.0-tf2.9.1 and 4.0.0-deploy) and merges the functionality of two older containers into other containers.

| Public display name | Current container name | Current container tag (most recent) | New display name | New container name | New container tag |
|---|---|---|---|---|---|
| TAO Toolkit for CV | tao-toolkit-tf | v3.22.05-tf1.15.4-py3 | TAO Toolkit | tao-toolkit | Merged into 4.0.0-tf1.15.5 |
| TAO Toolkit for CV | tao-toolkit-tf | v3.22.05-tf1.15.5-py3 | TAO Toolkit | tao-toolkit | 4.0.0-tf1.15.5 |
| TAO Toolkit for CV | tao-toolkit-tf | v3.22.05-beta-api | TAO Toolkit | tao-toolkit | 4.0.0-api |
| Didn’t exist | Didn’t exist | N/A | TAO Toolkit | tao-toolkit | 4.0.0-tf2.9.1 (new) |
| TAO Toolkit for ConvAI | tao-toolkit-pyt | v3.22.05-py3 | TAO Toolkit | tao-toolkit | 4.0.0-pyt |
| TAO Toolkit for Language model | tao-toolkit-lm | v3.22.05-py3 | TAO Toolkit | tao-toolkit | Merged into 4.0.0-pyt |
| Didn’t exist | Didn’t exist | N/A | TAO Toolkit | tao-toolkit | 4.0.0-deploy (new) |
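
If you reference containers directly in scripts, the mapping above can be captured in a small lookup. The following is only a sketch: `map_tao_container` is a hypothetical helper and not part of the TAO launcher, and the `nvcr.io/nvidia/tao` registry path is the one used for the 4.0 containers elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not part of TAO): map a 3.x container name and tag
# to the 4.0 image reference listed in the container mapping table.
map_tao_container() {
    old_name="$1"
    old_tag="$2"
    registry="nvcr.io/nvidia/tao"   # registry path used for the 4.0 containers in this guide
    case "${old_name}:${old_tag}" in
        tao-toolkit-tf:v3.22.05-tf1.15.4-py3|tao-toolkit-tf:v3.22.05-tf1.15.5-py3)
            new_tag="4.0.0-tf1.15.5" ;;
        tao-toolkit-tf:v3.22.05-beta-api)
            new_tag="4.0.0-api" ;;
        tao-toolkit-pyt:v3.22.05-py3|tao-toolkit-lm:v3.22.05-py3)
            new_tag="4.0.0-pyt" ;;
        *)
            # Anything not in the table is reported instead of guessed.
            echo "no 4.0 mapping known for ${old_name}:${old_tag}" >&2
            return 1 ;;
    esac
    echo "${registry}/tao-toolkit:${new_tag}"
}

# Example: the language-model container is merged into the PyTorch tag.
map_tao_container tao-toolkit-lm v3.22.05-py3
```

The two new tags (4.0.0-tf2.9.1 and 4.0.0-deploy) have no 3.x counterpart, so they are intentionally absent from the lookup.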

There are minor interface changes from TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05) to TAO Toolkit 4.0. This may affect you if you are using older notebooks, have the TAO workflow integrated into your own applications, or are training directly in the containers. If you use the newer notebooks from the TAO Getting Started, then this doesn’t apply, as these notebooks have already been updated.

TAO 4.0 has split the hybrid training-deployment container into separate training and deployment containers. Since the libraries for training and deployment are completely different, this allows for rapid development and updates to individual components.

The training container contains deep learning frameworks such as TensorFlow and PyTorch, but the libraries and entrypoint that make trained models ready for deployment and inference have moved to the new deploy container. The deploy container now handles the generation of TensorRT engines and INT8 calibration caches, as well as TensorRT model evaluation and inference.

The image below highlights the changes related to INT8 calibration generation and TensorRT model evaluation. If you are training directly from the containers, then you will need to separately pull the tao-deploy container to run TensorRT conversion and evaluation. If you are using the launcher CLI or API, then this will be handled automatically by the CLI or API.

[Image: tao_deploy_workflow.jpg]

  • TAO TensorFlow1 Training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5

  • TAO TensorFlow1 Training container for MaskRCNN and UNet: nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5

  • TAO TensorFlow2 Training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1

  • TAO PyTorch Training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt

  • TAO deploy container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy
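
If you work directly at the container level, you will typically need one training image plus the deploy image. The sketch below prints the corresponding `docker pull` commands for the image references listed above. `pull_tao_images` and the `DRY_RUN` switch are illustrative names, not TAO features, and the sketch assumes Docker is installed and already has access to the registry.

```shell
#!/bin/sh
# Illustrative sketch: pull one training container plus the deploy container.
# REGISTRY and the tags come from the list above; DRY_RUN and
# pull_tao_images are hypothetical names introduced for this example.
REGISTRY="nvcr.io/nvidia/tao/tao-toolkit"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually pull

pull_tao_images() {
    train_tag="$1"
    for tag in "$train_tag" "4.0.0-deploy"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "docker pull ${REGISTRY}:${tag}"
        else
            docker pull "${REGISTRY}:${tag}"
        fi
    done
}

# Example: TensorFlow1 training container plus the deploy container.
pull_tao_images 4.0.0-tf1.15.5
```

Set `DRY_RUN=0` to run the pulls instead of printing them.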

This change only affects the models in the table below. For other models, the deploy artifacts are still contained in the training container and will be migrated out in the future.

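| TensorFlow 1.x | TensorFlow 2.x | PyTorch |
|---|---|---|
| Classification | Classification | Deformable DETR |
| DetectNet_v2 | EfficientDet | Segformer |
| DSSD | | |
| EfficientDet | | |
| Faster RCNN | | |
| LPRNet | | |
| Mask RCNN | | |
| Multitask Classification | | |
| RetinaNet | | |
| SSD | | |
| UNet | | |
| YOLOv3 | | |
| YOLOv4 | | |
| YOLOv4_tiny | | |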
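
When scripting a migration, it can help to check whether a given model is one of those listed above. The following is a minimal sketch: `uses_deploy_container` and the framework keys `tf1`, `tf2`, and `pyt` are hypothetical labels chosen for this example, and the model names are the display names from the table rather than CLI task names.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the model's TensorRT steps
# have moved to the deploy container according to the table above.
# $1 is an illustrative framework key (tf1, tf2, pyt); $2 is the model name.
uses_deploy_container() {
    case "$1" in
        tf1)
            case "$2" in
                Classification|DetectNet_v2|DSSD|EfficientDet|"Faster RCNN") return 0 ;;
                LPRNet|"Mask RCNN"|"Multitask Classification"|RetinaNet|SSD) return 0 ;;
                UNet|YOLOv3|YOLOv4|YOLOv4_tiny) return 0 ;;
            esac ;;
        tf2)
            case "$2" in
                Classification|EfficientDet) return 0 ;;
            esac ;;
        pyt)
            case "$2" in
                "Deformable DETR"|Segformer) return 0 ;;
            esac ;;
    esac
    return 1   # not listed: deploy artifacts are still in the training container
}

if uses_deploy_container tf1 "Mask RCNN"; then
    echo "Mask RCNN (tf1): run TensorRT steps in the deploy container"
fi
```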

The detailed changes per network are listed below. The commands are taken from the TAO Jupyter notebooks. The most representative networks are included; models introduced in 4.0 are not.

For each network, the TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05) commands are shown first, followed by the TAO Toolkit 4.0 commands.

Classification

TAO Toolkit 3.x:

tao classification export \
    -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
    -o $USER_EXPERIMENT_DIR/export/final_model.etlt \
    -k $KEY \
    --cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
    --data_type int8 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
    --classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
    --gen_ds_config

tao converter $USER_EXPERIMENT_DIR/export/final_model.etlt \
    -k $KEY \
    -c $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
    -o predictions/Softmax \
    -d 3,224,224 \
    -i nchw \
    -m 64 \
    -t int8 \
    -e $USER_EXPERIMENT_DIR/export/final_model.trt \
    -b 64

TAO Toolkit 4.0:

tao classification_tf1 export \
    -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
    -o $USER_EXPERIMENT_DIR/export/final_model.etlt \
    -k $KEY \
    --classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
    --gen_ds_config

tao-deploy classification_tf1 gen_trt_engine \
    -m $USER_EXPERIMENT_DIR/export/final_model.etlt \
    -e $SPECS_DIR/classification_retrain_spec.cfg \
    -k $KEY \
    --batch_size 64 \
    --max_batch_size 64 \
    --batches 10 \
    --data_type int8 \
    --cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
    --cal_image_dir $DATA_DOWNLOAD_DIR/split/test/ \
    --engine_file $USER_EXPERIMENT_DIR/export/final_model.trt

DetectNet_v2

TAO Toolkit 3.x:

tao detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
    -k $KEY \
    --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
    --data_type int8 \
    --batches 10 \
    --batch_size 4 \
    --max_batch_size 4 \
    --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
    --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
    --verbose

tao converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
    -k $KEY \
    -c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
    -o output_cov/Sigmoid,output_bbox/BiasAdd \
    -d 3,384,1248 \
    -i nchw \
    -m 64 \
    -t int8 \
    -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt \
    -b 4

TAO Toolkit 4.0:

tao detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
    -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
    -k $KEY \
    --gen_ds_config

tao-deploy detectnet_v2 gen_trt_engine \
    -m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
    -k $KEY \
    --data_type int8 \
    --batches 10 \
    --batch_size 4 \
    --max_batch_size 64 \
    --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
    --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
    -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
    --verbose

EfficientDet

TAO Toolkit 3.x:

tao efficientdet export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
    -k $KEY \
    -e $SPECS_DIR/efficientdet_d0_retrain.txt \
    --batch_size 8 \
    --data_type int8 \
    --cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
    --batches 10 \
    --max_batch_size 1 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal

tao converter -k $KEY \
    -c $USER_EXPERIMENT_DIR/export/trt.int8.cal \
    -p image_arrays:0,1x512x512x3,8x512x512x3,16x512x512x3 \
    -e $USER_EXPERIMENT_DIR/export/trt.int8.engine \
    -t int8 \
    -b 8 \
    $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt

TAO Toolkit 4.0:

tao efficientdet_tf1 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
    -k $KEY \
    -e $SPECS_DIR/efficientdet_d0_retrain.txt

tao-deploy efficientdet_tf1 gen_trt_engine \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
    -k $KEY \
    --batch_size 8 \
    --data_type int8 \
    --cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
    --batches 10 \
    --min_batch_size 1 \
    --opt_batch_size 8 \
    --max_batch_size 16 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal \
    --engine_file $USER_EXPERIMENT_DIR/export/trt.int8.engine

SSD

TAO Toolkit 3.x:

tao ssd export --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
    -o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
    -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
    -k $KEY \
    --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
    --data_type int8 \
    --batch_size 16 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
    --gen_ds_config

tao converter -k $KEY \
    -d 3,300,300 \
    -o NMS \
    -c $USER_EXPERIMENT_DIR/export/cal.bin \
    -e $USER_EXPERIMENT_DIR/export/trt.engine \
    -b 8 \
    -m 16 \
    -t int8 \
    -i nchw \
    $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt

TAO Toolkit 4.0:

tao ssd export --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
    -k $KEY \
    -o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
    -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
    --batch_size 16 \
    --gen_ds_config

tao-deploy ssd gen_trt_engine --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
    -k $KEY \
    -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
    --engine_file $USER_EXPERIMENT_DIR/export/trt.engine \
    --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
    --data_type int8 \
    --max_batch_size 16 \
    --batch_size 16 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile

UNet

TAO Toolkit 3.x:

tao unet export --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
    -k $KEY \
    -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
    --data_type int8 \
    --engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
    --data_type int8 \
    --cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
    --cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
    --max_batch_size 3 \
    --batch_size 1 \
    --gen_ds_config

tao converter -k $KEY \
    -c $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
    -e $USER_EXPERIMENT_DIR/export/trt.int8.tlt.isbi.engine \
    -i nchw \
    -t int8 \
    -p input_1:0,1x1x320x320,4x1x320x320,16x1x320x320 \
    $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt

TAO Toolkit 4.0:

tao unet export --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
    -k $KEY \
    -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
    --gen_ds_config

tao-deploy unet gen_trt_engine --gpu_index=$GPU_INDEX \
    -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt \
    -k $KEY \
    -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
    --data_type int8 \
    --engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
    --data_type int8 \
    --cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
    --cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
    --max_batch_size 3 \
    --batch_size 1

YOLOv3

TAO Toolkit 3.x:

tao yolo_v3 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
    -o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
    -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
    -k $KEY \
    --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
    --data_type int8 \
    --batch_size 16 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
    --gen_ds_config

tao converter -k $KEY \
    -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
    -c $USER_EXPERIMENT_DIR/export/cal.bin \
    -e $USER_EXPERIMENT_DIR/export/trt.engine \
    -b 8 \
    -t int8 \
    $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt

TAO Toolkit 4.0:

tao yolo_v3 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
    -k $KEY \
    -o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
    -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
    --gen_ds_config

tao-deploy yolo_v3 gen_trt_engine \
    -m $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
    -k $KEY \
    -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
    --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
    --data_type int8 \
    --batch_size 16 \
    --min_batch_size 1 \
    --opt_batch_size 8 \
    --max_batch_size 16 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
    --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
    --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8
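
In the YOLOv3 and EfficientDet pairs above, the batch dimensions of the 3.x `tao converter -p` optimization profile (minimum, optimum, maximum shape) reappear in 4.0 as `--min_batch_size`, `--opt_batch_size`, and `--max_batch_size`. The sketch below derives those three flags from a profile string. This correspondence is inferred only from the examples in this guide and does not hold for every network (the UNet pair differs), so treat it as a starting point. `profile_to_batch_flags` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: turn a 3.x "-p" profile string of the form
#   <input_name>,<min_shape>,<opt_shape>,<max_shape>
# into the min/opt/max batch-size flags seen in the 4.0 examples above.
# Assumes the batch size is the first dimension of each shape.
profile_to_batch_flags() {
    profile="$1"
    shapes="${profile#*,}"        # drop the input tensor name
    min_shape="${shapes%%,*}"     # first shape
    rest="${shapes#*,}"
    opt_shape="${rest%%,*}"       # second shape
    max_shape="${rest#*,}"        # third shape
    # The batch size is everything before the first "x" in each shape.
    echo "--min_batch_size ${min_shape%%x*} --opt_batch_size ${opt_shape%%x*} --max_batch_size ${max_shape%%x*}"
}

# Profile taken from the YOLOv3 3.x command above.
profile_to_batch_flags "Input,1x3x384x1248,8x3x384x1248,16x3x384x1248"
```

For the YOLOv3 profile this prints `--min_batch_size 1 --opt_batch_size 8 --max_batch_size 16`, which matches the flags in the 4.0 command.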

© Copyright 2023, NVIDIA. Last updated on Sep 5, 2023.