Migrating from TAO Toolkit 3.x to TAO Toolkit 4.0

Container Mapping

TAO 4.0 consolidates all containers under one name with different tags. If you run TAO directly from the containers, you must update the container name and tag to use the latest version. If you use the TAO launcher CLI, the containers are upgraded automatically when you upgrade the launcher.

The old containers are no longer displayed on NGC, but you can still pull them; they might be deprecated in the future. TAO 4.0 adds two new containers and merges the functionality of two existing containers into other containers.

Public Display Name	Current Container Name	Current Container Tag (Most Recent)	New Display Name	New Container Name	New Container Tag
TAO Toolkit for CV	tao-toolkit-tf	v3.22.05-tf1.15.4-py3	TAO Toolkit	tao-toolkit	Merged into 4.0.0-tf1.15.5
TAO Toolkit for CV	tao-toolkit-tf	v3.22.05-tf1.15.5-py3	TAO Toolkit	tao-toolkit	4.0.0-tf1.15.5
TAO Toolkit for CV	tao-toolkit-tf	v3.22.05-beta-api	TAO Toolkit	tao-toolkit	4.0.0-api
Didn’t exist	Didn’t exist	N/A	TAO Toolkit	tao-toolkit	4.0.0-tf2.9.1 (new)
TAO Toolkit for ConvAI	tao-toolkit-pyt	v3.22.05-py3	TAO Toolkit	tao-toolkit	4.0.0-pyt
TAO Toolkit for Language model	tao-toolkit-lm	v3.22.05-py3	TAO Toolkit	tao-toolkit	Merged into 4.0.0-pyt
Didn’t exist	Didn’t exist	N/A	TAO Toolkit	tao-toolkit	4.0.0-deploy (new)
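
For scripts that reference the 3.x images directly, the mapping table above can be encoded as a small lookup. The following is a minimal sketch, not an official tool: the function name map_tao_image is hypothetical, and the script only prints the pull command rather than running it. The registry path is the one used for the 4.0 images later in this section.

```shell
#!/bin/sh
# Map a TAO Toolkit 3.x image reference ("name:tag") to its 4.0 equivalent,
# following the container mapping table. map_tao_image is a hypothetical helper.
map_tao_image() {
    case "$1" in
        tao-toolkit-tf:v3.22.05-tf1.15.4-py3) echo "tao-toolkit:4.0.0-tf1.15.5" ;;  # merged
        tao-toolkit-tf:v3.22.05-tf1.15.5-py3) echo "tao-toolkit:4.0.0-tf1.15.5" ;;
        tao-toolkit-tf:v3.22.05-beta-api)     echo "tao-toolkit:4.0.0-api" ;;
        tao-toolkit-pyt:v3.22.05-py3)         echo "tao-toolkit:4.0.0-pyt" ;;
        tao-toolkit-lm:v3.22.05-py3)          echo "tao-toolkit:4.0.0-pyt" ;;       # merged
        *) echo "unknown image: $1" >&2; return 1 ;;
    esac
}

REGISTRY="nvcr.io/nvidia/tao"
old_image="tao-toolkit-pyt:v3.22.05-py3"
new_image="$(map_tao_image "$old_image")"

# Print the command instead of executing it, so the sketch runs without Docker.
echo "docker pull $REGISTRY/$new_image"
```

The tf2.9.1 and deploy tags have no 3.x counterpart, so they do not appear in the lookup.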

TAO Model Export and INT8 Calibration Changes

There are minor interface changes from TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05) to TAO Toolkit 4.0. These changes may affect you if you are using older notebooks, have the TAO workflow integrated into your own applications, or are training directly in the containers. If you use the newer notebooks from the TAO Getting Started, these changes do not affect you, as those notebooks have already been updated.

TAO 4.0 has split the hybrid training-deployment container into separate training and deployment containers. Since the libraries for training and deployment are completely different, this allows for rapid development and updates to individual components.

The training container contains deep learning frameworks like TensorFlow and PyTorch, but the libraries and entrypoint that make the trained models ready for deployment and inference have now been moved to the new deploy container. The deploy container now handles the generation of TensorRT engines and INT8 calibration caches, as well as TensorRT model evaluation and inference.

The image below highlights the changes related to INT8 calibration generation and TensorRT model evaluation. If you are training directly from the containers, then you will need to separately pull the tao-deploy container to run TensorRT conversion and evaluation. If you are using the launcher CLI or API, then this will be handled automatically by the CLI or API.

TAO TensorFlow1 training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
TAO TensorFlow1 training container for MaskRCNN and UNet: nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5
TAO TensorFlow2 training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
TAO PyTorch training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt
TAO deploy container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy
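
If you work directly with the containers, each workflow now needs two images: one for training and the deploy image. The sketch below prints the corresponding pull commands using the image names listed above. The framework keys (tf1, tf1-maskrcnn, tf1-unet, tf2, pyt) and the function names are illustrative assumptions, not TAO identifiers.

```shell
#!/bin/sh
# Image names are copied from the container list; the keys are illustrative only.
TAO_IMAGE="nvcr.io/nvidia/tao/tao-toolkit"
DEPLOY_IMAGE="$TAO_IMAGE:4.0.0-deploy"

# Print the training image for a framework key.
training_image() {
    case "$1" in
        tf1)                   echo "$TAO_IMAGE:4.0.0-tf1.15.5" ;;
        tf1-maskrcnn|tf1-unet) echo "$TAO_IMAGE:4.0.1-tf1.15.5" ;;
        tf2)                   echo "$TAO_IMAGE:4.0.0-tf2.9.1" ;;
        pyt)                   echo "$TAO_IMAGE:4.0.0-pyt" ;;
        *) echo "unknown framework key: $1" >&2; return 1 ;;
    esac
}

# Print both pull commands for a workflow: training first, then deploy.
pull_plan() {
    train="$(training_image "$1")" || return 1
    echo "docker pull $train"
    echo "docker pull $DEPLOY_IMAGE"
}

pull_plan tf1
```

The commands are printed rather than executed so that the sketch runs without Docker; pipe the output to sh to perform the pulls on a machine with Docker and access to the registry.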

This change only affects the models in the table below. For other models, the deploy artifacts are still contained in the training container and will be migrated out in the future.

TensorFlow 1.x	TensorFlow 2.x	PyTorch
Classification	Classification	Deformable DETR
DetectNet_v2	EfficientDet	Segformer
DSSD
EfficientDet
Faster RCNN
LPRNet
Mask RCNN
Multitask Classification
RetinaNet
SSD
UNet
YOLOv3
YOLOv4
YOLOv4_tiny

The detailed changes per network are listed below. The commands are taken from the TAO Jupyter notebooks. The most representative networks are covered; models introduced in 4.0 are not included.

Classification

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao classification export \
        -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
        -o $USER_EXPERIMENT_DIR/export/final_model.etlt \
        -k $KEY \
        --cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
        --data_type int8 \
        --batches 10 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
        --classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
        --gen_ds_config

    tao converter $USER_EXPERIMENT_DIR/export/final_model.etlt \
        -k $KEY \
        -c $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
        -o predictions/Softmax \
        -d 3,224,224 \
        -i nchw \
        -m 64 \
        -t int8 \
        -e $USER_EXPERIMENT_DIR/export/final_model.trt \
        -b 64

TAO Toolkit 4.0:

    tao classification_tf1 export \
        -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
        -o $USER_EXPERIMENT_DIR/export/final_model.etlt \
        -k $KEY \
        --classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
        --gen_ds_config

    tao-deploy classification_tf1 gen_trt_engine \
        -m $USER_EXPERIMENT_DIR/export/final_model.etlt \
        -e $SPECS_DIR/classification_retrain_spec.cfg \
        -k $KEY \
        --batch_size 64 \
        --max_batch_size 64 \
        --batches 10 \
        --data_type int8 \
        --cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
        --cal_image_dir $DATA_DOWNLOAD_DIR/split/test/ \
        --engine_file $USER_EXPERIMENT_DIR/export/final_model.trt

DetectNet_v2

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao detectnet_v2 export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
        -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
        -k $KEY \
        --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
        --data_type int8 \
        --batches 10 \
        --batch_size 4 \
        --max_batch_size 4 \
        --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
        --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
        --verbose

    tao converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
        -k $KEY \
        -c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
        -o output_cov/Sigmoid,output_bbox/BiasAdd \
        -d 3,384,1248 \
        -i nchw \
        -m 64 \
        -t int8 \
        -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt \
        -b 4

TAO Toolkit 4.0:

    tao detectnet_v2 export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
        -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
        -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
        -k $KEY \
        --gen_ds_config

    tao-deploy detectnet_v2 gen_trt_engine \
        -m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
        -k $KEY \
        --data_type int8 \
        --batches 10 \
        --batch_size 4 \
        --max_batch_size 64 \
        --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
        --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
        -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
        --verbose

EfficientDet

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao efficientdet export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
        -o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
        -k $KEY \
        -e $SPECS_DIR/efficientdet_d0_retrain.txt \
        --batch_size 8 \
        --data_type int8 \
        --cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
        --batches 10 \
        --max_batch_size 1 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal

    tao converter \
        -k $KEY \
        -c $USER_EXPERIMENT_DIR/export/trt.int8.cal \
        -p image_arrays:0,1x512x512x3,8x512x512x3,16x512x512x3 \
        -e $USER_EXPERIMENT_DIR/export/trt.int8.engine \
        -t int8 \
        -b 8 \
        $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt

TAO Toolkit 4.0:

    tao efficientdet_tf1 export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
        -o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
        -k $KEY \
        -e $SPECS_DIR/efficientdet_d0_retrain.txt

    tao-deploy efficientdet_tf1 gen_trt_engine \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
        -k $KEY \
        --batch_size 8 \
        --data_type int8 \
        --cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
        --batches 10 \
        --min_batch_size 1 \
        --opt_batch_size 8 \
        --max_batch_size 16 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal \
        --engine_file $USER_EXPERIMENT_DIR/export/trt.int8.engine

SSD

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao ssd export --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
        -o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
        -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
        -k $KEY \
        --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
        --data_type int8 \
        --batch_size 16 \
        --batches 10 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
        --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
        --gen_ds_config

    tao converter \
        -k $KEY \
        -d 3,300,300 \
        -o NMS \
        -c $USER_EXPERIMENT_DIR/export/cal.bin \
        -e $USER_EXPERIMENT_DIR/export/trt.engine \
        -b 8 \
        -m 16 \
        -t int8 \
        -i nchw \
        $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt

TAO Toolkit 4.0:

    tao ssd export --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
        -k $KEY \
        -o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
        -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
        --batch_size 16 \
        --gen_ds_config

    tao-deploy ssd gen_trt_engine --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
        -k $KEY \
        -e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
        --engine_file $USER_EXPERIMENT_DIR/export/trt.engine \
        --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
        --data_type int8 \
        --max_batch_size 16 \
        --batch_size 16 \
        --batches 10 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
        --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile

UNet

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao unet export --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
        -k $KEY \
        -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
        --data_type int8 \
        --engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
        --cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
        --cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
        --max_batch_size 3 \
        --batch_size 1 \
        --gen_ds_config

    tao converter \
        -k $KEY \
        -c $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
        -e $USER_EXPERIMENT_DIR/export/trt.int8.tlt.isbi.engine \
        -i nchw \
        -t int8 \
        -p input_1:0,1x1x320x320,4x1x320x320,16x1x320x320 \
        $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt

TAO Toolkit 4.0:

    tao unet export --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
        -k $KEY \
        -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
        --gen_ds_config

    tao-deploy unet gen_trt_engine --gpu_index=$GPU_INDEX \
        -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt \
        -k $KEY \
        -e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
        --data_type int8 \
        --engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
        --cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
        --cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
        --max_batch_size 3 \
        --batch_size 1

YOLOv3

TAO Toolkit 3.x (21.08, 21.11, 22.02, 22.05):

    tao yolo_v3 export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
        -o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
        -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
        -k $KEY \
        --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
        --data_type int8 \
        --batch_size 16 \
        --batches 10 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
        --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
        --gen_ds_config

    tao converter \
        -k $KEY \
        -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
        -c $USER_EXPERIMENT_DIR/export/cal.bin \
        -e $USER_EXPERIMENT_DIR/export/trt.engine \
        -b 8 \
        -t int8 \
        $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt

TAO Toolkit 4.0:

    tao yolo_v3 export \
        -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
        -k $KEY \
        -o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
        -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
        --gen_ds_config

    tao-deploy yolo_v3 gen_trt_engine \
        -m $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
        -k $KEY \
        -e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
        --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
        --data_type int8 \
        --batch_size 16 \
        --min_batch_size 1 \
        --opt_batch_size 8 \
        --max_batch_size 16 \
        --batches 10 \
        --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
        --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
        --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8
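
Across the networks above, the pattern is the same: the engine and INT8 calibration flags leave the export command and reappear on tao-deploy gen_trt_engine. The sketch below is a rough heuristic for auditing an existing 3.x export call. The flag grouping is inferred only from the examples above, not from the TAO CLI, and split_export_flags is a hypothetical helper. It does not produce a complete gen_trt_engine command: as the examples show, that command also takes -m (pointing at the exported .etlt file), -k, and usually -e. The SSD example is an exception in that it also keeps --batch_size on export.

```shell
#!/bin/sh
# Heuristic sketch: split a 3.x "export" argument list into the flags that stay
# on "tao <network> export" and those that move to "tao-deploy <network> gen_trt_engine".
# The grouping is inferred from the documented examples; it is not part of the TAO CLI.
# Values containing spaces are not supported.
split_export_flags() {
    export_args=""
    deploy_args=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            # Valued engine and INT8 calibration flags: these move to gen_trt_engine.
            --data_type|--batches|--batch_size|--max_batch_size|--engine_file|--cal_data_file|--cal_cache_file|--cal_image_dir)
                deploy_args="$deploy_args $1 ${2:-}"
                if [ "$#" -ge 2 ]; then shift 2; else shift; fi ;;
            # --verbose appears on gen_trt_engine in the DetectNet_v2 example.
            --verbose)
                deploy_args="$deploy_args $1"
                shift ;;
            # Single-token flags that stay on export.
            --gen_ds_config|--*=*)
                export_args="$export_args $1"
                shift ;;
            # Everything else (-m, -o, -k, -e, --classmap_json, ...) stays on export with its value.
            *)
                export_args="$export_args $1 ${2:-}"
                if [ "$#" -ge 2 ]; then shift 2; else shift; fi ;;
        esac
    done
    echo "export:$export_args"
    echo "gen_trt_engine:$deploy_args"
}

# Example with shortened placeholder paths, modeled on the Classification 3.x command.
split_export_flags \
    -m resnet.tlt -o final_model.etlt -k KEY_PLACEHOLDER \
    --cal_data_file calibration.tensor --data_type int8 --batches 10 \
    --cal_cache_file final_model_int8_cache.bin \
    --classmap_json classmap.json --gen_ds_config
```

The first output line lists what remains on export, and the second lists the flags to carry over to gen_trt_engine. Check the result against the example for your network, since the batch size flags differ between networks.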