Image Classification PyT#

Date: 2026-03-03

Image Classification PyT is a PyTorch-based image-classification model included in TAO. It supports the following tasks:

train

evaluate

inference

export

distill

All of the tasks above follow the command pattern below.

TAO Client (v2 API)

    SPECS=$(tao-client classification_pyt get-spec --action <sub_task> --job_type experiment --id $EXPERIMENT_ID)
    JOB_ID=$(tao-client classification_pyt experiment-run-action --action <sub_task> --id $EXPERIMENT_ID --specs "$SPECS")

Required Arguments

--id: The unique identifier of the experiment on which to run the action

See also: For information on how to create an experiment using the FTMS client, refer to the Creating an experiment section in the Remote Client documentation.

TAO Launcher

    tao model classification_pyt <sub_task> <args_per_subtask>

Here, args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.

Preparing the Input Data Structure#

See the Data Annotation Format page for more information about the data format for image classification.

The train classification experiment specification consists of seven main components:

model

dataset

train

evaluate

inference

export

distill

model#

Here is an example model specification for Image Classification PyT. The TAO Client steps select a FAN backbone; the TAO Launcher spec uses an NVDINOv2 backbone.

TAO Client (v2 API)

First, set the base_experiment:

    FILTER_PARAMS='{"network_arch": "classification_pyt"}'
    BASE_EXPERIMENTS=$(tao-client classification_pyt list-base-experiments --filter_params "$FILTER_PARAMS")

Retrieve the PTM_ID for the FAN backbone from $BASE_EXPERIMENTS before setting base_experiment:

    PTM_INFORMATION="{\"base_experiment\": [$PTM_ID]}"
    tao-client classification_pyt patch-artifact-metadata --id $EXPERIMENT_ID --job_type experiment --update_info "$PTM_INFORMATION"

Then retrieve the specifications:

    BASE_EXPERIMENT_ID=$(tao classification_pyt list-base-experiments | jq -r '.[0].id')
    TRAIN_SPECS=$(tao classification_pyt get-job-schema --action train --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')

Get the specifications from $TRAIN_SPECS. You can override values as needed.

TAO Launcher

    model:
      backbone:
        type: "vit_large_patch14_dinov2_swiglu"
        pretrained_backbone_path: <path_to_pretrained_weight>
        freeze_backbone: True
      head:
        type: "TAOLinearClsHead"
        binary: False
        topk: [1, 5]
        loss:
          type: CrossEntropyLoss

The model parameter primarily configures the backbone and head.

Parameter | Datatype | Default | Description | Supported Values
backbone | dict config | – | The configuration of the backbone. | –
head | dict config | – | The configuration of the head. | –

backbone#

Parameter | Datatype | Default | Description | Supported Values
type | str | fan_small_12_p4_hybrid | Backbone architectures | Any of the variants listed below
FAN Variants fan_tiny_8_p4_hybrid, fan_small_12_p4_hybrid, fan_base_16_p4_hybrid, fan_large_16_p4_hybrid, fan_xlarge_16_p4_hybrid, fan_base_18_p16_224, fan_tiny_12_p16_224, fan_small_12_p16_224, fan_large_24_p16_224, fan_small_12_p16_224_se_attn

GCViT Variants gc_vit_xxtiny, gc_vit_xtiny, gc_vit_tiny, gc_vit_small, gc_vit_base, gc_vit_large

FasterViT Variants faster_vit_0_224, faster_vit_1_224, faster_vit_2_224, faster_vit_3_224, faster_vit_4_224, faster_vit_5_224, faster_vit_6_224, faster_vit_4_21k_224, faster_vit_4_21k_384, faster_vit_4_21k_512, faster_vit_4_21k_768

NVCLIP Variants vit_l_14_siglip_clipa_224, vit_l_14_siglip_clipa_336, vit_h_14_siglip_clipa_224

NVDINOv2 Variants vit_large_patch14_dinov2_swiglu, vit_giant_patch14_reg4_dinov2_swiglu

CRADIO Variants c_radio_p1_vit_huge_patch16_mlpnorm, c_radio_p2_vit_huge_patch16_mlpnorm, c_radio_p3_vit_huge_patch16_mlpnorm, c_radio_v2_vit_base_patch16, c_radio_v2_vit_large_patch16, c_radio_v2_vit_huge_patch16, c_radio_v3_vit_base_patch16_reg4_dinov2, c_radio_v3_vit_large_patch16_reg4_dinov2, c_radio_v3_vit_huge_patch16_reg4_dinov2

Edgenext Variants edgenext_xx_small, edgenext_x_small, edgenext_small, edgenext_base

Mit Variants mit_b0, mit_b1, mit_b2, mit_b3, mit_b4, mit_b5

EfficientViT Variants efficientvit_b0, efficientvit_b1, efficientvit_b2, efficientvit_b3, efficientvit_l0, efficientvit_l1, efficientvit_l2, efficientvit_l3

ConvNext Variants convnext_tiny, convnext_small, convnext_base, convnext_large, convnext_xlarge

ConvNextV2 Variants convnextv2_atto, convnextv2_femto, convnextv2_pico, convnextv2_nano, convnextv2_tiny, convnextv2_small, convnextv2_base, convnextv2_large, convnextv2_huge

Swin Transformer Variants swin_tiny_224_1k, swin_base_224_22k, swin_base_384_22k, swin_large_224_22k, swin_large_384_22k

Hiera Variants hiera_tiny_224, hiera_small_224, hiera_base_224, hiera_base_plus_224, hiera_large_224, hiera_huge_224

ResNet Variants resnet_18, resnet_34, resnet_50, resnet_101, resnet_152, resnet_18d, resnet_34d, resnet_50d, resnet_101d, resnet_152d

The remaining backbone parameters are:

Parameter | Datatype | Default | Description | Supported Values
feat_downsample | bool | False | Feature downsample for fan base backbone | True, False
pretrained_backbone_path | str | – | Path to the pretrained model | –
freeze_backbone | bool | False | Flag to freeze backbone | True, False

Note: pretrained_backbone_path supports the *.safetensors format from Hugging Face for CRADIO Variants.

Foundation Models#

The tables below list a subset of the supported architectures and their pretraining datasets. Update in_channels under head to match the selected backbone.

NVCLIP Image Backbones:

Arch | Pretrained Dataset | in_channels
vit_l_14_siglip_clipa_224 | NVIDIA-commercial dataset | 768
vit_l_14_siglip_clipa_336 | NVIDIA-commercial dataset | 768
vit_h_14_siglip_clipa_224 | NVIDIA-commercial dataset | 1024

NVDINOv2 Image Backbones:

Arch | Pretrained Dataset | in_channels
vit_large_patch14_dinov2_swiglu | NVIDIA-commercial dataset | 1024
vit_giant_patch14_reg4_dinov2_swiglu | NVIDIA-commercial dataset | 1536

RADIO Image Backbones:

Arch | Pretrained Dataset | in_channels
c_radio_p1_vit_huge_patch16_mlpnorm | NVIDIA-commercial dataset | 3840
c_radio_p2_vit_huge_patch16_mlpnorm | NVIDIA-commercial dataset | 5120
c_radio_p3_vit_huge_patch16_mlpnorm | NVIDIA-commercial dataset | 3840
c_radio_v2_vit_base_patch16 | NVIDIA-commercial dataset | 2304
c_radio_v2_vit_large_patch16 | NVIDIA-commercial dataset | 3072
c_radio_v2_vit_huge_patch16 | NVIDIA-commercial dataset | 3840
c_radio_v3_vit_base_patch16_reg4_dinov2 | NVIDIA-commercial dataset | 2304
c_radio_v3_vit_large_patch16_reg4_dinov2 | NVIDIA-commercial dataset | 3072
c_radio_v3_vit_huge_patch16_reg4_dinov2 | NVIDIA-commercial dataset | 3840

head#

Parameter | Datatype | Default | Description | Supported Values
type | string | TAOLinearClsHead | Type of classification head | TAOLinearClsHead, LogisticRegressionHead
binary | bool | False | Flag to specify binary classification | True, False
in_channels | int | 448 | Number of backbone input channels to head | –
topk | List | [1,] | The number of classes | >=0
loss | Dict config | – | Loss config | –
custom_args | Dict | None | Any custom parameters to be passed to head (e.g. head_init_scale is used for TAOLinearClsHead) | –
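Because in_channels has to match the chosen backbone, a small lookup can keep the head config in sync with it. The following is a minimal sketch, not part of TAO: the dictionary only copies the in_channels values from the Foundation Models tables above, and the helper name head_config_for is hypothetical.

```python
# in_channels values copied from the Foundation Models tables above.
FOUNDATION_IN_CHANNELS = {
    "vit_l_14_siglip_clipa_224": 768,
    "vit_l_14_siglip_clipa_336": 768,
    "vit_h_14_siglip_clipa_224": 1024,
    "vit_large_patch14_dinov2_swiglu": 1024,
    "vit_giant_patch14_reg4_dinov2_swiglu": 1536,
    "c_radio_p1_vit_huge_patch16_mlpnorm": 3840,
    "c_radio_p2_vit_huge_patch16_mlpnorm": 5120,
    "c_radio_p3_vit_huge_patch16_mlpnorm": 3840,
    "c_radio_v2_vit_base_patch16": 2304,
    "c_radio_v2_vit_large_patch16": 3072,
    "c_radio_v2_vit_huge_patch16": 3840,
    "c_radio_v3_vit_base_patch16_reg4_dinov2": 2304,
    "c_radio_v3_vit_large_patch16_reg4_dinov2": 3072,
    "c_radio_v3_vit_huge_patch16_reg4_dinov2": 3840,
}


def head_config_for(backbone_type: str, head_type: str = "TAOLinearClsHead") -> dict:
    """Hypothetical helper: build a head config whose in_channels matches the backbone."""
    if backbone_type not in FOUNDATION_IN_CHANNELS:
        raise ValueError(f"in_channels for {backbone_type!r} is not listed in the tables")
    return {"type": head_type, "in_channels": FOUNDATION_IN_CHANNELS[backbone_type]}


print(head_config_for("vit_large_patch14_dinov2_swiglu"))
```

The returned dictionary mirrors the head keys in the table above and could be merged into the model section of a spec before it is written to YAML or JSON.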

loss#

Parameter | Datatype | Default | Description | Supported Values
type | str | CrossEntropyLoss | Loss type. | CrossEntropyLoss
label_smooth_val | float | 0.0 | Label smoothing value. | –

Dataset Input for Classification PyT#

Here is an example of a dataset specification file for classification PyT:

Note: For the FTMS Client, these parameters are set in JSON format.

    dataset:
      dataset: "CLDataset"
      root_dir: /dataset/imagenet2012
      batch_size: 128
      workers: 1
      num_classes: 1000
      img_size: 224
      augmentation:
        mixup_cutmix: True
        random_flip:
          vflip_probability: 0
          hflip_probability: 0.5
          enable: True
        random_aug:
          enable: True
        random_erase:
          enable: True
        random_rotate:
          rotate_probability: 0.5
          angle_list: [90, 180, 270]
          enable: False
        random_color:
          brightness: 0.4
          contrast: 0.4
          saturation: 0.4
          enable: False
        with_scale_random_crop:
          enable: False
        with_random_crop: True
        with_random_blur: False
      train_dataset:
        images_dir: /dataset/imagenet2012/train
      val_dataset:
        images_dir: /dataset/imagenet2012/val
      test_dataset:
        images_dir: /dataset/imagenet2012/test

The table below describes the configurable parameters in dataset.

Parameter | Datatype | Default | Description | Supported Values
root_dir | str | – | Path to the folder that contains classes.txt. | –
dataset | str | – | Dataset class. | –
num_classes | int | – | The number of classes in the training data. | –
img_size | int | – | The input image size. | –
batch_size | int | – | Batch size. | –
workers | int | – | Number of dataloader workers. | –
shuffle | bool | – | Shuffle the dataloader. | True, False
augmentation | dict config | – | Augmentation config. | –
train_dataset | dict config | – | Configuration for the training dataset path. | –
train_nolabel | dict config | – | Train Data Dataclass. | –
val_dataset | dict config | – | Configuration for the validation dataset path. | –
test_dataset | dict config | – | Configuration for the testing dataset path. | –

augmentation#

Parameter | Datatype | Default | Description | Supported Values
random_flip | dict config | – | RandomFlip augmentation config. | –
random_rotate | dict config | – | RandomRotation augmentation config. | –
random_color | dict config | – | RandomColor augmentation config. | –
random_erase | dict config | – | RandomErase augmentation config. | –
random_aug | dict config | – | RandomAug augmentation config. | –
with_scale_random_crop | dict config | – | RandomCropWithScale augmentation config. | –
with_random_blur | bool | – | Flag to enable with_random_blur. | –
with_random_crop | bool | – | Flag to enable with_random_crop. | –
mean | List[float] | – | Mean for the augmentation. | –
std | List[float] | – | Standard deviation for the augmentation. | –
mixup_cutmix | bool | False | Flag to enable mixup and cutmix. Not recommended for binary classification. | True, False
mixup_alpha | float | 0.4 | Mixup alpha. | –

RandomFlip#

Parameter | Datatype | Default | Description | Supported Values
vflip_probability | float | 0.5 | Vertical flip probability. | –
hflip_probability | float | 0.5 | Horizontal flip probability. | –
enable | bool | True | Flag to enable augmentation. | True, False

RandomRotation#

Parameter | Datatype | Default | Description | Supported Values
rotate_probability | float | 0.5 | Random rotate probability. | –
angle_list | List[float] | [90, 180, 270] | Random rotate angles. | –
enable | bool | True | Flag to enable augmentation. | True, False

RandomColor#

Parameter | Datatype | Default | Description | Supported Values
brightness | float | 0.3 | Random color brightness. | –
contrast | float | 0.3 | Random color contrast. | –
saturation | float | 0.3 | Random color saturation. | –
hue | float | 0.3 | Random color hue. | –
enable | bool | True | Flag to enable Random Color. | True, False
color_probability | float | 0.5 | Random color probability. | –

RandomCropWithScale#

Parameter | Datatype | Default | Description | Supported Values
scale_range | float | [1, 1.2] | Random scale range. | –
enable | bool | True | Flag to enable augmentation. | True, False

RandomErase#

Parameter | Datatype | Default | Description | Supported Values
erase_probability | float | 0.2 | Random erase probability. | –
enable | bool | True | Flag to enable augmentation. | True, False

RandomAug#

Parameter | Datatype | Default | Description | Supported Values
enable | bool | True | Flag to enable augmentation. | True, False

train_dataset#

Parameter | Datatype | Default | Description | Supported Values
images_dir | str | – | Path to the images directory for the dataset. | –

val_dataset#

Parameter | Datatype | Default | Description | Supported Values
images_dir | str | – | Path to the images directory for the dataset. | –

test_dataset#

Parameter | Datatype | Default | Description | Supported Values
images_dir | str | – | Path to the images directory for the dataset. | –

train_nolabel#

Parameter | Datatype | Default | Description | Supported Values
folder_path | Optional[str] | – | Dataset directory path. | –
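Since the FTMS Client takes these parameters as JSON rather than YAML, it can help to build the dataset section as a plain dictionary and serialize it. The sketch below uses only the Python standard library and a subset of the parameters from the tables above; how the resulting string is handed to the client (for example as the value of a specs argument) is an assumption and is not shown.

```python
import json

# A subset of the dataset parameters documented above, expressed as a dict.
dataset_spec = {
    "dataset": {
        "dataset": "CLDataset",
        "root_dir": "/dataset/imagenet2012",
        "batch_size": 128,
        "workers": 1,
        "num_classes": 1000,
        "img_size": 224,
        "augmentation": {
            "random_flip": {
                "vflip_probability": 0,
                "hflip_probability": 0.5,
                "enable": True,
            },
            "with_random_crop": True,
        },
        "train_dataset": {"images_dir": "/dataset/imagenet2012/train"},
        "val_dataset": {"images_dir": "/dataset/imagenet2012/val"},
    }
}

# json.dumps turns Python True/False into JSON true/false.
specs_json = json.dumps(dataset_spec, indent=2)
print(specs_json)
```

Keeping the spec as a dictionary until the last step makes it easy to override single values, such as batch_size, before serializing.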

train#

The train config contains the parameters related to training. They are described as follows:

Note: For the FTMS Client, these parameters are set in JSON format.

Parameter | Datatype | Default | Description | Supported Values
optim | dict config | – | Optimizer config. | –
pretrained_model_path | str | None | Pretrained model path. | –
tensorboard | dict config | – | Configuration for the tensorboard logger. | –
enable_ema | bool | False | Flag to enable EMA. | True, False
ema_decay | float | 0.998 | EMA decay. | –
clip_grad_norm | float | 2.0 | Gradient norm used for clipping. | –
num_gpus | int | 1 | The number of GPUs to run the train job. | –
gpu_ids | List[int] | [0] | List of GPU IDs to run the training on. | –
num_nodes | int | 1 | Number of nodes to run the training on. | –
seed | int | 1234 | The seed for the initializer in PyTorch. | –
num_epochs | int | 10 | Number of epochs to run the training. | –
checkpoint_interval | int | 1 | Checkpoint interval. | –
validation_interval | int | 1 | Validation interval. | –
resume_training_checkpoint_path | str | None | Path to the checkpoint to resume training from. | –
results_dir | str | None | Path to where all the assets are stored. | –

optim#

Parameter | Datatype | Default | Description | Supported Values
monitor_name | str | val_loss | Monitor name. | –
optim | str | adamw | Optimizer. | adamw, adam, sgd
lr | float | 0.00006 | Optimizer learning rate. | –
policy | str | linear | Optimizer policy. | linear, step, cosine, multistep
policy_params | Dict[str, Any] | {"step_size": 30, "gamma": 0.1, "milestones": [10, 20]} | Optimizer policy parameters. | linear, step, cosine, multistep
momentum | float | 0.9 | The momentum for the AdamW optimizer. | –
weight_decay | float | 0.01 | The weight decay coefficient. | –
betas | List[float] | [0.9, 0.999] | Coefficients used for computing running averages on adamw. | –
skip_names | List[str] | [] | Names of layers that do not need weight decay. | –
warmup_epochs | int | 0 | Warmup epochs. | –

tensorboard#

Parameter | Datatype | Default | Description | Supported Values
enabled | bool | False | Flag to enable tensorboard. | –
infrequent_logging_frequency | int | 2 | infrequent_logging_frequency | –
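To get a feel for the policy_params defaults, the sketch below computes the learning rate per epoch under a step policy. This page does not spell out the exact schedule, so the code assumes the conventional step-decay rule (multiply the learning rate by gamma every step_size epochs) and ignores warmup; the function name step_lr is hypothetical.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 30, gamma: float = 0.1) -> float:
    """Assumed step decay: scale base_lr by gamma once per completed step_size epochs."""
    if epoch < 0 or step_size <= 0:
        raise ValueError("epoch must be >= 0 and step_size must be > 0")
    return base_lr * gamma ** (epoch // step_size)


# Defaults from the optim table: lr=0.00006, step_size=30, gamma=0.1.
for epoch in (0, 29, 30, 60):
    print(epoch, step_lr(0.00006, epoch))
```

Under this assumption the rate stays at its base value for epochs 0 to 29 and drops by a factor of ten at epoch 30 and again at epoch 60.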

evaluate#

Here is an example of an evaluate specification for classification PyT:

Note: For the FTMS Client, these parameters are set in JSON format.

    evaluate:
      checkpoint: /path/to/model.pth

Parameter | Datatype | Default | Description | Supported Values
vis_after_n_batches | int | 1 | Visualize evaluation results after n batches. | –
batch_size | int | 8 | Batch size. | –
checkpoint | str | – | Path to the checkpoint file used for evaluation. | –
num_gpus | int | 1 | The number of GPUs to run the evaluate job. | –
gpu_ids | List[int] | [0] | List of GPU IDs to run the evaluation on. | –
num_nodes | int | 1 | Number of nodes to run the evaluation on. | –
trt_engine | Optional[str] | None | Path to the TensorRT engine to be used for evaluation. | –
results_dir | Optional[str] | None | Path to where all the assets are stored. | –

inference#

The inference config contains the parameters related to inference. They are described as follows:

Note: For the FTMS Client, these parameters are set in JSON format.

    inference:
      checkpoint: ${results_dir}/train/model_latest.pth

Parameter | Datatype | Default | Description | Supported Values
vis_after_n_batches | int | 1 | Visualize inference results after n batches. | –
batch_size | int | 8 | Batch size. | –
checkpoint | str | – | Path to the checkpoint file used for inference. | –
num_gpus | int | 1 | The number of GPUs to run the inference job. | –
gpu_ids | List[int] | [0] | List of GPU IDs to run the inference on. | –
num_nodes | int | 1 | Number of nodes to run the inference on. | –
trt_engine | Optional[str] | None | Path to the TensorRT engine to be used for inference. | –
results_dir | Optional[str] | None | Path to where all the assets are stored. | –

export#

The export config contains the parameters related to export. They are described as follows:

Note: For the FTMS Client, these parameters are set in JSON format.

    export:
      results_dir: "${results_dir}/export"
      gpu_id: 0
      checkpoint: ${results_dir}/train/model_latest.pth
      onnx_file: "${export.results_dir}/model_latest.onnx"
      input_width: 224
      input_height: 224
      batch_size: -1

Parameter | Datatype | Default | Description | Supported Values
results_dir | Optional[str] | None | Path to where all the assets are stored. | –
gpu_id | int | 0 | The index of the GPU to build the TensorRT engine. | –
checkpoint | str | – | Path to the checkpoint file to run export. | –
onnx_file | str | – | Path to the ONNX model file. | –
on_cpu | bool | False | Flag to export a CPU-compatible model. | True, False
input_channel | int | 3 | Number of channels in the input tensor. | 1, 3
input_width | int | 960 | Width of the input image tensor. | –
input_height | int | 544 | Height of the input image tensor. | –
opset_version | int | 17 | Operator set version. | –
batch_size | int | -1 | The batch size of the input tensor for the engine. | –

distill#

The distill config contains the parameters related to distillation. They are described as follows:

Note: For the FTMS Client, these parameters are set in JSON format.

    distill:
      teacher:
        backbone:
          type: "vit_large_patch14_dinov2_swiglu"
          pretrained_backbone_path: <pretrained_model_path>
          freeze_backbone: True
      pretrained_teacher_model_path: <pretrained_teacher_path>

Parameter | Datatype | Default | Description | Supported Values
teacher | Dict config | – | Configuration hyperparameters for the teacher model. | –
loss_type | str | KL | Loss function for logits distillation. | KL, CE, L1, L2, FD, CS, BALANCED, MSE
loss_lambda | float | 0.5 | The weight applied to the distillation loss relative to the task loss. | –
pretrained_teacher_model_path | str | – | Path to the pre-trained teacher model. | –
results_dir | str | – | Path to where all the assets generated from a task are stored. | –
mode | str | auto | Distillation mode. | logits, summary, spatial, auto
use_mlp | bool | True | Flag to use an MLP for projection (to match student and teacher dimensions). | True, False
mlp_hidden_size | int | 1024 | MLP hidden size. | –
mlp_num_inner | int | 0 | MLP number of inner layers. | –

Note: We have integrated Phi-Standardization (PHI-S) in the distillation loss. PHI-S is a technique that standardizes the feature maps of the teacher model to improve the distillation performance. It is a variant of the standardization technique used in the original PHI-S paper.

The distillation modes are:

logits: Use model.forward() for logit distillation

summary: Use model.forward_pre_logits() for summary/cls token distillation

spatial: Use model.forward_feature_pyramid() for spatial feature distillation

auto: Automatically choose from logits, summary and spatial based on loss_type

teacher#

Parameter | Datatype | Default | Description | Supported Values
backbone | Dict config | – | Configuration parameters for the backbone. | –

Training the model#

Use the tao model classification_pyt train command to train a classification PyTorch model:

TAO Client (v2 API)

    TRAIN_JOB_ID=$(tao classification_pyt create-job \
      --kind experiment \
      --name "classification_pyt_train" \
      --action train \
      --workspace-id $WORKSPACE_ID \
      --specs "$TRAIN_SPECS" \
      --train-datasets '["'$DATASET_ID'"]' \
      --eval-dataset "$DATASET_ID" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model classification_pyt train [-h] -e <experiment_spec_file>
        [results_dir=<global_results_dir>]
        [model.<model_option>=<model_option_value>]
        [dataset.<dataset_option>=<dataset_option_value>]
        [train.<train_option>=<train_option_value>]
        [train.gpu_ids=<gpu indices>]
        [train.num_gpus=<number of gpus>]

Required Arguments

The only required argument is the path to the experiment spec:

-e, --experiment_spec: The experiment specification file to set up the training experiment

Optional Arguments

You can set optional arguments to override the option values in the experiment spec file.

-h, --help: Show this help message and exit.

model.<model_option>: The model options.

dataset.<dataset_option>: The dataset options.

train.<train_option>: The train options.

train.optim.<optim_option>: The optimizer options.

Note: For training, evaluation, and inference, we expose two variables for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent, for example num_gpus = 1 and gpu_ids = [0, 1], they are modified to follow the setting that implies more GPUs; in this example, num_gpus is changed from 1 to 2.

In some cases, multi-GPU training may result in a segmentation fault. You can work around this by setting the environment variable OMP_NUM_THREADS to 1. Depending on your mode of execution, use one of the following methods to set this variable:

CLI Launcher: Add the following entry to the Envs field of your ~/.tao_mounts.json file, as mentioned in bullet 3 of the section Running the launcher.

    {
        "Envs": [
            {
                "variable": "OMP_NUM_THREADS",
                "value": "1"
            }
        ]
    }

Docker: Set the environment variable with the -e flag on the Docker command line.

    docker run -it --rm --gpus all \
      -e OMP_NUM_THREADS=1 \
      -v /path/to/local/mount:/path/to/docker/mount nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt <model> train -e
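The num_gpus and gpu_ids rule from the note above can be sketched as a small function. This is an illustration of the documented behavior, not TAO's implementation. The documented example only covers the case where gpu_ids is longer than num_gpus; extending gpu_ids to 0..num_gpus-1 in the opposite case is an assumption, and the name reconcile_gpus is hypothetical.

```python
from typing import List, Optional, Tuple


def reconcile_gpus(num_gpus: int = 1, gpu_ids: Optional[List[int]] = None) -> Tuple[int, List[int]]:
    """Follow whichever setting implies more GPUs (hypothetical helper)."""
    ids = [0] if gpu_ids is None else list(gpu_ids)
    if len(ids) > num_gpus:
        # Documented case: num_gpus=1, gpu_ids=[0, 1] becomes num_gpus=2.
        num_gpus = len(ids)
    elif num_gpus > len(ids):
        # Assumption: fill gpu_ids with the first num_gpus device indices.
        ids = list(range(num_gpus))
    return num_gpus, ids


print(reconcile_gpus(1, [0, 1]))
print(reconcile_gpus())
```

Passing consistent values for both settings avoids relying on this adjustment at all.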

Evaluating the Model#

After training a model with the experiment config file, the next step is to evaluate it on a test set to measure its accuracy. TAO includes the tao model classification_pyt evaluate command to do this. The classification app computes the evaluation loss and Top-k accuracy.

After training, the model is stored in your FTMS experiment's cloud workspace. When using the TAO Launcher, it is stored in the output directory of your choice, results_dir.

TAO Client (v2 API)

    EVAL_JOB_ID=$(tao classification_pyt create-job \
      --kind experiment \
      --name "classification_pyt_evaluate" \
      --action evaluate \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --eval-dataset "$DATASET_ID" \
      --specs "$EVALUATE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

The evaluate config defines the hyperparameters of the evaluation process. The following is an example config:

    evaluate:
      checkpoint: /path/to/model.pth

    tao model classification_pyt evaluate [-h] -e <experiment_spec>
        evaluate.checkpoint=<model to be evaluated>
        results_dir=<path to results dir>
        [evaluate.<evaluate_option>=<evaluate_option_value>]
        [evaluate.gpu_ids=<gpu indices>]
        [evaluate.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required to run the command.

-e, --experiment_spec: The experiment spec file to set up the evaluation experiment

evaluate.checkpoint: The .pth model to be evaluated.

results_dir: The path where the results will be stored

Optional Arguments

The following arguments are optional to run the command.

evaluate.<evaluate_option>: The evaluate options.
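The launcher accepts overrides as key=value arguments after the spec file, so a command line can be assembled programmatically from a dictionary. The sketch below only builds and prints the command string with the standard library; the helper name launcher_command and the example paths are hypothetical, and nothing is executed.

```python
import shlex
from typing import Dict


def launcher_command(task: str, spec_file: str, overrides: Dict[str, object]) -> str:
    """Build a `tao model classification_pyt <task>` command line (hypothetical helper)."""
    args = ["tao", "model", "classification_pyt", task, "-e", spec_file]
    # Each override becomes one key=value argument, in insertion order.
    args += [f"{key}={value}" for key, value in overrides.items()]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


command = launcher_command(
    "evaluate",
    "spec.yaml",
    {
        "evaluate.checkpoint": "/path/to/model.pth",
        "results_dir": "/results",
        "evaluate.num_gpus": 2,
    },
)
print(command)
```

The same helper works for the other subtasks by changing the task name and the option prefix.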

Running Inference on a Model#

For classification, tao model classification_pyt inference saves a .csv file containing the image paths and the corresponding labels for multiple images. TensorRT Python inference can also be enabled.

TAO Client (v2 API)

    INFER_JOB_ID=$(tao classification_pyt create-job \
      --kind experiment \
      --name "classification_pyt_inference" \
      --action inference \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --inference-dataset "$DATASET_ID" \
      --specs "$INFERENCE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    inference:
      checkpoint: /path/to/model.pth

    tao model classification_pyt inference [-h] -e <experiment_spec_file>
        inference.checkpoint=<model to be inferenced>
        results_dir=<path to results dir>
        [inference.<inference_option>=<inference_option_value>]
        [inference.gpu_ids=<gpu indices>]
        [inference.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required to run the command.

-e, --experiment_spec: The experiment spec file to set up the inference experiment

inference.checkpoint: The .pth model to run inference with.

results_dir: The path where the results will be stored

Optional Arguments

The following arguments are optional to run the command.

inference.<inference_option>: The inference options.

Distilling the model#

Use the tao model classification_pyt distill command to distill a classification model or a backbone from a downstream model:

TAO Client (v2 API)

    TRAIN_JOB_ID=$(tao classification_pyt create-job \
      --kind experiment \
      --name "classification_pyt_distill" \
      --action distill \
      --workspace-id $WORKSPACE_ID \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model classification_pyt distill [-h] -e <experiment_spec_file>
        [results_dir=<global_results_dir>]
        [model.<model_option>=<model_option_value>]
        [dataset.<dataset_option>=<dataset_option_value>]
        [train.<train_option>=<train_option_value>]
        [train.gpu_ids=<gpu indices>]
        [train.num_gpus=<number of gpus>]

Required Arguments

The only required argument is the path to the experiment spec:

-e, --experiment_spec: The experiment specification file to set up the distillation experiment

Optional Arguments

You can set optional arguments to override the option values in the experiment spec file.

-h, --help: Show this help message and exit.

model.<model_option>: The model options.

dataset.<dataset_option>: The dataset options.

train.<train_option>: The train options.

train.optim.<optim_option>: The optimizer options.

Note: The num_gpus and gpu_ids behavior and the OMP_NUM_THREADS workaround for multi-GPU segmentation faults, both described in the note under Training the model, also apply to this command.

The supported distillation modes are:

logits: Use model.forward() for logit distillation

summary: Use model.forward_pre_logits() for summary/cls token distillation

spatial: Use model.forward_feature_pyramid() for spatial feature distillation

auto: Automatically choose from logits, summary and spatial based on loss_type

Loss Type | Distillation Mode
KL | logits
CE | logits
L1 | summary
L2 | summary
FD | summary
CS | summary
BALANCED | spatial
MSE | spatial
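The auto mode amounts to a lookup from loss_type to a distillation mode. The sketch below encodes the loss-type table above as a dictionary; it illustrates the documented mapping only, and the function name resolve_mode is hypothetical rather than a TAO API.

```python
# Mapping copied from the Loss Type / Distillation Mode table above.
AUTO_MODE_BY_LOSS = {
    "KL": "logits",
    "CE": "logits",
    "L1": "summary",
    "L2": "summary",
    "FD": "summary",
    "CS": "summary",
    "BALANCED": "spatial",
    "MSE": "spatial",
}

EXPLICIT_MODES = {"logits", "summary", "spatial"}


def resolve_mode(loss_type: str = "KL", mode: str = "auto") -> str:
    """Return the effective distillation mode (hypothetical helper)."""
    if mode in EXPLICIT_MODES:
        # An explicit mode is used as given.
        return mode
    if mode != "auto":
        raise ValueError(f"unsupported mode: {mode!r}")
    if loss_type not in AUTO_MODE_BY_LOSS:
        raise ValueError(f"unsupported loss_type: {loss_type!r}")
    return AUTO_MODE_BY_LOSS[loss_type]


print(resolve_mode())               # defaults: loss_type="KL", mode="auto"
print(resolve_mode("MSE"))
print(resolve_mode("L1", "logits"))
```

With the defaults from the distill table (loss_type KL, mode auto), the lookup selects logit distillation.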

Exporting the model#

Exporting the model decouples the training process from inference and allows conversion to TensorRT engines outside the TAO environment. TensorRT engines are specific to each hardware configuration and should be generated for each unique inference environment. The exported model may be used universally across training and deployment hardware. The exported model format is referred to as .onnx.

TAO Client (v2 API)

    EXPORT_JOB_ID=$(tao classification_pyt create-job \
      --kind experiment \
      --name "classification_pyt_export" \
      --action export \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --specs "$EXPORT_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

The export parameter defines the hyperparameters of the export process.

    export:
      checkpoint: /path/to/model.pth
      onnx_file: /path/to/model.onnx
      opset_version: 12
      verify: False
      input_channel: 3
      input_width: 224
      input_height: 224

Here is an example of the tao model classification_pyt export command:

    tao model classification_pyt export [-h] -e <experiment spec file>
        export.checkpoint=<model to export>
        export.onnx_file=<onnx path>
        [export.<export_option>=<export_option_value>]

Required Arguments

The following arguments are required to run the command.

-e, --experiment_spec: The path to an experiment spec file

export.checkpoint: The .pth model to export.

export.onnx_file: The path where the .etlt or .onnx model is saved.

Optional Arguments

The following arguments are optional to run the command.

export.<export_option>: The export options.

Quantization#

TAO supports Post-Training Quantization (PTQ) for classification models via TAO Quant. Add a quantize section to your experiment specification (see the TAO Quant documentation for schema and backend options).

Run:

    tao model classification_pyt quantize -e <experiment_spec_file>

Use the quantized checkpoint by setting evaluate.is_quantized: true or inference.is_quantized: true and pointing to the produced artifact under results_dir (for example, quantized_model_torchao.pth or quantized_model_modelopt.pth). For ModelOpt artifacts, model weights are under model_state_dict.

Calibration dataset (ModelOpt)#

When using the modelopt backend (static PTQ), you must provide a calibration dataset. Classification uses dataset.quant_calibration_dataset with the same structure as your training/validation datasets (a directory of images and an optional labels file if your pipeline requires it). TorchAO (weight-only PTQ) does not use a calibration dataset.

Minimal example:

    quantize:
      backend: "modelopt"
      mode: "static_ptq"
      algorithm: "minmax"
    dataset:
      quant_calibration_dataset:
        images_dir: "/path/to/calib_images"
        # labels_file: "/path/to/labels.txt"  # optional, if required by your pipeline

Notes#

The torchao backend performs weight-only PTQ and ignores activation settings.

The modelopt backend performs static PTQ for weights and activations and uses your evaluation dataloader for calibration.

See also: TAO Quant overview and its Configuration and backend pages.
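The backend rules above can be checked before launching a quantize job. The sketch below is a hypothetical pre-flight check, not part of TAO: it treats a missing dataset.quant_calibration_dataset.images_dir as an error for the modelopt backend only, and it returns the example artifact file names mentioned above, which may differ in your setup.

```python
# Example artifact names taken from the text above; actual names may differ.
EXAMPLE_ARTIFACTS = {
    "torchao": "quantized_model_torchao.pth",
    "modelopt": "quantized_model_modelopt.pth",
}


def check_quantize_spec(spec: dict) -> str:
    """Validate the backend and calibration settings; return the example artifact name."""
    backend = spec.get("quantize", {}).get("backend")
    if backend not in EXAMPLE_ARTIFACTS:
        raise ValueError(f"unknown quantize.backend: {backend!r}")
    if backend == "modelopt":
        # Static PTQ needs calibration images; torchao (weight-only) does not.
        calib = spec.get("dataset", {}).get("quant_calibration_dataset") or {}
        if not calib.get("images_dir"):
            raise ValueError("modelopt requires dataset.quant_calibration_dataset.images_dir")
    return EXAMPLE_ARTIFACTS[backend]


spec = {
    "quantize": {"backend": "modelopt", "mode": "static_ptq", "algorithm": "minmax"},
    "dataset": {"quant_calibration_dataset": {"images_dir": "/path/to/calib_images"}},
}
print(check_quantize_spec(spec))
```

Failing early on a missing calibration directory is cheaper than discovering it after the job has started.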

TensorRT Engine Generation, Validation, and INT8 Calibration#

For TensorRT engine generation, validation, and INT8 calibration, refer to the TAO Deploy documentation.