Date: 2026-03-03

ReIdentificationNet takes cropped images of a person from different perspectives as network input and outputs the embedding features for that person. The embeddings are used to perform similarity matching to re-identify the same person. The model supported in the current version is based on ResNet, which is the most commonly used baseline for re-identification due to its high accuracy.
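As a rough illustration of the similarity matching mentioned above, the following sketch ranks gallery embeddings by cosine similarity to a query embedding. The choice of cosine similarity, the function names, and the toy three-dimensional vectors are assumptions made for illustration; this page does not prescribe a metric, and real embeddings have feat_dim entries.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_gallery(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    scores = {name: cosine_similarity(query, emb) for name, emb in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy embeddings (not produced by a real model).
query = [1.0, 0.0, 1.0]
gallery = {
    "cam2_person_a.jpg": [0.9, 0.1, 1.1],
    "cam2_person_b.jpg": [-1.0, 0.5, 0.0],
}
ranking = rank_gallery(query, gallery)
print(ranking)
```

The first entry of the ranking is the gallery image most likely to show the same person as the query under this metric.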

The expected time to train ReIdentificationNet is as follows:

| Backbone Type | GPU Type | No. of training images | Image Size | No. of identities | Batch size | Total Epochs | Total Training Time |
|---|---|---|---|---|---|---|---|
| Resnet50 | 1 x Nvidia A100 - 80GB PCIE | 13,000 | 256x128x3 | 751 | 128 | 120 | ~1.25 hours |
| Resnet50 | 1 x Nvidia Quadro GV100 - 32GB | 13,000 | 256x128x3 | 751 | 64 | 120 | ~2.5 hours |
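For a rough sense of scale, the training times above can be turned into an average throughput. This is only back-of-the-envelope arithmetic: it assumes the rounded image count of 13,000 and attributes the whole wall-clock time to training, ignoring validation and checkpointing. The helper name is hypothetical.

```python
def images_per_second(num_images, epochs, hours):
    """Average throughput implied by a total wall-clock training time."""
    return num_images * epochs / (hours * 3600.0)

# Figures taken from the two table rows above.
a100 = images_per_second(13_000, 120, 1.25)
gv100 = images_per_second(13_000, 120, 2.5)
print(f"A100: ~{a100:.0f} img/s, GV100: ~{gv100:.0f} img/s")
```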

Note

This documentation refers to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections. For instructions on creating a dataset using the remote client, refer to the Creating a dataset section in the Remote Client documentation. For instructions on creating an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.

The spec format is YAML for TAO Launcher, and JSON for FTMS Client.

File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher, not for FTMS Client.

Data Input for ReIdentificationNet#

The ReIdentificationNet apps in TAO expect data in Market-1501 format for training and evaluation. See the Data Annotation Format page for more information about the Market-1501 data format.
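For orientation, Market-1501 image files are conventionally named like 0002_c1s1_000451_03.jpg, which encodes the person ID, the camera and sequence, the frame number, and a bounding-box index. The sketch below parses such a name. The helper and field names are hypothetical, and the Data Annotation Format page remains the authoritative description of what TAO expects.

```python
import re
from typing import NamedTuple, Optional

class Market1501Name(NamedTuple):
    person_id: int   # -1 marks junk detections in the original dataset
    camera: int
    sequence: int
    frame: int
    box: int

# <person>_c<camera>s<sequence>_<frame>_<box>.jpg
_PATTERN = re.compile(r"^(-?\d+)_c(\d+)s(\d+)_(\d+)_(\d+)\.jpg$")

def parse_market1501_name(filename: str) -> Optional[Market1501Name]:
    """Split a Market-1501 style file name into its fields, or return None."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return Market1501Name(*(int(group) for group in match.groups()))

print(parse_market1501_name("0002_c1s1_000451_03.jpg"))
```

Grouping files by person_id in this way is one simple method to check that the number of identities in a training folder matches dataset.num_classes.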

Creating an Experiment Spec File#

The spec file for ReIdentificationNet includes model, dataset, re_ranking, and train parameters. Here is an example spec for training a ResNet model on Market-1501 that contains 751 identities in the training set.

TAO Client (v2 API)

Use the following command to get an experiment spec file for ReIdentificationNet:

    BASE_EXPERIMENT_ID=$(tao re_identification list-base-experiments | jq -r '.[0].id')
    SPECS=$(tao re_identification get-job-schema --action train --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')

TAO Launcher

    results_dir: "/path/to/experiment_results"
    encryption_key: nvidia_tao
    model:
      backbone: resnet_50
      last_stride: 1
      pretrain_choice: imagenet
      pretrained_model_path: "/path/to/pretrained_model.pth"
      input_channels: 3
      input_width: 128
      input_height: 256
      neck: bnneck
      feat_dim: 256
      neck_feat: after
      metric_loss_type: triplet
      with_center_loss: False
      with_flip_feature: False
      label_smooth: True
    dataset:
      train_dataset_dir: "/path/to/train_dataset_dir"
      test_dataset_dir: "/path/to/test_dataset_dir"
      query_dataset_dir: "/path/to/query_dataset_dir"
      num_classes: 751
      batch_size: 64
      val_batch_size: 128
      num_workers: 1
      pixel_mean: [0.485, 0.456, 0.406]
      pixel_std: [0.226, 0.226, 0.226]
      padding: 10
      prob: 0.5
      re_prob: 0.5
      sampler: softmax_triplet
      num_instances: 4
    re_ranking:
      re_ranking: True
      k1: 20
      k2: 6
      lambda_value: 0.3
    train:
      results_dir: "${results_dir}/train"
      optim:
        name: Adam
        lr_monitor: val_loss
        lr_steps: [40, 70]
        gamma: 0.1
        bias_lr_factor: 1
        weight_decay: 0.0005
        weight_decay_bias: 0.0005
        warmup_factor: 0.01
        warmup_iters: 10
        warmup_method: linear
        base_lr: 0.00035
        momentum: 0.9
        center_loss_weight: 0.0005
        center_lr: 0.5
        triplet_loss_margin: 0.3
      num_epochs: 10
      checkpoint_interval: 5
      validation_interval: 5
      seed: 1234

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| model | dict config | – | The configuration of the model architecture | |
| dataset | dict config | – | The configuration of the dataset | |
| train | dict config | – | The configuration of the training task | |
| evaluate | dict config | – | The configuration of the evaluation task | |
| inference | dict config | – | The configuration of the inference task | |
| encryption_key | string | None | The encryption key to encrypt and decrypt model files | |
| results_dir | string | /results | The directory where experiment results are saved | |
| export | dict config | – | The configuration of the ONNX export task | |
| re_ranking | dict config | – | The configuration for the re-ranking module | |

model#

The model parameter provides options to change the ReIdentificationNet architecture.

    model:
      backbone: resnet_50
      last_stride: 1
      pretrain_choice: imagenet
      pretrained_model_path: "/path/to/pretrained_model.pth"
      input_channels: 3
      input_width: 128
      input_height: 256
      neck: bnneck
      feat_dim: 256
      neck_feat: after
      metric_loss_type: triplet
      with_center_loss: False
      with_flip_feature: False
      label_smooth: True

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| backbone | string | resnet_50 | The type of model, which can be resnet_50 or a Swin-based architecture (refer to ReIdentificationNet Transformer for more details) | "resnet_50", "swin_base_patch4_window7_224", "swin_small_patch4_window7_224", "swin_tiny_patch4_window7_224" |
| last_stride | unsigned int | 1 | The number of strides during convolution | >0 |
| pretrain_choice | string | imagenet | The pre-trained network | imagenet/self/"" |
| pretrained_model_path | string | | The path to the pre-trained model | |
| input_channels | unsigned int | 3 | The number of input channels | >0 |
| input_width | int | 128 | The width of the input images | >0 |
| input_height | int | 256 | The height of the input images | >0 |
| neck | string | bnneck | Specifies whether to train with BNNeck | bnneck/"" |
| feat_dim | unsigned int | 256 | The output size of the feature embeddings | >0 |
| neck_feat | string | after | Specifies which feature of BNNeck to use for testing | before/after |
| metric_loss_type | string | triplet | The type of metric loss | triplet/center/triplet_center |
| with_center_loss | bool | False | Specifies whether to enable center loss | True/False |
| with_flip_feature | bool | False | Specifies whether to enable image flipping | True/False |
| label_smooth | bool | True | Specifies whether to enable label smoothing | True/False |

dataset#

The dataset parameter defines the dataset source, training batch size, and augmentation.

    dataset:
      train_dataset_dir: "/path/to/train_dataset_dir"
      test_dataset_dir: "/path/to/test_dataset_dir"
      query_dataset_dir: "/path/to/query_dataset_dir"
      num_classes: 751
      batch_size: 64
      val_batch_size: 128
      num_workers: 1
      pixel_mean: [0.485, 0.456, 0.406]
      pixel_std: [0.226, 0.226, 0.226]
      padding: 10
      prob: 0.5
      re_prob: 0.5
      sampler: softmax_triplet
      num_instances: 4

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| train_dataset_dir | string | | The path to the train images | |
| test_dataset_dir | string | | The path to the test images | |
| query_dataset_dir | string | | The path to the query images | |
| num_classes | unsigned int | 751 | The number of unique person IDs | >0 |
| batch_size | unsigned int | 64 | The batch size for training | >0 |
| val_batch_size | unsigned int | 128 | The batch size for validation | >0 |
| num_workers | unsigned int | 1 | The number of parallel workers processing data | >0 |
| pixel_mean | float list | [0.485, 0.456, 0.406] | The pixel mean for image normalization | float list |
| pixel_std | float list | [0.226, 0.226, 0.226] | The pixel standard deviation for image normalization | float list |
| padding | unsigned int | 10 | The pixel padding size around images for image augmentation | >=1 |
| prob | float | 0.5 | The random horizontal flipping probability for image augmentation | >0 |
| re_prob | float | 0.5 | The random erasing probability for image augmentation | >0 |
| sampler | string | softmax_triplet | The type of sampler for data loading | softmax/triplet/softmax_triplet |
| num_instances | unsigned int | 4 | The number of image instances of the same person in a batch | >0 |

re_ranking#

The re_ranking parameter defines the settings for the re-ranking module.
    re_ranking:
      re_ranking: True
      k1: 20
      k2: 6
      lambda_value: 0.3

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| re_ranking | bool | True | A flag that enables the re-ranking module | True/False |
| k1 | unsigned int | 20 | The k used for k-reciprocal nearest neighbors | >0 |
| k2 | unsigned int | 6 | The k used for local query expansion | >0 |
| lambda_value | float | 0.3 | The weight of original distance in the combination with Jaccard distance | >0.0 |

train#

The train parameter defines the hyperparameters of the training process.

    train:
      optim:
        name: Adam
        lr_monitor: val_loss
        lr_steps: [40, 70]
        gamma: 0.1
        bias_lr_factor: 1
        weight_decay: 0.0005
        weight_decay_bias: 0.0005
        warmup_factor: 0.01
        warmup_iters: 10
        warmup_method: linear
        base_lr: 0.00035
        momentum: 0.9
        center_loss_weight: 0.0005
        center_lr: 0.5
        triplet_loss_margin: 0.3
      num_epochs: 10
      checkpoint_interval: 5
      validation_interval: 5
      seed: 1234

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| num_gpus | unsigned int | 1 | The number of GPUs to use for distributed training | >0 |
| gpu_ids | List[int] | [0] | The indices of the GPUs to use for distributed training | |
| seed | unsigned int | 1234 | The random seed for random, NumPy, and torch | >0 |
| num_epochs | unsigned int | 10 | The total number of epochs to run the experiment | >0 |
| checkpoint_interval | unsigned int | 1 | The epoch interval at which the checkpoints are saved | >0 |
| validation_interval | unsigned int | 1 | The epoch interval at which the validation is run | >0 |
| resume_training_checkpoint_path | string | | The intermediate PyTorch Lightning checkpoint to resume training from | |
| results_dir | string | /results/train | The directory to save training results | |
| optim | dict config | | The configuration for the optimizer, including the learning rate, learning scheduler, and weight decay | |
| clip_grad_norm | float | 0.0 | The amount to clip the gradient by the L2 norm. A value of 0.0 specifies no clipping. | >=0 |

optim#

The optim parameter defines the config for the optimizer in training, including the learning rate, learning scheduler, and weight decay.

    optim:
      name: Adam
      lr_monitor: val_loss
      lr_steps: [40, 70]
      gamma: 0.1
      bias_lr_factor: 1
      weight_decay: 0.0005
      weight_decay_bias: 0.0005
      warmup_factor: 0.01
      warmup_iters: 10
      warmup_method: linear
      base_lr: 0.00035
      momentum: 0.9
      center_loss_weight: 0.0005
      center_lr: 0.5
      triplet_loss_margin: 0.3

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| name | string | Adam | The name of the optimizer | Adam/SGD/Adamax/… |
| lr_monitor | string | val_loss | The monitor value for the AutoReduce scheduler | val_loss/train_loss |
| lr_steps | int list | [40, 70] | The steps to decrease the learning rate for the MultiStep scheduler | int list |
| gamma | float | 0.1 | The decay rate for the WarmupMultiStepLR | >0.0 |
| bias_lr_factor | float | 1 | The bias learning rate factor for the WarmupMultiStepLR | >=1 |
| weight_decay | float | 0.0005 | The weight decay coefficient for the optimizer | >0.0 |
| weight_decay_bias | float | 0.0005 | The weight decay bias for the optimizer | >0.0 |
| warmup_factor | float | 0.01 | The warmup factor for the WarmupMultiStepLR scheduler | >0.0 |
| warmup_iters | unsigned int | 10 | The number of warmup iterations for the WarmupMultiStepLR scheduler | >0 |
| warmup_method | string | linear | The warmup method for the optimizer | linear/cosine |
| base_lr | float | 0.00035 | The initial learning rate for the training | >0.0 |
| momentum | float | 0.9 | The momentum for the optimizer | >0.0 |
| center_loss_weight | float | 0.0005 | The balanced weight of center loss | >0.0 |
| center_lr | float | 0.5 | The learning rate of SGD to learn the centers of center loss | >0.0 |
| triplet_loss_margin | float | 0.3 | The margin value for triplet loss | >0.0 |
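To make the interaction of base_lr, lr_steps, gamma, warmup_factor, and warmup_iters more concrete, here is a sketch of a warmup multi-step schedule as it is commonly formulated in re-identification baselines. The function name and the exact formula are assumptions for illustration; TAO's WarmupMultiStepLR may differ in detail.

```python
from bisect import bisect_right

def warmup_multistep_lr(epoch, base_lr=0.00035, lr_steps=(40, 70), gamma=0.1,
                        warmup_factor=0.01, warmup_iters=10):
    """Learning rate at a given epoch under linear warmup plus step decay."""
    if epoch < warmup_iters:
        # Linear ramp from warmup_factor * base_lr up to base_lr.
        alpha = epoch / warmup_iters
        scale = warmup_factor * (1 - alpha) + alpha
    else:
        scale = 1.0
    # Multiply by gamma once for every milestone already reached.
    return base_lr * scale * gamma ** bisect_right(lr_steps, epoch)

for epoch in (0, 5, 10, 40, 70):
    print(epoch, warmup_multistep_lr(epoch))
```

Under these assumptions, a run whose num_epochs is smaller than the first entry of lr_steps never reaches a decay step, so only the warmup phase changes the learning rate.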

Training the Model#

Use the following command to run ReIdentificationNet training:

TAO Client (v2 API)

    TRAIN_JOB_ID=$(tao re_identification create-job \
      --kind experiment \
      --name "re_identification_train" \
      --action train \
      --workspace-id $WORKSPACE_ID \
      --specs "$TRAIN_SPECS" \
      --train-datasets '["'$DATASET_ID'"]' \
      --eval-dataset "$DATASET_ID" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model re_identification train [-h] -e <experiment_spec>
                                      [results_dir=<global_results_dir>]
                                      [model.<model_option>=<model_option_value>]
                                      [dataset.<dataset_option>=<dataset_option_value>]
                                      [train.<train_option>=<train_option_value>]
                                      [train.gpu_ids=<gpu indices>]
                                      [train.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file : The path to the experiment spec file.

Optional Arguments

You can set optional arguments to override the option values in the experiment spec file.

-h, --help : Show this help message and exit.

model.<model_option> : The model options.

dataset.<dataset_option> : The dataset options.

re_ranking.<rerank_option> : The re-ranking options.

train.<train_option> : The train options.

train.optim.<optim_option> : The optimizer options

Note

For training, evaluation, and inference, we expose two variables for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent, for example num_gpus = 1 and gpu_ids = [0, 1], then they are modified to follow the setting that implies more GPUs; in the same example, num_gpus is modified from 1 to 2.

In some cases, multi-GPU training may result in a segmentation fault. You can circumvent this by setting the environment variable OMP_NUM_THREADS to 1. Depending on your mode of execution, you may use the following methods to set this variable:

CLI Launcher: You may set the environment variable by adding the following fields to the Envs field of your ~/.tao_mounts.json file, as mentioned in bullet 3 in the section Running the launcher.

    {
        "Envs": [
            {
                "variable": "OMP_NUM_THREADS",
                "value": "1"
            }
        ]
    }

Docker: You may set environment variables in Docker by setting the -e flag in the Docker command line.

    docker run -it --rm --gpus all \
      -e OMP_NUM_THREADS=1 \
      -v /path/to/local/mount:/path/to/docker/mount nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt <model> train -e

Checkpointing and Resuming Training

At every train.checkpoint_interval, a PyTorch Lightning checkpoint is saved. It is called model_epoch_<epoch_num>.pth. Checkpoints are saved in train.results_dir, like this:

    $ ls /results/train
    'model_epoch_000.pth'
    'model_epoch_001.pth'
    'model_epoch_002.pth'
    'model_epoch_003.pth'
    'model_epoch_004.pth'

The latest checkpoint is saved as reid_model_latest.pth. Training automatically resumes from reid_model_latest.pth if it exists in train.results_dir. This is superseded by train.resume_training_checkpoint_path, if it is provided.

The major implication of this logic is that, if you wish to trigger fresh training from scratch, either:

Specify a new, empty results directory (Recommended)

Remove the latest checkpoint from the results directory
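The resume precedence described above can be summarized in a few lines. This is a sketch of the documented behavior, not TAO's actual implementation; the function name is hypothetical, and only the file name reid_model_latest.pth and the two parameters come from this page.

```python
from pathlib import Path
from typing import Optional

LATEST_NAME = "reid_model_latest.pth"

def resolve_resume_checkpoint(
    results_dir: str,
    resume_training_checkpoint_path: Optional[str] = None,
) -> Optional[str]:
    """Return the checkpoint to resume from, or None for a fresh run."""
    # An explicit path supersedes the automatic lookup.
    if resume_training_checkpoint_path:
        return resume_training_checkpoint_path
    # Otherwise resume from the latest checkpoint if one exists.
    latest = Path(results_dir) / LATEST_NAME
    if latest.is_file():
        return str(latest)
    # Empty results directory: training starts from scratch.
    return None

print(resolve_resume_checkpoint("/path/to/new_empty_results_dir"))
```

Seen this way, both options for a fresh start simply make the second branch fail: either the directory is new, or the latest checkpoint is no longer in it.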

Evaluating the Model#

The evaluation metrics of ReIdentificationNet are mean average precision (mAP) and ranked accuracy. The plots of sampled matches and the cumulative matching characteristic (CMC) curve can be obtained using the evaluate.output_sampled_matches_plot and evaluate.output_cmc_curve_plot parameters, respectively.

Use the following command to run ReIdentificationNet evaluation:

TAO Client (v2 API)

    EVAL_JOB_ID=$(tao re_identification create-job \
      --kind experiment \
      --name "re_identification_evaluate" \
      --action evaluate \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --eval-dataset "$DATASET_ID" \
      --specs "$EVALUATE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model re_identification evaluate [-h] -e <experiment_spec_file>
                                         evaluate.checkpoint=<model to be evaluated>
                                         evaluate.output_sampled_matches_plot=<path to the output sampled matches plot>
                                         evaluate.output_cmc_curve_plot=<path to the output CMC curve plot>
                                         evaluate.test_dataset=<path to test data>
                                         evaluate.query_dataset=<path to query data>
                                         [evaluate.<evaluate_option>=<evaluate_option_value>]
                                         [evaluate.gpu_ids=<gpu indices>]
                                         [evaluate.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file : The experiment spec file to set up the evaluation experiment

evaluate.checkpoint : The .pth model

evaluate.output_sampled_matches_plot : The path to the plotted file of sampled matches

evaluate.output_cmc_curve_plot : The path to the plotted file of the CMC curve

evaluate.test_dataset : The path to the test data

evaluate.query_dataset : The path to the query data

Optional Arguments

evaluate.gpu_ids : The GPU indices to run evaluation. Defaults to [0].

evaluate.num_gpus : The number of GPUs to run evaluation. Defaults to 1.

evaluate.results_dir : The directory to save the evaluation results. Defaults to /results/evaluate.

Multi-GPU evaluation is not supported for Re-Identification.
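For readers unfamiliar with the two evaluation metrics, the sketch below computes average precision per query and rank-k accuracy (one point on the CMC curve) from ranked match lists. It is a simplified illustration with hypothetical function names and toy data; a full Market-1501 style evaluation typically also filters out junk images and same-camera matches, which is omitted here.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True when the i-th ranked gallery image
    shows the same identity as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def rank_k_accuracy(all_ranked_matches, k):
    """Fraction of queries with at least one correct match in the top k."""
    found = sum(1 for matches in all_ranked_matches if any(matches[:k]))
    return found / len(all_ranked_matches)

# Toy ranked results for two queries.
queries = [
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True, False],  # AP = 1/2
]
mean_ap = sum(average_precision(q) for q in queries) / len(queries)
print(mean_ap, rank_k_accuracy(queries, 1), rank_k_accuracy(queries, 2))
```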

Running Inference on the Model#

Use the following command to run inference on ReIdentificationNet with the .pth model.

TAO Client (v2 API)

    INFERENCE_JOB_ID=$(tao re_identification create-job \
      --kind experiment \
      --name "re_identification_inference" \
      --action inference \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --inference-dataset "$DATASET_ID" \
      --specs "$INFERENCE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model re_identification inference [-h] -e <experiment_spec>
                                          inference.checkpoint=<inference model>
                                          inference.output_file=<path to output file>
                                          inference.test_dataset=<path to gallery data>
                                          inference.query_dataset=<path to query data>
                                          [inference.<infer_option>=<infer_option_value>]
                                          [inference.gpu_ids=<gpu indices>]
                                          [inference.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

-e, --experiment_spec : The experiment spec file to set up inference

inference.checkpoint : The .pth model to perform inference with

inference.output_file : The path to the output JSON file

inference.test_dataset : The path to the test data

inference.query_dataset : The path to the query data

Optional Arguments

inference.gpu_ids : The GPU indices to run inference. Defaults to [0].

inference.num_gpus : The number of GPUs to run inference. Defaults to 1.

inference.results_dir : The directory to save the inference results. Defaults to /results/inference.

The output is a JSON file that contains the feature embeddings of all the test and query data. Multi-GPU inference is currently not supported for Re-Identification.

The expected output is as follows:

    [
        {
            "img_path": "/path/to/img1.jpg",
            "embedding": [-0.30, 0.12, 0.13, ...]
        },
        {
            "img_path": "/path/to/img2.jpg",
            "embedding": [-0.10, -0.06, -1.85, ...]
        },
        ...
        {
            "img_path": "/path/to/imgN.jpg",
            "embedding": [1.41, 0.63, -0.15, ...]
        }
    ]
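Downstream code can consume this output with any JSON parser. The sketch below loads a shortened sample of the structure shown above and finds the image whose embedding is closest to a given one. The helper names, the use of Euclidean distance, and the three-element vectors are assumptions for illustration; real embeddings are much longer.

```python
import json
import math

# Shortened stand-in for the contents of inference.output_file.
sample_output = """
[
  {"img_path": "/path/to/img1.jpg", "embedding": [-0.30, 0.12, 0.13]},
  {"img_path": "/path/to/img2.jpg", "embedding": [-0.10, -0.06, -1.85]},
  {"img_path": "/path/to/imgN.jpg", "embedding": [1.41, 0.63, -0.15]}
]
"""

def load_embeddings(text):
    """Map each image path to its embedding vector."""
    return {item["img_path"]: item["embedding"] for item in json.loads(text)}

def nearest(path, embeddings):
    """Return the other image whose embedding is closest to that of path."""
    target = embeddings[path]
    others = (p for p in embeddings if p != path)
    return min(others, key=lambda p: math.dist(target, embeddings[p]))

embeddings = load_embeddings(sample_output)
closest = nearest("/path/to/img1.jpg", embeddings)
print(closest)
```

With a real output file, the string would be replaced by the file contents, for example by reading the path passed as inference.output_file.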

Exporting the Model#

Use the following command to export ReIdentificationNet to .onnx format for deployment:

TAO Client (v2 API)

    EXPORT_JOB_ID=$(tao re_identification create-job \
      --kind experiment \
      --name "re_identification_export" \
      --action export \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --specs "$EXPORT_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model re_identification export -e <experiment_spec>
                                       export.checkpoint=<checkpoint to be exported>
                                       export.onnx_file=<path to exported file>
                                       [export.gpu_id=<gpu index>]

Required Arguments

The following arguments are required.

-e, --experiment_spec : The experiment spec file to set up export.

export.checkpoint : The .pth model to be exported.

export.onnx_file : The path to save the exported model to. The default path is in the same directory as the *.pth model.

Optional Arguments

The following arguments are optional to run the command.

export.gpu_id : The index of the GPU that will be used to run the export. You can specify this value when the machine has multiple GPUs installed. Note that export can only run on a single GPU.