ReIdentificationNet
The ReIdentificationNet models generate embeddings to identify objects captured in different scenes.
The model is essentially a ResNet50 backbone that takes cropped images of objects as input and produces feature embeddings as output.
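For illustration, the following is a minimal sketch of running an exported ReIdentificationNet ONNX model on a cropped image with onnxruntime. The file names, input tensor layout, and ImageNet-style normalization constants are assumptions for this sketch, not taken from the model card; check them against the model card and your deployment pipeline before use.

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

# Hypothetical file names; substitute the actual ONNX model and crop paths.
session = ort.InferenceSession("reidentificationnet.onnx")
input_name = session.get_inputs()[0].name

# Preprocess a cropped person image to the 256x128 input resolution.
crop = Image.open("person_crop.jpg").convert("RGB").resize((128, 256))
x = np.asarray(crop, dtype=np.float32) / 255.0
# Assumption: ImageNet-style mean/std normalization and NCHW layout.
x = (x - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
x = x.transpose(2, 0, 1)[np.newaxis].astype(np.float32)

embedding = session.run(None, {input_name: x})[0]
print(embedding.shape)  # feature dimension depends on the model variant
```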
The model is trained on the Market-1501 dataset with 751 annotated people and a sampled version of the MTMC people tracking dataset of the 2023 AI City Challenge with 156 annotated people. The dataset statistics are as follows:
| Subset | No. total identities | No. total images | No. total cameras | No. real identities | No. real images | No. real cameras | No. synthetic identities | No. synthetic images | No. synthetic cameras |
|---|---|---|---|---|---|---|---|---|---|
| Train | 907 | 44070 | 135 | 759 | 14537 | 13 | 148 | 29533 | 122 |
| Test | 907 | 28768 | 135 | 759 | 21163 | 13 | 148 | 7605 | 122 |
| Query | 906 | 4356 | 135 | 758 | 3539 | 13 | 148 | 817 | 122 |
Model Card
The datasheet for the models is captured in the model card hosted on NGC, which includes detailed instructions for deploying the models with DeepStream.
TAO Fine-Tuning
You can also retrain or fine-tune the ReIdentificationNet models on custom datasets. Refer to the TAO tutorial notebook and TAO documentation for more details.
Currently, there are two methods for fine-tuning ReIdentificationNet:

- REST API Notebook: reid-model-finetuning-tao-api.ipynb
- TAO Toolkit Notebook: reid-model-finetuning-tao-toolkit.ipynb
The REST API notebook is available under metropolis-apps-standalone-deployment/notebooks/<version>_reference_apps/reid-model-finetuning-tao-api.ipynb, whereas the TAO Toolkit notebook can be found here.
TAO Toolkit is supported on discrete GPUs, such as H100, A100, A40, A30, A2, A16, A100x, A30x, V100, T4, Titan-RTX, and Quadro-RTX. Refer to the TAO Toolkit documentation for more details on the recommended hardware requirements. The expected time to fine-tune ReIdentificationNet is as follows:
| Backbone Type | GPU Type | No. of training images | Image Size | No. of identities | Batch size | Total Epochs | Total Training Time |
|---|---|---|---|---|---|---|---|
| ResNet50 | 1 x NVIDIA A100 - 80GB PCIe | 13,000 | 256x128x3 | 751 | 128 | 120 | ~1.25 hours |
| ResNet50 | 1 x NVIDIA Quadro GV100 - 32GB | 13,000 | 256x128x3 | 751 | 64 | 120 | ~2.5 hours |
REST API Notebook Guide
The documentation provided below accompanies the cells in the API notebook and offers guidance on how to execute them:
Environment Setup for Notebook:
```bash
conda create -n reid_finetuning python=3.11
conda activate reid_finetuning
conda install jupyterlab
```
Notebook Configuration cell: Once the environment setup is complete, the notebook asks you to set the TAO API, NGC, and synthetic data location variables. Access to a TAO API server can be obtained via self-hosting.
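As a rough illustration, this configuration step boils down to assigning a few variables similar to the following. The variable names and placeholder values here are hypothetical; use the names defined in the notebook itself.

```python
# Hypothetical variable names and placeholder values; the notebook defines
# its own names. Point these at your self-hosted TAO API server, your NGC
# API key, and the directory containing the sampled synthetic data.
tao_api_base_url = "http://<tao-api-host>:<port>"
ngc_api_key = "<your-ngc-api-key>"
synthetic_data_dir = "/path/to/synthetic/reid/data"
```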
ReID Training Configuration & Train ReID cell: For fine-tuning, train_num_epochs is set to 10 epochs. It is essential to monitor the training process for overfitting or underfitting by evaluating the final checkpoint on the test data. If overfitting is observed, you can incrementally generate more samples and resume training until good results on the test data are validated. The TAO API also provides finer control over the configuration; for more details about TAO ReIdentificationNet configurations, refer to these docs.
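The snippet below is a minimal sketch of the kind of training overrides this cell passes to the TAO API. Only train_num_epochs = 10 comes from this guide; the other key and the exact spec structure are assumptions that may differ across TAO versions, so consult the linked configuration docs for the authoritative schema.

```python
# Sketch only: train_num_epochs is documented above; the batch-size key
# name is an assumption and may not match your TAO version's spec schema.
train_spec_overrides = {
    "train_num_epochs": 10,   # short fine-tuning run; monitor for over/underfitting
    "train_batch_size": 64,   # assumption: adjust to fit available GPU memory
}
```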
Evaluate Trained Model cell: Once the model is trained, evaluate the mAP and rank-1 accuracy scores to ensure that training was performed correctly.
Export to ONNX cell: After successful evaluation, you can export the model to ONNX format for easy deployment by launching an export job. The job artifacts can then be downloaded to get the exported ONNX model file.
Accuracy
The goal of re-identification is, for each query, to identify the test samples that share its identity.
The key performance indicators are the ranked accuracy of re-identification and the mean average precision (mAP).
Rank-K accuracy: A method of computing accuracy in which the top-K highest-confidence labels are matched against the ground-truth label. If the ground-truth label falls within these top-K labels, the prediction is counted as correct. This provides an overall accuracy measurement while being lenient on predictions when the number of classes is very high and the classes are similar. In our case, we compute rank-1, rank-5, and rank-10 accuracies. For rank-10, this means a sample is counted as correct if any of its top-10 highest-confidence predicted labels matches the ground-truth label.
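As a concrete illustration, the sketch below computes rank-K accuracy from a query-to-gallery distance matrix between embeddings. It is simplified for illustration only: the official Market-1501 protocol additionally filters out junk and same-camera gallery matches, which this sketch omits.

```python
import numpy as np

def rank_k_accuracy(dist, query_ids, gallery_ids, k=1):
    """Fraction of queries whose true identity appears among the k
    nearest gallery embeddings (smaller distance = more similar)."""
    hits = 0
    for i, qid in enumerate(query_ids):
        nearest = np.argsort(dist[i])[:k]   # indices of the k closest gallery samples
        if qid in gallery_ids[nearest]:
            hits += 1
    return hits / len(query_ids)

# Toy example: 2 queries, 3 gallery samples
dist = np.array([[0.2, 0.9, 0.5],
                 [0.7, 0.1, 0.4]])
query_ids = np.array([1, 2])
gallery_ids = np.array([1, 2, 3])
print(rank_k_accuracy(dist, query_ids, gallery_ids, k=1))  # 1.0
```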
Mean average precision: Precision measures how accurate the predictions are, in our case the predicted ID of an object; in other words, it is the percentage of predictions that are correct. mAP is the mean of the average precision (AP), where AP is computed for each class, in our case each ID.
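Similarly, here is a simplified sketch of mAP over the same kind of distance matrix, again omitting the junk and same-camera filtering used by the official evaluation.

```python
import numpy as np

def mean_average_precision(dist, query_ids, gallery_ids):
    """mAP over all queries. For each query, AP averages the precision
    measured at every rank where a correct gallery match is retrieved."""
    aps = []
    for i, qid in enumerate(query_ids):
        order = np.argsort(dist[i])                        # gallery sorted by distance
        matches = (gallery_ids[order] == qid).astype(float)
        if matches.sum() == 0:
            continue                                       # no true match in gallery
        cumulative_hits = np.cumsum(matches)
        precision_at_rank = cumulative_hits / np.arange(1, len(matches) + 1)
        aps.append(precision_at_rank[matches == 1].mean())
    return float(np.mean(aps))
```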
The experimental results on the test set of Market-1501 are listed as follows.
| Feature dimension | mAP | Rank-1 accuracy | Rank-5 accuracy | Rank-10 accuracy |
|---|---|---|---|---|
| 64 | 91.0% | 93.4% | 96.7% | 97.7% |
| 128 | 92.1% | 94.5% | 96.9% | 97.9% |
| 256 | 93.0% | 94.7% | 97.3% | 98.0% |
| 512 | 93.4% | 95.1% | 97.5% | 98.1% |
| 1024 | 93.7% | 94.8% | 97.5% | 98.2% |
| 2048 | 93.9% | 95.3% | 98.0% | 98.4% |