ReIdentificationNet

ReIdentificationNet takes cropped images of a person from different perspectives as network input and outputs the embedding features for that person. The embeddings are used to perform similarity matching to re-identify the same person. The model supported in the current version is based on ResNet, which is the most commonly used baseline for re-identification due to its high accuracy.
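As an illustration of the similarity-matching step (not part of the TAO tooling), the sketch below ranks gallery embeddings against a query embedding by cosine similarity; the helper name rank_gallery and the array shapes are assumptions for the example:

import numpy as np

def rank_gallery(query_emb: np.ndarray, gallery_embs: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least similar to the query.

    query_emb:    (D,)   embedding of one cropped person image
    gallery_embs: (N, D) embeddings of the gallery (test) crops
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    similarities = g @ q                  # (N,)
    return np.argsort(-similarities)      # best match first

# Toy usage with random 256-dimensional embeddings (feat_dim: 256).
query = np.random.randn(256)
gallery = np.random.randn(10, 256)
print(rank_gallery(query, gallery)[:5])   # indices of the 5 closest gallery crops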

ReIdentificationNet requires cropped images as input. These images are resized to 256x128 for model input. Random transformation is applied to each image during training.
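The exact augmentation is controlled by the dataset_config parameters described later (padding, prob, re_prob, pixel_mean, pixel_std). As a rough illustration only, an equivalent pipeline could be sketched with torchvision; this is an assumed reconstruction from those parameters, not the TAO implementation:

import torchvision.transforms as T

# Assumed reconstruction of the training-time transform from the spec parameters.
train_transform = T.Compose([
    T.Resize((256, 128)),                      # input_size: [256, 128]
    T.RandomHorizontalFlip(p=0.5),             # prob: 0.5
    T.Pad(10),                                 # padding: 10
    T.RandomCrop((256, 128)),                  # crop back to the model input size
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],    # pixel_mean
                std=[0.226, 0.226, 0.226]),    # pixel_std
    T.RandomErasing(p=0.5),                    # re_prob: 0.5
])

# Evaluation only resizes and normalizes.
eval_transform = T.Compose([
    T.Resize((256, 128)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.226, 0.226, 0.226]),
])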

The data should be organized in the following structure:

/Dataset
    /bounding_box_train
        0002_c1s1_000451_03.jpg
        0002_c1s1_000551_01.jpg
        ...
        1500_c6s3_086567_01.jpg
    /bounding_box_test
        0000_c1s1_000151_01.jpg
        0000_c1s1_000376_03.jpg
        ...
        1501_c6s4_001902_01.jpg
    /query
        0001_c1s1_001051_00.jpg
        0001_c2s1_000301_00.jpg
        ...
        1501_c6s4_001877_00.jpg

The root directory of the dataset contains sub-directories for training, testing, and query. Each sub-directory contains the cropped images of different identities. For example, in the image 0001_c1s1_001051_00.jpg, 0001 is the unique ID assigned to the person, c1s1 indicates the first sequence s1 of camera c1, and 001051 is the frame index within the sequence c1s1. The contents after the third _ are ignored.
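For illustration, the naming convention can be parsed as follows; the helper names parse_market1501_name and count_identities are made up for this example, and counting the unique IDs in bounding_box_train gives the value expected for num_classes in the spec below:

import glob
import os

def parse_market1501_name(filename: str):
    """Split a crop file name such as 0001_c1s1_001051_00.jpg into its fields."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    person_id, cam_seq, frame, _ = stem.split("_", 3)   # fields after the third "_" are ignored
    return {"person_id": person_id, "camera_sequence": cam_seq, "frame": frame}

def count_identities(train_dir: str) -> int:
    """Count unique person IDs in bounding_box_train; this should match num_classes."""
    ids = {parse_market1501_name(f)["person_id"]
           for f in glob.glob(os.path.join(train_dir, "*.jpg"))}
    return len(ids)

print(parse_market1501_name("0001_c1s1_001051_00.jpg"))
# {'person_id': '0001', 'camera_sequence': 'c1s1', 'frame': '001051'}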

The spec file for ReIdentificationNet includes model_config, train_config, dataset_config, and re_ranking_config parameters. Here is an example spec for training a ResNet50 model on Market-1501, which contains 751 identities in its training set.

model_config:
  backbone: resnet50
  last_stride: 1
  pretrain_choice: imagenet
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_size: [256, 128]
  neck: bnneck
  feat_dim: 256
  num_classes: 751
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: True
train_config:
  optim:
    name: Adam
    lr_monitor: val_loss
    steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 1
    weight_decay: 0.0005
    weight_decay_bias: 0.0005
    warmup_factor: 0.01
    warmup_iters: 10
    warmup_method: linear
    base_lr: 0.00035
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
  epochs: 120
  checkpoint_interval: 10
dataset_config:
  train_dataset_dir: "/path/to/train_dataset_dir"
  val_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  batch_size: 64
  val_batch_size: 128
  workers: 8
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instance: 4
re_ranking_config:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3

| Parameter | Data Type | Default | Description |
|---|---|---|---|
| model_config | dict config | | The configuration for the model architecture |
| train_config | dict config | | The configuration for the training process |
| dataset_config | dict config | | The configuration for the dataset |
| re_ranking_config | dict config | | The configuration for the re-ranking module |

model_config

The model_config parameter provides options to change the ReIdentificationNet architecture.

model_config:
  backbone: resnet50
  last_stride: 1
  pretrain_choice: imagenet
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_size: [256, 128]
  neck: bnneck
  feat_dim: 256
  num_classes: 751
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: True

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| backbone | string | resnet50 | The type of model, which can currently only be resnet50 | resnet50 |
| last_stride | unsigned int | 1 | The stride of the last ResNet convolution stage | >0 |
| pretrain_choice | string | imagenet | Specifies the pretrained network | imagenet/"" |
| pretrained_model_path | string | | The path to the pretrained model | |
| input_channels | unsigned int | 3 | The number of input channels | >0 |
| input_size | int list | [256, 128] | The input size of the images | int list |
| neck | string | bnneck | Specifies whether to train with BNNeck | bnneck/"" |
| feat_dim | unsigned int | 256 | The output size of the feature embeddings | >0 |
| num_classes | unsigned int | 751 | The number of unique person IDs | >0 |
| neck_feat | string | after | Specifies which feature of BNNeck to use for testing | before/after |
| metric_loss_type | string | triplet | The type of metric loss | triplet/center/triplet_center |
| with_center_loss | bool | False | Specifies whether to enable center loss | True/False |
| with_flip_feature | bool | False | Specifies whether to enable image flipping | True/False |
| label_smooth | bool | True | Specifies whether to enable label smoothing | True/False |
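To make the neck, feat_dim, and neck_feat options concrete, the following is a schematic PyTorch sketch of a BNNeck-style head in the spirit of the common "bag of tricks" re-identification baseline. It is an illustration of the idea rather than the TAO implementation, and the class name ReIDHead is invented for the example:

import torch
import torch.nn as nn

class ReIDHead(nn.Module):
    """Schematic BNNeck head: backbone features -> BN -> classifier."""

    def __init__(self, in_dim=2048, feat_dim=256, num_classes=751, neck_feat="after"):
        super().__init__()
        self.reduce = nn.Linear(in_dim, feat_dim)       # project to feat_dim (e.g. 256)
        self.bnneck = nn.BatchNorm1d(feat_dim)          # the "bnneck" layer
        self.classifier = nn.Linear(feat_dim, num_classes, bias=False)
        self.neck_feat = neck_feat

    def forward(self, backbone_feat):
        feat_before = self.reduce(backbone_feat)        # used by the triplet (metric) loss
        feat_after = self.bnneck(feat_before)           # used by the ID (softmax) loss
        if self.training:
            return self.classifier(feat_after), feat_before
        # At test time, neck_feat selects which embedding is returned.
        return feat_after if self.neck_feat == "after" else feat_before

head = ReIDHead()
head.eval()
emb = head(torch.randn(4, 2048))   # (4, 256) embeddings for 4 cropped images
print(emb.shape)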

train_config

The train_config parameter defines the hyperparameters of the training process.

train_config:
  optim:
    name: Adam
    lr_monitor: val_loss
    steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 1
    weight_decay: 0.0005
    weight_decay_bias: 0.0005
    warmup_factor: 0.01
    warmup_iters: 10
    warmup_method: linear
    base_lr: 0.00035
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
  epochs: 120
  checkpoint_interval: 10

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| optim | dict config | | The configuration for the optimizer, including the learning rate, learning-rate scheduler, and weight decay | |
| epochs | unsigned int | 120 | The total number of epochs to run the experiment | >0 |
| checkpoint_interval | unsigned int | 10 | The interval at which the checkpoints are saved | >0 |
| clip_grad_norm | float | 0.0 | The amount to clip the gradient by the L2 norm. A value of 0.0 specifies no clipping | >=0 |
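The clip_grad_norm parameter corresponds to standard L2-norm gradient clipping. A minimal PyTorch sketch of what a non-zero value implies during a training step is shown below (illustrative only; the model and optimizer are placeholders):

import torch

clip_grad_norm = 1.0   # example value; 0.0 in the spec means no clipping

model = torch.nn.Linear(256, 751)
optimizer = torch.optim.Adam(model.parameters(), lr=3.5e-4)

loss = model(torch.randn(8, 256)).sum()
loss.backward()
if clip_grad_norm > 0.0:
    # Rescale gradients so their global L2 norm does not exceed clip_grad_norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_grad_norm)
optimizer.step()
optimizer.zero_grad()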

optim

The optim parameter defines the configuration of the optimizer used in training, including the learning rate, learning-rate scheduler, and weight decay.

optim:
  name: Adam
  lr_monitor: val_loss
  steps: [40, 70]
  gamma: 0.1
  bias_lr_factor: 1
  weight_decay: 0.0005
  weight_decay_bias: 0.0005
  warmup_factor: 0.01
  warmup_iters: 10
  warmup_method: linear
  base_lr: 0.00035
  momentum: 0.9
  center_loss_weight: 0.0005
  center_lr: 0.5
  triplet_loss_margin: 0.3

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| name | string | Adam | The name of the optimizer | Adam/SGD/Adamax/… |
| lr_monitor | string | val_loss | The monitor value for the AutoReduce scheduler | val_loss/train_loss |
| steps | int list | [40, 70] | The steps at which to decrease the learning rate for the MultiStep scheduler | int list |
| gamma | float | 0.1 | The decay rate for the WarmupMultiStepLR scheduler | >0.0 |
| bias_lr_factor | float | 1 | The bias learning rate factor for the WarmupMultiStepLR scheduler | >=1 |
| weight_decay | float | 0.0005 | The weight decay coefficient for the optimizer | >0.0 |
| weight_decay_bias | float | 0.0005 | The weight decay coefficient applied to bias parameters | >0.0 |
| warmup_factor | float | 0.01 | The warmup factor for the WarmupMultiStepLR scheduler | >0.0 |
| warmup_iters | unsigned int | 10 | The number of warmup iterations for the WarmupMultiStepLR scheduler | >0 |
| warmup_method | string | linear | The warmup method for the optimizer | constant/linear |
| base_lr | float | 0.00035 | The initial learning rate for training | >0.0 |
| momentum | float | 0.9 | The momentum of the optimizer | >0.0 |
| center_loss_weight | float | 0.0005 | The balancing weight of the center loss | >0.0 |
| center_lr | float | 0.5 | The learning rate of SGD for learning the centers of the center loss | >0.0 |
| triplet_loss_margin | float | 0.3 | The margin value for the triplet loss | >0.0 |
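To see how the warmup and step-decay parameters interact, the sketch below approximates the resulting WarmupMultiStepLR schedule (linear warmup to base_lr, then multiplication by gamma at each milestone in steps). It is an approximation for illustration, not the exact scheduler code used by TAO:

import bisect

def warmup_multistep_lr(epoch, base_lr=0.00035, steps=(40, 70), gamma=0.1,
                        warmup_factor=0.01, warmup_iters=10, warmup_method="linear"):
    """Approximate learning rate at a given epoch for a WarmupMultiStepLR schedule."""
    factor = 1.0
    if epoch < warmup_iters:
        if warmup_method == "constant":
            factor = warmup_factor
        else:  # linear warmup from warmup_factor up to 1.0
            alpha = epoch / warmup_iters
            factor = warmup_factor * (1 - alpha) + alpha
    # Step decay: multiply by gamma for every milestone already passed.
    decay = gamma ** bisect.bisect_right(list(steps), epoch)
    return base_lr * factor * decay

for e in (0, 5, 10, 40, 70, 119):
    print(e, round(warmup_multistep_lr(e), 8))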

dataset_config

The dataset_config parameter defines the dataset source, training batch size, and augmentation.

dataset_config:
  train_dataset_dir: "/path/to/train_dataset_dir"
  val_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  batch_size: 64
  val_batch_size: 128
  workers: 8
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instance: 4

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| train_dataset_dir | string | | The path to the train images | |
| test_dataset_dir | string | | The path to the test images | |
| query_dataset_dir | string | | The path to the query images | |
| batch_size | unsigned int | 64 | The batch size for training | >0 |
| val_batch_size | unsigned int | 128 | The batch size for validation | >0 |
| workers | unsigned int | 8 | The number of parallel workers processing data | >0 |
| pixel_mean | float list | [0.485, 0.456, 0.406] | The pixel mean for image normalization | float list |
| pixel_std | float list | [0.226, 0.226, 0.226] | The pixel standard deviation for image normalization | float list |
| padding | unsigned int | 10 | The pixel padding size around images for image augmentation | >=1 |
| prob | float | 0.5 | The random horizontal flipping probability for image augmentation | >0 |
| re_prob | float | 0.5 | The random erasing probability for image augmentation | >0 |
| sampler | string | softmax_triplet | The type of sampler for data loading | softmax/triplet/softmax_triplet |
| num_instance | unsigned int | 4 | The number of image instances of the same person in a batch | >0 |
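The sampler and num_instance settings describe identity-balanced batches: with batch_size 64 and num_instance 4, each batch holds 16 identities with 4 crops each, which is what the triplet loss needs. A rough sketch of such a sampler is shown below; the helper name pk_batch is illustrative only:

import random
from collections import defaultdict

def pk_batch(labels, batch_size=64, num_instance=4):
    """Sample one identity-balanced batch of indices from a list of person-ID labels."""
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)

    num_ids = batch_size // num_instance                 # 64 // 4 = 16 identities
    chosen_ids = random.sample(list(by_id), num_ids)
    batch = []
    for pid in chosen_ids:
        pool = by_id[pid]
        # Sample with replacement if an identity has fewer than num_instance crops.
        if len(pool) < num_instance:
            picks = random.choices(pool, k=num_instance)
        else:
            picks = random.sample(pool, num_instance)
        batch.extend(picks)
    return batch

labels = [i // 10 for i in range(200)]   # toy dataset: 20 identities, 10 crops each
print(len(pk_batch(labels)))             # 64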

re_ranking_config

The re_ranking_config parameter defines the setting for the re-ranking module.

re_ranking_config:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| re_ranking | bool | True | Specifies whether the re-ranking module is enabled | True/False |
| k1 | unsigned int | 20 | The k used for k-reciprocal nearest neighbors | >0 |
| k2 | unsigned int | 6 | The k used for local query expansion | >0 |
| lambda_value | float | 0.3 | The weight of the original distance when combining it with the Jaccard distance | >0.0 |
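The lambda_value weighting follows the standard k-reciprocal re-ranking combination, final_dist = lambda_value * original_dist + (1 - lambda_value) * jaccard_dist. A minimal sketch of that final blending step is shown below; computing the Jaccard distance itself (which uses k1 and k2) is omitted:

import numpy as np

def combine_distances(original_dist, jaccard_dist, lambda_value=0.3):
    """Blend the original embedding distance with the Jaccard distance from
    k-reciprocal re-ranking. Lower values mean better matches."""
    return lambda_value * original_dist + (1.0 - lambda_value) * jaccard_dist

# Toy example: 2 queries x 5 gallery images.
original = np.random.rand(2, 5)
jaccard = np.random.rand(2, 5)
final = combine_distances(original, jaccard)
print(final.argsort(axis=1))   # re-ranked gallery order per query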

Use the following command to run ReIdentificationNet training:

tao re_identification train -e <experiment_spec_file>
                            -r <results_dir>
                            -k <key>
                            [gpu_ids=<gpu id list>]
                            [resume_training_checkpoint_path=<absolute path to *.tlt checkpoint>]

Required Arguments

  • -e, --experiment_spec_file: The path to the experiment spec file.

  • -r, --results_dir: The path to a folder where the experiment outputs should be written.

  • -k, --key: The user-specific encoding key to save or load a .tlt model.

Optional Arguments

  • gpu_ids: The GPU indices list for training. If you set more than one GPU ID, multi-GPU training will be triggered automatically.

  • resume_training_checkpoint_path: The path to a checkpoint to continue training.

Here’s an example of using the ReIdentificationNet training command:

tao re_identification train -e $DEFAULT_SPEC -r $RESULTS_DIR -k $YOUR_KEY


The evaluation metrics of ReIdentificationNet are mean average precision (mAP) and ranked accuracy. The plots of sampled matches and the cumulative matching characteristic (CMC) curve can be obtained using the output_sampled_matches_plot and output_cmc_curve_plot parameters, respectively.
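For reference, the sketch below shows how rank-k accuracy (the CMC curve) and mean average precision are typically computed from a query-to-gallery distance matrix. It is a simplified illustration that ignores the same-camera filtering used in the standard Market-1501 protocol:

import numpy as np

def cmc_and_map(dist, query_ids, gallery_ids, max_rank=10):
    """Simplified CMC and mAP from a (num_query, num_gallery) distance matrix."""
    cmc = np.zeros(max_rank)
    aps = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                         # closest gallery first
        matches = (gallery_ids[order] == query_ids[q])      # boolean hit vector
        first_hit = np.flatnonzero(matches)[0]
        if first_hit < max_rank:
            cmc[first_hit:] += 1                            # rank-k hit for all k >= first_hit
        # Average precision over the ranked list.
        hits = np.cumsum(matches)
        precision = hits / (np.arange(len(matches)) + 1)
        aps.append((precision * matches).sum() / matches.sum())
    return cmc / dist.shape[0], float(np.mean(aps))

# Toy example: 3 queries, 6 gallery images.
qids = np.array([0, 1, 2])
gids = np.array([0, 0, 1, 1, 2, 2])
dist = np.random.rand(3, 6)
cmc, mean_ap = cmc_and_map(dist, qids, gids, max_rank=5)
print("rank-1:", cmc[0], "mAP:", round(mean_ap, 3))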

Use the following command to run ReIdentificationNet evaluation:

tao re_identification evaluate -e <experiment_spec_file>
                               -k <key>
                               model=<model to be evaluated>
                               test_dataset=<path to test data>
                               query_dataset=<path to query data>
                               output_sampled_matches_plot=<path to the output sampled matches plot>
                               output_cmc_curve_plot=<path to the output CMC curve plot>
                               [gpu_id=<gpu index>]

Required Arguments

  • -e, --experiment_spec_file: The experiment spec file to set up the evaluation experiment

  • -k, --key: The encoding key for the .tlt model

  • model: The .tlt model

  • test_dataset: The path to the test data

  • query_dataset: The path to the query data

  • output_sampled_matches_plot: The path to the plotted file of sampled matches

  • output_cmc_curve_plot: The path to the plotted file of the CMC curve

Optional Argument

  • gpu_id: The index of the GPU used to run the evaluation. You can specify this value when the machine has multiple GPUs installed. Note that evaluation can only run on a single GPU.

Here’s an example of using the ReIdentificationNet evaluation command:

tao re_identification evaluate -e $DEFAULT_SPEC -k $YOUR_KEY model=$TRAINED_TLT_MODEL test_dataset=$TEST_DATA query_dataset=$QUERY_DATA output_sampled_matches_plot=$OUTPUT_SAMPLED_MATCHED_PLOT output_cmc_curve_plot=$OUTPUT_CMC_CURVE_PLOT


Use the following command to run inference on ReIdentificationNet with the .tlt model.

tao re_identification inference -e <experiment_spec>
                                -k <key>
                                model=<inference model>
                                test_dataset=<path to gallery data>
                                query_dataset=<path to query data>
                                output_file=<path to output file>
                                [gpu_id=<gpu index>]

The output will be a JSON file that contains the feature embeddings of all the test and query data.

Required Arguments

  • -e, --experiment_spec: The experiment spec file to set up inference

  • -k, --key: The encoding key for the .tlt model

  • model: The .tlt model to perform inference with

  • test_dataset: The path to the test data

  • query_dataset: The path to the query data

  • output_file: The path to the output JSON file

Optional Argument

  • gpu_id: The index of the GPU that will be used to run inference. You can specify this value when the machine has multiple GPUs installed. Note that inference can only run on a single GPU.

Here’s an example of using the ReIdentificationNet inference command:

tao re_identification inference -e $DEFAULT_SPEC -k $KEY model=$TRAINED_TLT_MODEL test_dataset=$TEST_DATA query_dataset=$QUERY_DATA output_file=$OUTPUT_FILE

The expected output would be as follows:

[ { "img_path": "/path/to/img1.jpg", "embedding": [-0.30, 0.12, 0.13,...] }, { "img_path": "/path/to/img2.jpg", "embedding": [-0.10, -0.06, -1.85,...] }, ... { "img_path": "/path/to/imgN.jpg", "embedding": [1.41, 0.63, -0.15,...] } ]


Use the following command to export ReIdentificationNet to .etlt format for deployment:

tao re_identification export -k <key>
                             -e <experiment_spec>
                             model=<tlt checkpoint to be exported>
                             [gpu_id=<gpu index>]
                             [output_file=<path to exported file>]

Required Arguments

  • -e, --experiment_spec: The experiment spec file to set up export

  • -k, --key: The encoding key for the .tlt model

  • model: The .tlt model to be exported

Optional Arguments

  • gpu_id: The index of the GPU that will be used to run the export. You can specify this value when the machine has multiple GPUs installed. Note that export can only run on a single GPU.

  • output_file: The path to save the exported model to. The default path is in the same directory as the *.tlt model.

Here’s an example of using the ReIdentificationNet export command:

tao re_identification export -e $DEFAULT_SPEC -k $YOUR_KEY model=$TRAINED_TLT_MODEL


You can deploy the trained deep learning and computer vision models on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs. The exported *.etlt model can also be used with TAO Toolkit Triton Apps.

Running ReIdentificationNet Inference on the Triton Sample

The TAO Toolkit Triton Apps provide an inference sample for ReIdentificationNet. It consumes a TensorRT engine and supports running with a directory of query (probe) images and a directory of test (gallery) images containing the same identities.

To use this sample, you need to generate the TensorRT engine from an *.etlt model using tao-converter.

Generating TensorRT Engine Using tao-converter

The tao-converter tool is provided with the TAO Toolkit to facilitate the deployment of TAO trained models on TensorRT and/or Deepstream. This section elaborates on how to generate a TensorRT engine using tao-converter.

For deployment platforms with an x86-based CPU and discrete GPUs, the tao-converter is distributed within the TAO docker. Therefore, we suggest using the docker to generate the engine. However, this requires that the user adhere to the same minor version of TensorRT as distributed with the docker. The TAO docker includes TensorRT version 8.0.

Instructions for x86

For an x86 platform with discrete GPUs, the default TAO package includes the tao-converter built for TensorRT 8.2.5.1 with CUDA 11.4 and CUDNN 8.2. However, for any other version of CUDA and TensorRT, please refer to the overview section for download. Once the tao-converter is downloaded, follow the instructions below to generate a TensorRT engine.

  1. Unzip the zip file on the target machine.

  2. Install the OpenSSL package using the command:

    sudo apt-get install libssl-dev

  3. Export the following environment variables:

$ export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
$ export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"

  4. Run the tao-converter using the sample command below and generate the engine.

  5. Instructions to build TensorRT OSS on x86 can be found in the TensorRT OSS on x86 section above or in this GitHub repo.

Note

Make sure to follow the output node names as mentioned in the Exporting the Model section of the respective model.


Instructions for Jetson

For the Jetson platform, the tao-converter is available to download in the NVIDIA developer zone. You may choose the version you wish to download as listed in the overview section. Once the tao-converter is downloaded, please follow the instructions below to generate a TensorRT engine.

  1. Unzip the zip file on the target machine.

  2. Install the OpenSSL package using the command:

    sudo apt-get install libssl-dev

  3. Export the following environment variables:

$ export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
$ export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"

  4. For Jetson devices, TensorRT comes pre-installed with JetPack. If you are using an older JetPack version, upgrade to JetPack-5.0DP.

  5. Instructions to build TensorRT OSS on Jetson can be found in the TensorRT OSS on Jetson (ARM64) section above or in this GitHub repo.

  6. Run the tao-converter using the sample command below and generate the engine.

Note

Make sure to follow the output node names as mentioned in the Exporting the Model section of the respective model.


Using the tao-converter

Here is a sample command to generate the ReIdentificationNet engine through tao-converter:

# Convert the ResNet50 model (input image width 128, height 256):
tao-converter <etlt_model> \
              -k <key_to_etlt_model> \
              -d 3,256,128 \
              -p input,1x3x256x128,4x3x256x128,16x3x256x128 \
              -o fc_pred \
              -t fp16 \
              -m 16 \
              -e <path_to_generated_trt_engine>

This command will generate an optimized TensorRT engine.

Running the Triton Inference Sample

You can generate the TensorRT engine when starting the Triton server using the following command:

bash scripts/start_server.sh

When the server is running, you can get results from a directory of query images and a directory of test images using the following command with a client:

python tao_client.py <path_to_query_directory> \
    --test_dir <path_to_test_directory> \
    -m re_identification_tao \
    -x 1 \
    -b 16 \
    --mode Re_identification \
    -i https \
    -u localhost:8000 \
    --async \
    --output_path <path_to_output_directory>

Note

The server will perform inference on the input image directories. The results are saved as a JSON file. The following is a sample of the JSON output:

[
    ...,
    {
        "img_path": "/localhome/Data/market1501/query/1121_c3s2_156744_00.jpg",
        "embedding": [-1.1530249118804932, -1.8521332740783691, ..., 0.380886435508728]
    },
    ...
    {
        "img_path": "/localhome/Data/market1501/bounding_box_test/1377_c2s3_038007_05.jpg",
        "embedding": [0.09496910870075226, 0.26107653975486755, ..., 0.2835155725479126]
    },
    ...
]


End-to-End Inference Using Triton

The TAO Toolkit Triton Apps provide a sample for end-to-end inference from a directory of query images and a directory of test images. The sample downloads the Market-1501 dataset and randomly samples a subset of 100 identities. The client implicitly converts the image samples into arrays and sends them to the Triton server. The feature embedding for each image is returned and saved to the JSON output. An image of sampled matches and a figure of the CMC curve are also generated for visualization.

You can start the Triton server using the following command (only the ReIdentificationNet model will be downloaded and converted into a TensorRT engine):

bash scripts/re_id_e2e_inference/start_server.sh

Once the Triton server has started, open another terminal and use the following command to run re-identification on the query and test images using the Triton server instance that you have previously spun up:

bash scripts/re_id_e2e_inference/start_client.sh
