Date: 2024-08-26

The spec file for ReIdentificationNet Transformer includes model, dataset, re_ranking, and train parameters. The following is an example spec for training a Swin Tiny model on Market-1501 with 751 identities in the training set.

```yaml
results_dir: "/path/to/experiment_results"
encryption_key: nvidia_tao
model:
  backbone: swin_tiny_patch4_window7_224
  last_stride: 1
  pretrain_choice: self
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_width: 128
  input_height: 384
  neck: bnneck
  stride_size: [16, 16]
  feat_dim: 1024
  no_margin: True
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: False
  pretrain_hw_ratio: 2
dataset:
  train_dataset_dir: "/path/to/train_dataset_dir"
  test_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  num_classes: 751
  batch_size: 64
  val_batch_size: 128
  num_workers: 8
  pixel_mean: [0.5, 0.5, 0.5]
  pixel_std: [0.5, 0.5, 0.5]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instances: 4
re_ranking:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3
train:
  results_dir: "${results_dir}/train"
  optim:
    name: SGD
    lr_steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 2
    weight_decay: 0.0001
    weight_decay_bias: 0.0001
    warmup_factor: 0.01
    warmup_epochs: 20
    warmup_method: cosine
    base_lr: 0.0008
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
    large_fc_lr: False
  num_epochs: 120
  checkpoint_interval: 10
```

| Parameter | Data Type | Default | Description |
|---|---|---|---|
| model | dict config | – | The configuration for the model architecture |
| train | dict config | – | The configuration for the training process |
| dataset | dict config | – | The configuration for the dataset |
| re_ranking | dict config | – | The configuration for the re-ranking module |
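In the example spec, train.results_dir is written as "${results_dir}/train", which refers back to the top-level results_dir key. The sketch below only illustrates that substitution with Python's standard string.Template. It is not how the toolkit itself loads the spec, and resolve_reference is a hypothetical helper name.

```python
from string import Template


def resolve_reference(value: str, top_level: dict) -> str:
    """Substitute ${key} placeholders with top-level spec values (illustration only)."""
    return Template(value).substitute(top_level)


# Assumption: the placeholder resolves to the top-level key of the same name.
spec_top_level = {"results_dir": "/path/to/experiment_results"}
train_results_dir = resolve_reference("${results_dir}/train", spec_top_level)
print(train_results_dir)  # /path/to/experiment_results/train
```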

The model parameter provides options to change the ReIdentificationNet Transformer architecture.

```yaml
model:
  backbone: swin_tiny_patch4_window7_224
  last_stride: 1
  pretrain_choice: self
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_width: 128
  input_height: 384
  neck: bnneck
  stride_size: [16, 16]
  feat_dim: 1024
  no_margin: True
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: False
  pretrain_hw_ratio: 2
```

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| backbone | string | swin_tiny_patch4_window7_224 | The type of model, which can be Swin-based architectures or resnet_50 (please refer to ReIdentificationNet) | resnet_50/swin_base_patch4_window7_224/swin_small_patch4_window7_224/swin_tiny_patch4_window7_224 |
| last_stride | unsigned int | 1 | The number of strides during convolution | >0 |
| pretrain_choice | string | self | Specifies the pre-trained network | self/imagenet/"" |
| pretrained_model_path | string | | The path to the pre-trained model | |
| input_channels | unsigned int | 3 | The number of input channels | >0 |
| input_width | int | 128 | The width of the input images | >0 |
| input_height | int | 384 | The height of the input images | >0 |
| neck | string | bnneck | Specifies whether to train with BNNeck | bnneck/"" |
| feat_dim | unsigned int | 1024 | The output size of the feature embeddings | >0 |
| no_margin | bool | True | A flag specifying whether to train with soft triplet loss | True/False |
| neck_feat | string | after | Specifies which feature of BNNeck to use for testing | before/after |
| metric_loss_type | string | triplet | The type of metric loss | triplet/center/triplet_center |
| with_center_loss | bool | False | A flag specifying whether to enable center loss | True/False |
| with_flip_feature | bool | False | A flag specifying whether to enable image flipping | True/False |
| label_smooth | bool | False | A flag specifying whether to enable label smoothing | True/False |
| pretrain_hw_ratio | float | 2 | The height-width ratio of the pre-trained model | >0 |
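The Supported Values column can be turned into a quick pre-flight check before launching a run. The following is a minimal sketch covering only some of the fields. check_model_spec is a hypothetical helper, not part of the toolkit, and it works on a plain Python dict rather than a loaded spec file.

```python
# Allowed values copied from the Supported Values column of the model table.
SUPPORTED_CHOICES = {
    "backbone": {
        "resnet_50",
        "swin_base_patch4_window7_224",
        "swin_small_patch4_window7_224",
        "swin_tiny_patch4_window7_224",
    },
    "pretrain_choice": {"self", "imagenet", ""},
    "neck": {"bnneck", ""},
    "neck_feat": {"before", "after"},
    "metric_loss_type": {"triplet", "center", "triplet_center"},
}
# Fields whose supported values are listed as ">0".
POSITIVE_FIELDS = ("last_stride", "input_channels", "input_width",
                   "input_height", "feat_dim", "pretrain_hw_ratio")


def check_model_spec(model: dict) -> list:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for key, allowed in SUPPORTED_CHOICES.items():
        if key in model and model[key] not in allowed:
            problems.append(f"{key}: unsupported value {model[key]!r}")
    for key in POSITIVE_FIELDS:
        if key in model and not model[key] > 0:
            problems.append(f"{key}: must be > 0, got {model[key]!r}")
    return problems


example_model = {"backbone": "swin_tiny_patch4_window7_224", "neck": "bnneck",
                 "input_width": 128, "input_height": 384, "feat_dim": 1024}
print(check_model_spec(example_model))  # []
print(check_model_spec({"backbone": "resnet_18", "feat_dim": 0}))
```

Missing keys are skipped on purpose, since the sketch makes no claim about which fields are required.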

The dataset parameter defines the dataset source, training batch size, and augmentation.

```yaml
dataset:
  train_dataset_dir: "/path/to/train_dataset_dir"
  test_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  num_classes: 751
  batch_size: 64
  val_batch_size: 128
  num_workers: 8
  pixel_mean: [0.5, 0.5, 0.5]
  pixel_std: [0.5, 0.5, 0.5]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instances: 4
```

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| train_dataset_dir | string | | The path to the train images | |
| test_dataset_dir | string | | The path to the test images | |
| query_dataset_dir | string | | The path to the query images | |
| num_classes | unsigned int | 751 | The number of unique person IDs | >0 |
| batch_size | unsigned int | 64 | The batch size for training | >0 |
| val_batch_size | unsigned int | 128 | The batch size for validation | >0 |
| num_workers | unsigned int | 8 | The number of parallel workers processing data | >0 |
| pixel_mean | float list | [0.5, 0.5, 0.5] | The pixel mean for image normalization | float list |
| pixel_std | float list | [0.5, 0.5, 0.5] | The pixel standard deviation for image normalization | float list |
| padding | unsigned int | 10 | The pixel padding size around images for image augmentation | >=1 |
| prob | float | 0.5 | The random horizontal flipping probability for image augmentation | >0 |
| re_prob | float | 0.5 | The random erasing probability for image augmentation | >0 |
| sampler | string | softmax_triplet | The type of sampler for data loading | softmax/triplet/softmax_triplet |
| num_instances | unsigned int | 4 | The number of image instances of the same person in a batch | >0 |
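Two of these fields lend themselves to quick arithmetic checks. The sketch below rests on two assumptions that this page does not confirm: pixel values are scaled to [0, 1] before pixel_mean and pixel_std are applied, and the sampler groups exactly num_instances images per person, so a batch holds batch_size / num_instances identities. Both function names are hypothetical.

```python
def normalize_pixel(value_0_to_1: float, mean: float, std: float) -> float:
    """Per-channel normalization: subtract the mean, then divide by the std."""
    return (value_0_to_1 - mean) / std


def identities_per_batch(batch_size: int, num_instances: int) -> int:
    """Distinct person IDs per batch if each ID contributes num_instances images."""
    # Assumption: the batch is filled with whole groups of num_instances images.
    if batch_size % num_instances != 0:
        raise ValueError("batch_size should be a multiple of num_instances")
    return batch_size // num_instances


# With pixel_mean = pixel_std = 0.5, inputs in [0, 1] map to [-1, 1].
print(normalize_pixel(0.0, 0.5, 0.5))  # -1.0
print(normalize_pixel(1.0, 0.5, 0.5))  # 1.0
# batch_size: 64 with num_instances: 4
print(identities_per_batch(64, 4))  # 16
```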

The re_ranking parameter defines the settings for the re-ranking module.

```yaml
re_ranking:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3
```

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| re_ranking | bool | True | A flag that enables the re-ranking module | True/False |
| k1 | unsigned int | 20 | The k used for k-reciprocal nearest neighbors | >0 |
| k2 | unsigned int | 6 | The k used for local query expansion | >0 |
| lambda_value | float | 0.3 | The weight of the original distance in combination with the Jaccard distance | >0.0 |
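The lambda_value description matches the weighting commonly used in k-reciprocal re-ranking, where the final distance is (1 - lambda) times the Jaccard distance plus lambda times the original distance. The sketch below assumes that formula. It leaves out the k1 and k2 neighbor computations that produce the Jaccard distance, and the input distances are made up for illustration.

```python
def combine_distances(original: float, jaccard: float, lambda_value: float) -> float:
    """Blend the two distances; lambda_value is the weight of the original distance."""
    # Assumed formula: (1 - lambda) * d_jaccard + lambda * d_original.
    return (1.0 - lambda_value) * jaccard + lambda_value * original


# Made-up distances for one query-gallery pair, with lambda_value: 0.3 from the spec.
final = combine_distances(original=0.8, jaccard=0.4, lambda_value=0.3)
print(round(final, 4))  # 0.52
```

Under this formula, a larger lambda_value keeps the ranking closer to the original feature distance, while a smaller one leans more on neighborhood overlap.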

The train parameter defines the hyperparameters of the training process.

```yaml
train:
  optim:
    name: SGD
    lr_steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 2
    weight_decay: 0.0001
    weight_decay_bias: 0.0001
    warmup_factor: 0.01
    warmup_epochs: 20
    warmup_method: cosine
    base_lr: 0.0008
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
    large_fc_lr: False
  num_epochs: 120
  checkpoint_interval: 10
```

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| optim | dict config | | The configuration for the SGD optimizer, including the learning rate, learning scheduler, weight decay, etc. | |
| num_epochs | unsigned int | 120 | The total number of epochs to run the experiment | >0 |
| checkpoint_interval | unsigned int | 10 | The interval at which the checkpoints are saved | >0 |
| clip_grad_norm | float | 0.0 | The amount to clip the gradient by the L2 norm. A value of 0.0 specifies no clipping. | >=0 |
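The clip_grad_norm description corresponds to the usual gradient-norm clipping: if the L2 norm of the gradients exceeds the threshold, every gradient is scaled down by the same factor. The sketch below shows the idea on a flat list of floats and treats 0.0 as "disabled", as the table states. clip_by_l2_norm is a hypothetical helper, and real training code would operate on tensors instead.

```python
import math


def clip_by_l2_norm(grads: list, clip_grad_norm: float) -> list:
    """Rescale grads so their L2 norm does not exceed clip_grad_norm (0.0 disables)."""
    if clip_grad_norm <= 0.0:
        return list(grads)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= clip_grad_norm:
        return list(grads)
    # Uniform scaling keeps the gradient direction and shrinks only its length.
    scale = clip_grad_norm / norm
    return [g * scale for g in grads]


print(clip_by_l2_norm([3.0, 4.0], 0.0))  # [3.0, 4.0]
print(clip_by_l2_norm([3.0, 4.0], 1.0))
```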

optim

The optim parameter defines the config for the SGD optimizer in training, including the learning rate, learning scheduler, and weight decay.

```yaml
optim:
  name: SGD
  lr_steps: [40, 70]
  gamma: 0.1
  bias_lr_factor: 2
  weight_decay: 0.0001
  weight_decay_bias: 0.0001
  warmup_factor: 0.01
  warmup_epochs: 20
  warmup_method: cosine
  base_lr: 0.0008
  momentum: 0.9
  center_loss_weight: 0.0005
  center_lr: 0.5
  triplet_loss_margin: 0.3
  large_fc_lr: False
```
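To get a feel for how base_lr, warmup_factor, warmup_epochs, lr_steps, and gamma might interact, the sketch below computes a per-epoch learning rate under an assumed schedule: a linear warmup followed by a multiplication by gamma at each epoch listed in lr_steps. This is an illustration only. The example spec sets warmup_method: cosine, which this sketch does not reproduce, and the toolkit's actual scheduler may differ. sketch_lr is a hypothetical name.

```python
from bisect import bisect_right


def sketch_lr(epoch, base_lr=0.0008, lr_steps=(40, 70), gamma=0.1,
              warmup_factor=0.01, warmup_epochs=20):
    """Assumed schedule: linear warmup, then multiply by gamma at each lr_steps epoch."""
    scale = 1.0
    if epoch < warmup_epochs:
        alpha = epoch / warmup_epochs
        # Ramp from warmup_factor (epoch 0) up to 1.0 (end of warmup).
        scale = warmup_factor * (1.0 - alpha) + alpha
    # Count how many step boundaries have been reached by this epoch.
    decay = gamma ** bisect_right(list(lr_steps), epoch)
    return base_lr * scale * decay


for epoch in (0, 10, 20, 40, 70):
    print(epoch, f"{sketch_lr(epoch):.2e}")
```

Under these assumptions the rate starts at base_lr * warmup_factor, reaches base_lr at epoch 20, and drops by a factor of 10 at epochs 40 and 70.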