The spec file for LPRNet includes the random_seed, lpr_config, training_config, eval_config, augmentation_config, and dataset_config parameters. Here is an example for training on the NVIDIA license plate dataset:

    random_seed: 42
    lpr_config {
      hidden_units: 512
      max_label_length: 8
      arch: "baseline"
      nlayers: 10
    }
    training_config {
      batch_size_per_gpu: 32
      num_epochs: 100
      learning_rate {
        soft_start_annealing_schedule {
          min_learning_rate: 1e-6
          max_learning_rate: 1e-4
          soft_start: 0.001
          annealing: 0.7
        }
      }
      regularizer {
        type: L2
        weight: 5e-4
      }
    }
    eval_config {
      validation_period_during_training: 5
      batch_size: 1
    }
    augmentation_config {
      output_width: 96
      output_height: 48
      output_channel: 3
      max_rotate_degree: 5
      rotate_prob: 0.5
      gaussian_kernel_size: 5
      gaussian_kernel_size: 7
      gaussian_kernel_size: 15
      blur_prob: 0.5
      reverse_color_prob: 0.5
      keep_original_prob: 0.3
    }
    dataset_config {
      data_sources: {
        label_directory_path: "/path/to/train/labels"
        image_directory_path: "/path/to/train/images"
      }
      characters_list_file: "/path/to/lp_characters"
      validation_data_sources: {
        label_directory_path: "/path/to/test/labels"
        image_directory_path: "/path/to/test/images"
      }
    }

| Parameter | Data Type | Default | Description |
|---|---|---|---|
| random_seed | Unsigned int | 42 | The random seed for the experiment |
| lpr_config | proto message | – | The configuration of the model architecture |
| training_config | proto message | – | The configuration of the training process |
| eval_config | proto message | – | The configuration of the evaluation process |
| augmentation_config | proto message | – | The configuration for data augmentation |
| dataset_config | proto message | – | The configuration for the dataset |

The lpr_config parameter provides options to change the LPRNet architecture.

    lpr_config {
      hidden_units: 512
      max_label_length: 8
      arch: "baseline"
      nlayers: 10
    }

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| hidden_units | Unsigned int | 512 | The number of hidden units in the LSTM layers of LPRNet | |
| max_label_length | Unsigned int | 8 | The maximum length of license plates in the dataset | |
| arch | String | baseline | The architecture of LPRNet | baseline |
| nlayers | Unsigned int | 10 | The number of convolution layers in LPRNet | 10, 18 |
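The supported values in the table can be checked before launching a training run. The following is a minimal sketch, not part of the toolkit: the function name `check_lpr_config` and the plain-dict representation of the lpr_config block are assumptions made for illustration.

```python
# Supported values copied from the lpr_config table.
# The dict layout and the function name are hypothetical.
SUPPORTED_ARCH = {"baseline"}
SUPPORTED_NLAYERS = {10, 18}


def check_lpr_config(cfg):
    """Return a list of problems found in an lpr_config-like dict."""
    problems = []
    # Missing keys fall back to the documented defaults.
    if cfg.get("arch", "baseline") not in SUPPORTED_ARCH:
        problems.append("arch must be 'baseline'")
    if cfg.get("nlayers", 10) not in SUPPORTED_NLAYERS:
        problems.append("nlayers must be 10 or 18")
    for key in ("hidden_units", "max_label_length"):
        value = cfg.get(key)
        if value is not None and (not isinstance(value, int) or value < 0):
            problems.append(f"{key} must be an unsigned int")
    return problems


if __name__ == "__main__":
    print(check_lpr_config({"arch": "baseline", "nlayers": 12}))
```

An empty list means the block only uses values that the table lists as supported.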

The training_config parameter defines the hyperparameters of the training process.

    training_config {
      checkpoint_interval: 5
      batch_size_per_gpu: 32
      num_epochs: 100
      learning_rate {
        soft_start_annealing_schedule {
          min_learning_rate: 1e-6
          max_learning_rate: 1e-4
          soft_start: 0.001
          annealing: 0.7
        }
      }
      regularizer {
        type: L2
        weight: 5e-4
      }
    }

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| batch_size_per_gpu | int | 32 | The number of images per batch per GPU | >1 |
| num_epochs | int | 120 | The total number of epochs to run in the experiment | |
| checkpoint_interval | int | 5 | The interval at which the checkpoints are saved | >0 |
| learning_rate | learning rate scheduler proto | soft_start_annealing_schedule | The learning rate schedule for the trainer. Currently, LPRNet only supports the soft-start annealing learning rate schedule. It may be configured using the following parameters: soft_start (float): the time to ramp up the learning rate from the minimum learning rate to the maximum learning rate; annealing (float): the time to cool down the learning rate from the maximum learning rate to the minimum learning rate; min_learning_rate (float): the minimum learning rate in the learning rate schedule; max_learning_rate (float): the maximum learning rate in the learning rate schedule. A sample lr plot for a soft start of 0.3 and annealing of 0.1 is shown in the figure below. | annealing: 0.0 - 1.0 and greater than soft_start; soft_start: 0.0 - 1.0 |
| regularizer | regularizer proto config | | This parameter configures the type and the weight of the regularizer to be used during training. This config contains two parameters: type: the type of regularizer being used; weight: the floating point weight of the regularizer | The supported values for type are: NO_REG, L1, L2 |
| visualizer | Message type | | Training visualization config | |
| early_stopping | Message type | | Early stopping config | |
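To make the shape of the soft-start annealing schedule concrete, here is a minimal sketch. It rests on assumptions the text does not confirm: soft_start and annealing are treated as fractions of total training progress, annealing marks the point where the cool-down begins, and both ramps are interpolated log-linearly. The function name `soft_start_annealing_lr` is hypothetical, and the actual trainer may use a different curve.

```python
import math


def soft_start_annealing_lr(progress, min_lr=1e-6, max_lr=1e-4,
                            soft_start=0.001, annealing=0.7):
    """Learning rate at a training progress in [0, 1].

    Assumption: ramps are log-linear; the real trainer may differ.
    """
    if not 0.0 <= soft_start < annealing <= 1.0:
        raise ValueError("need 0 <= soft_start < annealing <= 1")
    log_min, log_max = math.log(min_lr), math.log(max_lr)
    if progress < soft_start:
        # Ramp up from min_lr to max_lr.
        frac = progress / soft_start
        return math.exp(log_min + frac * (log_max - log_min))
    if progress < annealing or annealing >= 1.0:
        # Hold at max_lr between the two phases.
        return max_lr
    # Cool down from max_lr back to min_lr.
    frac = (progress - annealing) / (1.0 - annealing)
    return math.exp(log_max + frac * (log_min - log_max))


if __name__ == "__main__":
    for p in (0.0, 0.0005, 0.5, 0.85, 1.0):
        print(f"progress={p:.4f} lr={soft_start_annealing_lr(p):.3e}")
```

With the sample values from the spec (soft_start of 0.001 and annealing of 0.7), this sketch reaches the maximum learning rate almost immediately and starts decaying after 70% of training.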

visualizer

Visualization during training is configured using the visualizer parameter. The parameters are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| enabled | boolean | false | A Boolean flag to enable or disable this feature | |
| num_images | int | 3 | The maximum number of images to be visualized in TensorBoard | >0 |

If visualization is enabled, a TensorBoard log is produced during training. It includes graphs for the learning rate, training loss, and validation accuracy, as well as the augmented images.

early_stopping

The parameters for early stopping are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| monitor | string | | The metric to monitor in order to enable early stopping | loss |
| patience | int | 0 | The number of checks for the monitor value before stopping the training | |
| min_delta | float | 0.0 | The delta of the minimum value of monitor, below which it is regarded as not decreasing | |
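The interaction between patience and min_delta can be sketched as follows. This is one plausible reading of the table, not the toolkit's implementation: the class `EarlyStopper` is hypothetical, a check counts as an improvement only when the loss drops by more than min_delta, and training stops once patience consecutive checks show no improvement.

```python
class EarlyStopper:
    """Hypothetical helper mirroring the patience and min_delta parameters."""

    def __init__(self, patience=0, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")   # lowest monitored loss seen so far
        self.stale_checks = 0      # consecutive checks without improvement

    def should_stop(self, loss):
        # Assumption: the loss must drop by more than min_delta to count.
        if self.best - loss > self.min_delta:
            self.best = loss
            self.stale_checks = 0
        else:
            self.stale_checks += 1
        return self.stale_checks > 0 and self.stale_checks >= self.patience


if __name__ == "__main__":
    stopper = EarlyStopper(patience=2, min_delta=0.01)
    for loss in (1.0, 0.8, 0.795, 0.797):
        print(loss, stopper.should_stop(loss))
```

In the usage above, the third and fourth losses improve on 0.8 by less than min_delta, so the second stale check triggers the stop.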

The eval_config parameter defines the hyperparameters of the evaluation process. The evaluation metric is the license plate recognition accuracy. A recognition is regarded as correct only if all the characters in a license plate are classified correctly.

    eval_config {
      validation_period_during_training: 5
      batch_size: 1
    }

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| validation_period_during_training | int | 5 | The interval at which evaluation is run during training. The evaluation is run at this interval starting from the value of the first validation epoch parameter as specified below. | 1 - total number of epochs |
| batch_size | int | 1 | The number of samples to do a single inference on | >0 |
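The recognition accuracy described for eval_config is an exact-match metric: a plate counts as correct only when every character matches. A minimal sketch, where the function name `plate_accuracy` and the list-of-strings inputs are assumptions for illustration:

```python
def plate_accuracy(predictions, ground_truths):
    """Fraction of plates whose predicted string matches the label exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths differ in length")
    if not ground_truths:
        return 0.0
    # One wrong character makes the whole plate count as incorrect.
    correct = sum(p == g for p, g in zip(predictions, ground_truths))
    return correct / len(ground_truths)


if __name__ == "__main__":
    print(plate_accuracy(["ABC123", "XYZ789"], ["ABC123", "XYZ780"]))
```

In the usage above, the second plate differs from its label in a single character, so only one of the two plates counts as correct.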

The augmentation_config parameter contains the hyperparameters for augmentation during training. It also defines the spatial size of the network input.

    augmentation_config {
      output_width: 96
      output_height: 48
      output_channel: 3
      max_rotate_degree: 5
      rotate_prob: 0.5
      gaussian_kernel_size: 5
      gaussian_kernel_size: 7
      gaussian_kernel_size: 15
      blur_prob: 0.5
      reverse_color_prob: 0.5
      keep_original_prob: 0.3
    }

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| output_width | unsigned int | 96 | The width of preprocessed images, which is also the width of the network input | >0 |
| output_height | unsigned int | 48 | The height of preprocessed images, which is also the height of the network input | >0 |
| output_channel | unsigned int | 3 | The number of channels of preprocessed images | 1, 3 |
| keep_original_prob | float | 0.3 | The probability of keeping the original image. With this probability, only resizing is applied to an image. | 0.0 ~ 1.0 |
| max_rotate_degree | unsigned int | 5 | The maximum rotation angle for augmentation | 0.0 ~ 90.0 |
| rotate_prob | float | 0.5 | The probability of rotating the image | 0.0 ~ 1.0 |
| gaussian_kernel_size | unsigned int | 5, 7, 15 | The kernel size of the Gaussian blur | >0 |
| blur_prob | float | 0.5 | The probability of blurring the image | 0.0 ~ 1.0 |
| reverse_color_prob | float | 0.5 | The probability of reversing the color of the image | 0.0 ~ 1.0 |
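How the augmentation probabilities might combine for a single image is sketched below. Several details are assumptions that the table does not state: the order of the operations, the symmetric rotation range, and the uniform choice among the listed kernel sizes. The function `plan_augmentation` is hypothetical and only returns the list of operations it would apply, without touching any image data.

```python
import random


def plan_augmentation(rng, output_size=(96, 48), keep_original_prob=0.3,
                      rotate_prob=0.5, max_rotate_degree=5, blur_prob=0.5,
                      gaussian_kernel_sizes=(5, 7, 15),
                      reverse_color_prob=0.5):
    """Return the (name, argument) operations planned for one image."""
    # Resizing to the network input size always happens.
    ops = [("resize", output_size)]
    if rng.random() < keep_original_prob:
        return ops  # keep the original image: resize only
    if rng.random() < rotate_prob:
        # Assumption: the angle is drawn symmetrically around zero.
        angle = rng.uniform(-max_rotate_degree, max_rotate_degree)
        ops.append(("rotate", angle))
    if rng.random() < blur_prob:
        # Assumption: one of the listed kernel sizes is picked uniformly.
        ops.append(("gaussian_blur", rng.choice(gaussian_kernel_sizes)))
    if rng.random() < reverse_color_prob:
        ops.append(("reverse_color", None))
    return ops


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        print(plan_augmentation(rng))
```

Passing a seeded `random.Random` instance keeps the plan reproducible, in the same spirit as the random_seed parameter of the spec file.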

The dataset_config parameter defines the paths to the training dataset, the validation dataset, and the characters list file.

    dataset_config {
      data_sources: {
        label_directory_path: "/path/to/train/labels"
        image_directory_path: "/path/to/train/images"
      }
      characters_list_file: "/path/to/lp_characters"
      validation_data_sources: {
        label_directory_path: "/path/to/test/labels"
        image_directory_path: "/path/to/test/images"
      }
    }

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| data_sources | dataset proto | | The path to the training dataset images and labels: label_directory_path: the path to the label directory; image_directory_path: the path to the image directory | |
| validation_data_sources | dataset proto | | The path to the validation dataset images and labels | |
| characters_list_file | string | | The path to the characters list file | The characters in this file should be in unicode format. |

Note: data_sources and validation_data_sources are both repeated fields, so multiple datasets can be added to each of them.
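Because both source fields are repeated, a dataset_config block with several datasets simply contains one data_sources entry per dataset. The sketch below generates such a block as text. The function `render_dataset_config` and all paths in the usage example are hypothetical placeholders, not part of the toolkit.

```python
def render_dataset_config(train_sources, val_sources, characters_list_file):
    """Render a dataset_config block from (label_dir, image_dir) pairs."""

    def block(name, label_dir, image_dir):
        # One repeated-field entry per dataset.
        return (
            f"  {name}: {{\n"
            f'    label_directory_path: "{label_dir}"\n'
            f'    image_directory_path: "{image_dir}"\n'
            f"  }}\n"
        )

    body = "".join(block("data_sources", l, i) for l, i in train_sources)
    body += f'  characters_list_file: "{characters_list_file}"\n'
    body += "".join(
        block("validation_data_sources", l, i) for l, i in val_sources
    )
    return "dataset_config {\n" + body + "}\n"


if __name__ == "__main__":
    # Placeholder paths for two training sets and one validation set.
    print(render_dataset_config(
        [("/path/to/train_a/labels", "/path/to/train_a/images"),
         ("/path/to/train_b/labels", "/path/to/train_b/images")],
        [("/path/to/test/labels", "/path/to/test/images")],
        "/path/to/lp_characters",
    ))
```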



