# LPRNet

LPRNet (License Plate Recognition Net) takes images as network input and predicts sequences of license plate characters.

## Preparing the Dataset

The dataset for LPRNet contains cropped license plate images and corresponding label files.

The data structure must be in the following format:


    /Dataset_01
        /images
            0000.jpg
            0001.jpg
            0002.jpg
            ...
            ...
            ...
            N.jpg
        /labels
            0000.txt
            0001.txt
            0002.txt
            ...
            ...
            ...
            N.txt
        /characters_list.txt


Each cropped license plate image has a corresponding label text file that contains a single line with the characters of that license plate. The characters_list.txt file lists every character found in the license plate dataset, one character per line.
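
The layout rules above can be checked before training. The following is a minimal sketch, not part of TAO Toolkit: the function name `check_dataset` is hypothetical, and it assumes UTF-8 text files and that label files share the image file's base name, as in the tree above.

```python
from pathlib import Path


def check_dataset(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    # One character per line; blank lines are ignored (assumption).
    lines = (root / "characters_list.txt").read_text(encoding="utf-8").splitlines()
    charset = {line.strip() for line in lines if line.strip()}

    problems = []
    for image in sorted((root / "images").iterdir()):
        label = root / "labels" / (image.stem + ".txt")
        if not label.is_file():
            problems.append(f"{image.name}: missing label file")
            continue
        # The label file holds one line with the plate's characters.
        text = label.read_text(encoding="utf-8").strip()
        unknown = sorted(set(text) - charset)
        if unknown:
            problems.append(f"{label.name}: characters not in list: {unknown}")
    return problems
```

An empty return value means every image has a label and every label character appears in the characters list.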

## Creating an Experiment Spec File

The spec file for LPRNet includes the random_seed, lpr_config, training_config, eval_config, augmentation_config, and dataset_config parameters. Here is an example for training on the NVIDIA license plate dataset:


    random_seed: 42
    lpr_config {
      hidden_units: 512
      max_label_length: 8
      arch: "baseline"
      nlayers: 10
    }
    training_config {
      batch_size_per_gpu: 32
      num_epochs: 100
      learning_rate {
        soft_start_annealing_schedule {
          min_learning_rate: 1e-6
          max_learning_rate: 1e-4
          soft_start: 0.001
          annealing: 0.7
        }
      }
      regularizer {
        type: L2
        weight: 5e-4
      }
    }
    eval_config {
      validation_period_during_training: 5
      batch_size: 1
    }
    augmentation_config {
      output_width: 96
      output_height: 48
      output_channel: 3
      max_rotate_degree: 5
      rotate_prob: 0.5
      gaussian_kernel_size: 5
      gaussian_kernel_size: 7
      gaussian_kernel_size: 15
      blur_prob: 0.5
      reverse_color_prob: 0.5
      keep_original_prob: 0.3
    }
    dataset_config {
      data_sources: {
        label_directory_path: "/path/to/train/labels"
        image_directory_path: "/path/to/train/images"
      }
      characters_list_file: "/path/to/lp_characters"
      validation_data_sources: {
        label_directory_path: "/path/to/test/labels"
        image_directory_path: "/path/to/test/images"
      }
    }


| Parameter | Data Type | Default | Description |
|---|---|---|---|
| random_seed | Unsigned int | 42 | The random seed for the experiment |
| lpr_config | proto message | – | The configuration of the model architecture |
| training_config | proto message | – | The configuration of the training process |
| eval_config | proto message | – | The configuration of the evaluation process |
| augmentation_config | proto message | – | The configuration for data augmentation |
| dataset_config | proto message | – | The configuration for the dataset |

### lpr_config

The lpr_config parameter provides options to change the LPRNet architecture.


    lpr_config {
      hidden_units: 512
      max_label_length: 8
      arch: "baseline"
      nlayers: 10
    }


| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| hidden_units | Unsigned int | 512 | The number of hidden units in the LSTM layers of LPRNet | |
| max_label_length | Unsigned int | 8 | The maximum length of license plates in the dataset | |
| arch | String | baseline | The architecture of LPRNet | baseline |
| nlayers | Unsigned int | 10 | The number of convolution layers in LPRNet | 10, 18 |

### training_config

The training_config parameter defines the hyperparameters of the training process.


    training_config {
      checkpoint_interval: 5
      batch_size_per_gpu: 32
      num_epochs: 100
      learning_rate {
        soft_start_annealing_schedule {
          min_learning_rate: 1e-6
          max_learning_rate: 1e-4
          soft_start: 0.001
          annealing: 0.7
        }
      }
      regularizer {
        type: L2
        weight: 5e-4
      }
    }


| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| batch_size_per_gpu | int | 32 | The number of images per batch per GPU | >1 |
| num_epochs | int | 120 | The total number of epochs to run in the experiment | |
| checkpoint_interval | int | 5 | The interval at which the checkpoints are saved | >0 |
| learning_rate | learning rate scheduler proto | soft_start_annealing_schedule | The learning rate schedule for the trainer. Currently, LPRNet only supports the soft-start annealing learning rate schedule. | annealing: 0.0 - 1.0 and greater than soft_start; soft_start: 0.0 - 1.0 |
| regularizer | regularizer proto config | | The type and the weight of the regularizer to be used during training | type: NO_REG, L1, L2 |
| visualizer | Message type | | Training visualization config | |
| early_stopping | Message type | | Early stopping config | |

The soft_start_annealing_schedule may be configured using the following parameters:

• soft_start (float): The time to ramp up the learning rate from the minimum learning rate to the maximum learning rate

• annealing (float): The time to cool down the learning rate from the maximum learning rate to the minimum learning rate

• min_learning_rate (float): The minimum learning rate in the learning rate schedule

• max_learning_rate (float): The maximum learning rate in the learning rate schedule

[Figure: sample learning rate plot for a soft start of 0.3 and annealing of 0.1]

The regularizer config contains two parameters:

• type: The type of regularizer being used. The supported values are NO_REG, L1, and L2.

• weight: The floating point weight of the regularizer
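
To make the shape of a soft-start annealing schedule concrete, here is a minimal sketch. It rests on assumptions not stated in this document: that soft_start and annealing are fractions of total training progress, that the rate is held at the maximum between them, and that the ramps are log-linear. The function name `soft_start_annealing_lr` is hypothetical, and the actual TAO Toolkit implementation may differ.

```python
import math


def soft_start_annealing_lr(progress, min_lr, max_lr, soft_start, annealing):
    """Learning rate at `progress` in [0, 1] (fraction of training completed)."""
    if not 0.0 <= soft_start < annealing <= 1.0:
        raise ValueError("expected 0 <= soft_start < annealing <= 1")
    log_min, log_max = math.log(min_lr), math.log(max_lr)

    if progress < soft_start:
        # Ramp up from min_lr to max_lr (log-linear ramp is an assumption).
        t = progress / soft_start
        return math.exp(log_min + t * (log_max - log_min))
    if progress < annealing or annealing >= 1.0:
        # Plateau at the maximum learning rate.
        return max_lr
    # Cool down from max_lr back to min_lr over the remaining progress.
    t = (progress - annealing) / (1.0 - annealing)
    return math.exp(log_max - t * (log_max - log_min))
```

With the values from the example spec (soft_start: 0.001, annealing: 0.7), this sketch reaches the maximum rate almost immediately, holds it for most of training, and decays over the last 30% of the run.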

#### visualizer

Visualization during training is configured using the visualizer parameter. The parameters are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| enabled | boolean | false | A Boolean flag to enable or disable this feature | |
| num_images | int | 3 | The maximum number of images to be visualized in TensorBoard | >0 |

If visualization is enabled, the TensorBoard log will be produced during training, including the graphs for learning rate, training loss, and validation accuracy. The augmented images will also be produced in the TensorBoard.

#### early_stopping

The parameters for early stopping are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| monitor | string | | The metric to monitor in order to enable early stopping | loss |
| patience | int | 0 | The number of checks for the monitor value before stopping the training | |
| min_delta | float | 0.0 | The delta of the minimum value of monitor, below which it is regarded as not decreasing | |
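
The interaction between patience and min_delta can be sketched as follows. This is an illustration only: the class name `EarlyStopping` and the exact counting rule (stop once the number of consecutive non-improving checks reaches patience) are assumptions, not the confirmed TAO Toolkit behavior.

```python
class EarlyStopping:
    """Track a monitored loss and decide when to stop training."""

    def __init__(self, patience=0, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, loss):
        # A check counts as an improvement only if the loss drops
        # by more than min_delta relative to the best value so far.
        if self.best - loss > self.min_delta:
            self.best = loss
            self.bad_checks = 0
            return False
        self.bad_checks += 1
        return self.bad_checks >= self.patience


stopper = EarlyStopping(patience=2, min_delta=0.01)
decisions = [stopper.should_stop(loss) for loss in (1.0, 0.9, 0.895, 0.899)]
```

In this run the third and fourth losses improve on 0.9 by less than min_delta, so the second of those checks triggers the stop.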

### eval_config

The eval_config parameter defines the hyperparameters of the evaluation process. The metric for evaluation is the license plate recognition accuracy. A recognition is regarded as correct only if all the characters in a license plate are classified correctly.


    eval_config {
      validation_period_during_training: 5
      batch_size: 1
    }


| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| validation_period_during_training | int | 5 | The interval at which evaluation is run during training. The evaluation is run at this interval starting from the value of the first validation epoch parameter as specified below. | 1 - total number of epochs |
| batch_size | int | 1 | The number of samples to do a single inference on | >0 |
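
The accuracy metric described in this section counts a plate as correct only on an exact match of the whole character sequence. A minimal sketch, with the hypothetical helper name `plate_accuracy`:

```python
def plate_accuracy(predictions, labels):
    """Fraction of plates whose predicted string matches the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # A single wrong character makes the whole plate count as incorrect.
    correct = sum(pred == truth for pred, truth in zip(predictions, labels))
    return correct / len(labels)


accuracy = plate_accuracy(
    ["ABC123", "XYZ789", "AAA111"],
    ["ABC123", "XYZ780", "AAA111"],
)
```

Here the second prediction differs from its label in one character, so two of the three plates count as correct.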

### augmentation_config

The augmentation_config parameter contains the hyperparameters for augmentation during training. It also defines the spatial size of the network input.


    augmentation_config {
      output_width: 96
      output_height: 48
      output_channel: 3
      max_rotate_degree: 5
      rotate_prob: 0.5
      gaussian_kernel_size: 5
      gaussian_kernel_size: 7
      gaussian_kernel_size: 15
      blur_prob: 0.5
      reverse_color_prob: 0.5
      keep_original_prob: 0.3
    }


| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| output_width | unsigned int | 96 | The width of preprocessed images, which is also the width of the network input | >0 |
| output_height | unsigned int | 48 | The height of preprocessed images, which is also the height of the network input | >0 |
| output_channel | unsigned int | 3 | The number of channels of preprocessed images | 1, 3 |
| keep_original_prob | float | 0.3 | The probability of keeping the original image. With this probability, only resizing is applied to an image. | 0.0 ~ 1.0 |
| max_rotate_degree | unsigned int | 5 | The maximum rotation angle for augmentation | 0.0 ~ 90.0 |
| rotate_prob | float | 0.5 | The probability of rotating the image | 0.0 ~ 1.0 |
| gaussian_kernel_size | unsigned int | 5, 7, 15 | The kernel size of the Gaussian blur | >0 |
| blur_prob | float | 0.5 | The probability of blurring the image | 0.0 ~ 1.0 |
| reverse_color_prob | float | 0.5 | The probability of reversing the color of the image | 0.0 ~ 1.0 |
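
One way to read how these probabilities combine is sketched below. The order of operations, the symmetric rotation range, and the random choice among the listed kernel sizes are all assumptions made for illustration; the function `plan_augmentations` is hypothetical and only returns the list of operations it would apply, without touching any image.

```python
import random


def plan_augmentations(cfg, rng):
    """Return the operations to apply to one training image, in order."""
    ops = [("resize", cfg["output_width"], cfg["output_height"])]
    if rng.random() < cfg["keep_original_prob"]:
        # Keep the original image: only the resize is applied.
        return ops
    if rng.random() < cfg["rotate_prob"]:
        limit = cfg["max_rotate_degree"]
        ops.append(("rotate", rng.uniform(-limit, limit)))
    if rng.random() < cfg["blur_prob"]:
        ops.append(("blur", rng.choice(cfg["gaussian_kernel_size"])))
    if rng.random() < cfg["reverse_color_prob"]:
        ops.append(("reverse_color",))
    return ops


config = {
    "output_width": 96, "output_height": 48,
    "keep_original_prob": 0.3,
    "max_rotate_degree": 5, "rotate_prob": 0.5,
    "gaussian_kernel_size": [5, 7, 15], "blur_prob": 0.5,
    "reverse_color_prob": 0.5,
}
plan = plan_augmentations(config, random.Random(42))
```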

### dataset_config

The dataset_config parameter defines the paths to the training dataset, the validation dataset, and the characters list file.


    dataset_config {
      data_sources: {
        label_directory_path: "/path/to/train/labels"
        image_directory_path: "/path/to/train/images"
      }
      characters_list_file: "/path/to/lp_characters"
      validation_data_sources: {
        label_directory_path: "/path/to/test/labels"
        image_directory_path: "/path/to/test/images"
      }
    }


| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| data_sources | dataset proto | | The path to the training dataset images and labels. label_directory_path: The path to the label directory. image_directory_path: The path to the image directory. | |
| validation_data_sources | dataset proto | | The path to the validation dataset images and labels | |
| characters_list_file | string | | The path to the characters list file | The characters in this file should be in unicode format. |

Note

data_sources and validation_data_sources are both repeated fields. Multiple datasets can be added to sources.
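
Because several label directories can feed one experiment, the characters list has to cover all of them. A minimal sketch of generating it, assuming UTF-8 label files with a `.txt` extension; the helper name `build_characters_list` is hypothetical and the sorted output order is a choice made here, not a documented requirement:

```python
from pathlib import Path


def build_characters_list(label_dirs, output_file):
    """Collect every character used in the label files and write one per line."""
    chars = set()
    for label_dir in label_dirs:
        for label in Path(label_dir).glob("*.txt"):
            # Each label file holds a single line of plate characters.
            chars.update(label.read_text(encoding="utf-8").strip())
    ordered = sorted(chars)
    Path(output_file).write_text(
        "".join(char + "\n" for char in ordered), encoding="utf-8"
    )
    return ordered
```

The resulting file can then be referenced from characters_list_file in the spec.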

## Training the Model

Use the following command to run LPRNet training:


    tao lprnet train -e <experiment_spec_file>
                     -r <results_dir>
                     -k <key>
                     [--gpus <num_gpus>]
                     [--gpu_index <gpu_index>]
                     [--use_amp]
                     [--log_file <log_file>]
                     [-m <resume_model_path>]
                     [--initial_epoch <initial_epoch>]


### Required Arguments

• -e, --experiment_spec_file: The path to the experiment spec file

• -r, --results_dir: The path to a folder where the experiment outputs should be written.

• -k, --key: The user-specific encoding key to save or load a .tlt model.

### Optional Arguments

• --gpus: The number of GPUs to be used in the training in a multi-GPU scenario (default: 1).

• --gpu_index: The GPU indices used to run the training. We can specify the GPU indices used to run training when the machine has multiple GPUs installed.

• --use_amp: A flag to enable AMP training.

• --log_file: The path to the log file. Defaults to stdout.

• -m, --resume_model_weights: The path to a pretrained model or a model to continue training.

• --initial_epoch: The epoch to start training.

Here’s an example of using the LPRNet training command:
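
A minimal sketch of assembling such a command from Python, using only the flags listed above. The helper name `lprnet_train_command`, the paths, and the key value are placeholders chosen for illustration.

```python
import shlex


def lprnet_train_command(spec, results_dir, key, gpus=None, resume_model=None):
    """Build the argument list for `tao lprnet train`."""
    cmd = ["tao", "lprnet", "train", "-e", spec, "-r", results_dir, "-k", key]
    if gpus is not None:
        cmd += ["--gpus", str(gpus)]
    if resume_model is not None:
        cmd += ["-m", resume_model]
    return cmd


train_cmd = lprnet_train_command(
    spec="/path/to/experiment_spec.txt",   # placeholder path
    results_dir="/path/to/results",        # placeholder path
    key="my_key",                          # placeholder key
    gpus=2,
)
print(shlex.join(train_cmd))
# To launch it: subprocess.run(train_cmd, check=True)
```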




## Running Inference on the LPRNet Model

Use the following command to run inference on LPRNet with a .tlt model or a TensorRT engine:


    tao lprnet inference -m <model>
                         -i <in_image_path>
                         -e <experiment_spec>
                         [-k <key>]
                         [--gpu_index <gpu_index>]
                         [--log_file <log_file>]
                         [--trt]


### Required Arguments

• -m, --model: The .tlt model or TensorRT engine to do inference with

• -i, --in_image_path: The path to the license plate images to do inference with.

• -e, --experiment_spec: The experiment spec file to set up inference. This can be the same as the training spec.

### Optional Arguments

• -h, --help: Show this help message and exit.

• -k, --key: The encoding key for the .tlt model.

• --gpu_index: The GPU index used to run the inference. We can specify the GPU index used to run inference when the machine has multiple GPUs installed. Note that inference can only run on a single GPU.

• --log_file: Path to the log file. Defaults to stdout.

• --trt: Run inference with the TRT engine.

Here’s an example of using the LPRNet inference command:
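
A minimal sketch of assembling the inference command from Python, using only the flags listed above. The helper name `lprnet_inference_command` and all paths and key values are placeholders chosen for illustration.

```python
import shlex


def lprnet_inference_command(model, images, spec, key=None, use_trt=False):
    """Build the argument list for `tao lprnet inference`."""
    cmd = ["tao", "lprnet", "inference", "-m", model, "-i", images, "-e", spec]
    if key is not None:
        cmd += ["-k", key]
    if use_trt:
        # Run inference with a TensorRT engine instead of a .tlt model.
        cmd.append("--trt")
    return cmd


inference_cmd = lprnet_inference_command(
    model="/path/to/model.tlt",            # placeholder path
    images="/path/to/test/images",         # placeholder path
    spec="/path/to/experiment_spec.txt",   # placeholder path
    key="my_key",                          # placeholder key
)
print(shlex.join(inference_cmd))
# To launch it: subprocess.run(inference_cmd, check=True)
```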




## Deploying the Model

The deep learning and computer vision models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs.

DeepStream SDK is a streaming analytic toolkit to accelerate building AI-based video analytic applications. TAO Toolkit is integrated with DeepStream SDK, so models trained with TAO Toolkit will work out of the box with DeepStream.

Note

LPRNet .etlt cannot be parsed by DeepStream directly. Use tao-converter to convert the .etlt model to an optimized TensorRT engine, then integrate the engine into the DeepStream pipeline.

### Using tao-converter

The tao-converter is a tool provided with the TAO Toolkit to facilitate the deployment of TAO Toolkit trained models on TensorRT and/or DeepStream. For deployment platforms with an x86-based CPU and discrete GPUs, the tao-converter is distributed within the TAO docker, so it is suggested to use the docker to generate the engine. However, this requires that the user adhere to the same minor version of TensorRT as distributed with the docker. The TAO docker includes TensorRT version 7.1. To use the engine with a different minor version of TensorRT, copy the converter from /opt/nvidia/tools/tao-converter to the target machine and follow the instructions for x86 to run it and generate a TensorRT engine.

For the aarch64 platform, the tao-converter is available to download in the dev zone.

Here is a sample command to generate an LPRNet engine through tao-converter:


    tao-converter <etlt_model> -k <key_to_etlt_model> -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 -e <path_to_generated_trt_engine>


Through this command, an optimized TensorRT engine with dynamic input shape will be generated. (Dynamic shape of this engine: min_shape=[1x3x48x96], opt_shape=[4x3x48x96], max_shape=[16x3x48x96]. The shape format is NCHW.)
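
The structure of the `-p` argument (input name, then the minimum, optimal, and maximum shapes) can be made explicit with a small parser. This is an illustrative sketch; the function `parse_profile` is hypothetical and is not part of tao-converter.

```python
def parse_profile(arg):
    """Split a `-p` value into the input name and its min/opt/max NCHW shapes."""
    name, *shapes = arg.split(",")
    if len(shapes) != 3:
        raise ValueError("expected <name>,<min_shape>,<opt_shape>,<max_shape>")
    min_shape, opt_shape, max_shape = (
        tuple(int(dim) for dim in shape.split("x")) for shape in shapes
    )
    return name, {"min": min_shape, "opt": opt_shape, "max": max_shape}


name, profile = parse_profile("image_input,1x3x48x96,4x3x48x96,16x3x48x96")
```

Only the batch dimension differs between the three shapes here; the channel, height, and width match the output_channel, output_height, and output_width values from the example augmentation_config.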

### Deploying the LPRNet in the DeepStream sample

Once you have the TensorRT engine of LPRNet, you can deploy it into DeepStream’s LPDR sample. This sample is a complete solution including car detection, license plate detection, and license plate recognition. A configurable CTC decoder is provided in this sample.
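
For orientation, the sketch below shows standard greedy CTC decoding: consecutive repeats are collapsed and blank symbols are dropped. It is not the decoder shipped with the DeepStream sample. The function name `ctc_greedy_decode` is hypothetical, and placing the blank class after the last entry of the characters list is an assumption.

```python
def ctc_greedy_decode(class_ids, characters, blank_id=None):
    """Turn per-timestep class indices into a license plate string."""
    if blank_id is None:
        # Assumption: the blank symbol is the extra class after all characters.
        blank_id = len(characters)
    decoded = []
    previous = None
    for index in class_ids:
        # Emit a character only when it starts a new run and is not blank.
        if index != previous and index != blank_id:
            decoded.append(characters[index])
        previous = index
    return "".join(decoded)


characters = ["0", "1", "A", "B"]
plate = ctc_greedy_decode([2, 2, 4, 2, 3, 3, 4, 4, 0, 1, 1], characters)
```

The blank between the two runs of index 2 is what allows the repeated "A" to survive the collapse step.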

© Copyright 2022, NVIDIA. Last updated on Dec 13, 2022.