LPRNet#

LPRNet (License Plate Recognition Net) takes images as network input and predicts sequences of license plate characters.

Preparing the Dataset#

The dataset for LPRNet contains cropped license plate images and corresponding label files.

The data structure must be in the following format:

/Dataset_01
    /images
        0000.jpg
        0001.jpg
        0002.jpg
        ...
        ...
        ...
        N.jpg
    /labels
        0000.txt
        0001.txt
        0002.txt
        ...
        ...
        ...
        N.txt
/characters_list.txt

Each cropped license plate image has a corresponding label text file that contains a single line: the characters of that license plate. The characters_list.txt file lists every character found in the license plate dataset, one character per line.
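The layout above can be checked before training. The following is a minimal sketch, not part of the toolkit: the helper names `check_dataset` and `build_characters_list` are hypothetical, and it assumes images use the `.jpg` extension and labels are UTF-8 text, as in the example tree.

```python
from pathlib import Path


def check_dataset(root):
    """Return (image stems without a label, label stems without an image)."""
    root = Path(root)
    images = {p.stem for p in (root / "images").glob("*.jpg")}
    labels = {p.stem for p in (root / "labels").glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)


def build_characters_list(root, out_file):
    """Write every character that occurs in the labels, one per line."""
    root = Path(root)
    chars = set()
    for label_path in (root / "labels").glob("*.txt"):
        lines = label_path.read_text(encoding="utf-8").splitlines()
        if lines:
            # Each label file holds a single line: the plate's characters.
            chars.update(lines[0].strip())
    ordered = sorted(chars)
    Path(out_file).write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered
```

Calling `check_dataset("/Dataset_01")` returns two lists that should both be empty for a consistent dataset.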

Creating an Experiment Specification File#

The specification file for LPRNet includes the random_seed, lpr_config, training_config, eval_config, augmentation_config, and dataset_config parameters. Here is an example for training on the NVIDIA license plate dataset:

random_seed: 42
lpr_config {
  hidden_units: 512
  max_label_length: 8
  arch: "baseline"
  nlayers: 10
}
training_config {
  batch_size_per_gpu: 32
  num_epochs: 100
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.001
      annealing: 0.7
    }
  }
  regularizer {
    type: L2
    weight: 5e-4
  }
}
eval_config {
  validation_period_during_training: 5
  batch_size: 1
}
augmentation_config {
  output_width: 96
  output_height: 48
  output_channel: 3
  max_rotate_degree: 5
  rotate_prob: 0.5
  gaussian_kernel_size: 5
  gaussian_kernel_size: 7
  gaussian_kernel_size: 15
  blur_prob: 0.5
  reverse_color_prob: 0.5
  keep_original_prob: 0.3
}
dataset_config {
  data_sources: {
    label_directory_path: "/path/to/train/labels"
    image_directory_path: "/path/to/train/images"
  }
  characters_list_file: "/path/to/lp_characters"
  validation_data_sources: {
    label_directory_path: "/path/to/test/labels"
    image_directory_path: "/path/to/test/images"
  }
}

| Parameter | Data Type | Default | Description |
|---|---|---|---|
| random_seed | Unsigned int | 42 | The random seed for the experiment |
| lpr_config | proto message | | The configuration of the model architecture |
| training_config | proto message | | The configuration of the training process |
| eval_config | proto message | | The configuration of the evaluation process |
| augmentation_config | proto message | | The configuration for data augmentation |
| dataset_config | proto message | | The configuration for the dataset |

lpr_config#

The lpr_config parameter provides options to change the LPRNet architecture.

lpr_config {
  hidden_units: 512
  max_label_length: 8
  arch: "baseline"
  nlayers: 10
}

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| hidden_units | Unsigned int | 512 | The number of hidden units in the LSTM layers of LPRNet | |
| max_label_length | Unsigned int | 8 | The maximum length of license plates in the dataset | |
| arch | String | baseline | The architecture of LPRNet | baseline |
| nlayers | Unsigned int | 10 | The number of convolution layers in LPRNet | 10, 18 |
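Because max_label_length is described as the maximum plate length in the dataset, it can help to measure the labels before choosing a value. The following is a minimal sketch under that reading; the helper names `label_lengths` and `labels_exceeding` are hypothetical and not part of the toolkit.

```python
from pathlib import Path


def label_lengths(label_dir):
    """Map each label file name to the length of its single label line."""
    lengths = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        lengths[path.name] = len(lines[0].strip()) if lines else 0
    return lengths


def labels_exceeding(label_dir, max_label_length):
    """List label files whose plate text is longer than max_label_length."""
    return [
        name
        for name, length in label_lengths(label_dir).items()
        if length > max_label_length
    ]
```

An empty result from `labels_exceeding("/path/to/train/labels", 8)` suggests that the value 8 used in the example covers every label in that directory.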

training_config#

The training_config parameter defines the hyperparameters of the training process.

training_config {
  checkpoint_interval: 5
  batch_size_per_gpu: 32
  num_epochs: 100
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.001
      annealing: 0.7
    }
  }
  regularizer {
    type: L2
    weight: 5e-4
  }
}

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| batch_size_per_gpu | int | 32 | The number of images per batch per GPU | >1 |
| num_epochs | int | 120 | The total number of epochs to run in the experiment | |
| checkpoint_interval | int | 5 | The interval at which the checkpoints are saved | >0 |
| learning_rate | learning rate scheduler proto | soft_start_annealing_schedule | The learning rate schedule for the trainer. Currently, LPRNet only supports the soft-start annealing learning rate schedule. | annealing: 0.0 - 1.0 and greater than soft_start; soft_start: 0.0 - 1.0 |
| regularizer | regularizer proto config | | The type and the weight of the regularizer to be used during training | type: NO_REG, L1, L2 |
| visualizer | Message type | | Training visualization config | |
| early_stopping | Message type | | Early stopping config | |

The soft_start_annealing_schedule may be configured using the following parameters:

* soft_start (float): The time to ramp up the learning rate from the minimum learning rate to the maximum learning rate
* annealing (float): The time to cool down the learning rate from the maximum learning rate to the minimum learning rate
* min_learning_rate (float): The minimum learning rate in the learning rate schedule
* max_learning_rate (float): The maximum learning rate in the learning rate schedule

A sample learning rate plot for a soft start of 0.3 and annealing of 0.1 is shown in the figure below.

The regularizer config contains two parameters:

* type: The type of regularizer being used. The supported values are NO_REG, L1, and L2.
* weight: The floating point weight of the regularizer
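The shape of a soft-start annealing schedule can be sketched in a few lines. This is an illustration under stated assumptions, not the trainer's implementation: it assumes soft_start and annealing are fractions of total training progress (consistent with their 0.0 - 1.0 range), that the rate holds at the maximum between the two points, and that ramp-up and cool-down interpolate in log space. The function name `soft_start_annealing_lr` is hypothetical.

```python
import math


def soft_start_annealing_lr(progress, min_lr, max_lr, soft_start, annealing):
    """Learning rate at a given training progress in [0.0, 1.0]."""
    if not 0.0 <= soft_start < annealing <= 1.0:
        raise ValueError("expected 0.0 <= soft_start < annealing <= 1.0")
    progress = min(max(progress, 0.0), 1.0)
    if progress < soft_start:
        # Ramp up from min_lr to max_lr.
        weight = progress / soft_start
    elif progress < annealing or annealing == 1.0:
        # Hold at max_lr until the annealing point.
        weight = 1.0
    else:
        # Cool down from max_lr back to min_lr.
        weight = (1.0 - progress) / (1.0 - annealing)
    # Assumption: interpolate between the two rates in log space.
    log_min, log_max = math.log(min_lr), math.log(max_lr)
    return math.exp(log_min + weight * (log_max - log_min))
```

With the values from the example specification (min 1e-6, max 1e-4, soft_start 0.001, annealing 0.7), this sketch reaches the maximum rate almost immediately, holds it until 70% of training, and then decays back to the minimum.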

visualizer#

Visualization during training is configured using the visualizer parameter. The parameters are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| enabled | boolean | false | A Boolean flag to enable or disable this feature | |
| num_images | int | 3 | The maximum number of images to be visualized in TensorBoard. | >0 |

If visualization is enabled, a TensorBoard log is produced during training, including graphs for learning rate, training loss, and validation accuracy. The augmented images are also logged to TensorBoard.

early_stopping#

The parameters for early stopping are described in the table below.

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| monitor | string | | The metric to monitor in order to enable early stopping | loss |
| patience | int | 0 | The number of checks for the monitor value before stopping the training | |
| min_delta | float | 0.0 | The delta of the minimum value of monitor, below which it is regarded as not decreasing. | |
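The interaction between patience and min_delta can be sketched as follows. This is an illustration that assumes conventional early-stopping semantics (a check counts as an improvement only if the loss drops by more than min_delta, and training stops once patience consecutive checks fail to improve); the trainer's exact rule may differ. The class name `EarlyStopping` is hypothetical.

```python
class EarlyStopping:
    """Track a monitored loss and decide when to stop training."""

    def __init__(self, patience=0, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0  # consecutive checks without improvement

    def should_stop(self, loss):
        # An improvement must beat the best loss by more than min_delta.
        if loss < self.best - self.min_delta:
            self.best = loss
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


stopper = EarlyStopping(patience=2, min_delta=0.0)
decisions = [stopper.should_stop(loss) for loss in (1.0, 0.9, 0.95, 0.93)]
```

In this run the last two checks do not improve on 0.9, so only the final entry of `decisions` is true. Under these assumptions, a patience of 0 stops at the first check that fails to improve.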

eval_config#

The eval_config parameter defines the hyperparameters of the evaluation process. The metric for evaluation is the license plate recognition accuracy. A recognition is regarded as correct if all the characters in a license plate are classified correctly.

eval_config {
  validation_period_during_training: 5
  batch_size: 1
}

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| validation_period_during_training | int | 5 | The interval at which evaluation is run during training. The evaluation is run at this interval starting from the value of the first validation epoch parameter as specified below. | 1 - total number of epochs |
| batch_size | int | 1 | The number of samples to do a single inference on | >0 |

augmentation_config#

The augmentation_config parameter contains the hyperparameters for augmentation during training. It also defines the spatial size of the network input.

augmentation_config {
  output_width: 96
  output_height: 48
  output_channel: 3
  max_rotate_degree: 5
  rotate_prob: 0.5
  gaussian_kernel_size: 5
  gaussian_kernel_size: 7
  gaussian_kernel_size: 15
  blur_prob: 0.5
  reverse_color_prob: 0.5
  keep_original_prob: 0.3
}

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| output_width | unsigned int | 96 | The width of preprocessed images, which is also the width of the network input | >0 |
| output_height | unsigned int | 48 | The height of preprocessed images, which is also the height of the network input | >0 |
| output_channel | unsigned int | 3 | The number of channels of preprocessed images | 1, 3 |
| keep_original_prob | float | 0.3 | The probability of keeping the original image. With this probability, only resizing is applied to an image. | 0.0 ~ 1.0 |
| max_rotate_degree | unsigned int | 5 | The maximum rotation angle for augmentation | 0.0 ~ 90.0 |
| rotate_prob | float | 0.5 | The probability for rotating the image | 0.0 ~ 1.0 |
| gaussian_kernel_size | unsigned int | 5, 7, 15 | The kernel size of the Gaussian blur | >0 |
| blur_prob | float | 0.5 | The probability for blurring the image | 0.0 ~ 1.0 |
| reverse_color_prob | float | 0.5 | The probability for reversing the color of the image | 0.0 ~ 1.0 |
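One way to read how these probabilities combine is sketched below. This is an assumption-laden illustration, not the toolkit's augmentation code: it assumes the keep_original_prob draw happens first, that rotation, blur, and color reversal are then drawn independently in that order, that the rotation angle is uniform in [-max_rotate_degree, max_rotate_degree], and that the blur kernel is picked uniformly from the listed sizes. The function name `plan_augmentation` is hypothetical, and it only returns the list of operations rather than transforming pixels.

```python
import random


def plan_augmentation(
    rng,
    output_width=96,
    output_height=48,
    keep_original_prob=0.3,
    max_rotate_degree=5,
    rotate_prob=0.5,
    gaussian_kernel_sizes=(5, 7, 15),
    blur_prob=0.5,
    reverse_color_prob=0.5,
):
    """Return the operations (name, argument) chosen for one training image."""
    ops = []
    # With probability keep_original_prob, skip straight to resizing.
    if rng.random() >= keep_original_prob:
        if rng.random() < rotate_prob:
            angle = rng.uniform(-max_rotate_degree, max_rotate_degree)
            ops.append(("rotate", angle))
        if rng.random() < blur_prob:
            ops.append(("gaussian_blur", rng.choice(gaussian_kernel_sizes)))
        if rng.random() < reverse_color_prob:
            ops.append(("reverse_color", None))
    # Every image is resized to the network input size.
    ops.append(("resize", (output_width, output_height)))
    return ops


plan = plan_augmentation(random.Random(42))
```

Each call yields a plan that always ends with the resize to output_width x output_height; the operations before it vary with the random draws.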

dataset_config#

The dataset_config parameter defines the paths to the training dataset, the validation dataset, and the characters list file.

dataset_config {
  data_sources: {
    label_directory_path: "/path/to/train/labels"
    image_directory_path: "/path/to/train/images"
  }
  characters_list_file: "/path/to/lp_characters"
  validation_data_sources: {
    label_directory_path: "/path/to/test/labels"
    image_directory_path: "/path/to/test/images"
  }
}

| Parameter | Datatype | Default/Suggested value | Description | Supported Values |
|---|---|---|---|---|
| data_sources | dataset proto | | The paths to the training dataset images and labels: label_directory_path is the path to the label directory, and image_directory_path is the path to the image directory | |
| validation_data_sources | dataset proto | | The paths to the validation dataset images and labels | |
| characters_list_file | string | | The path to the characters list file | The characters in this file should be in Unicode format. |

Note

data_sources and validation_data_sources are both repeated fields. Multiple datasets can be added to sources.
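Because both fields are repeated, a specification with several datasets simply repeats the corresponding block. The sketch below generates such a dataset_config section as text; the helper name `render_dataset_config` and all paths are hypothetical, and the output mirrors the layout of the example shown earlier.

```python
def render_dataset_config(train_sources, val_sources, characters_list_file):
    """Render a dataset_config block from (label_dir, image_dir) pairs."""

    def block(name, label_dir, image_dir):
        return (
            f"  {name}: {{\n"
            f'    label_directory_path: "{label_dir}"\n'
            f'    image_directory_path: "{image_dir}"\n'
            "  }\n"
        )

    parts = ["dataset_config {\n"]
    # One data_sources block per training dataset.
    for label_dir, image_dir in train_sources:
        parts.append(block("data_sources", label_dir, image_dir))
    parts.append(f'  characters_list_file: "{characters_list_file}"\n')
    # One validation_data_sources block per validation dataset.
    for label_dir, image_dir in val_sources:
        parts.append(block("validation_data_sources", label_dir, image_dir))
    parts.append("}\n")
    return "".join(parts)


spec_text = render_dataset_config(
    train_sources=[
        ("/path/to/train_a/labels", "/path/to/train_a/images"),
        ("/path/to/train_b/labels", "/path/to/train_b/images"),
    ],
    val_sources=[("/path/to/test/labels", "/path/to/test/images")],
    characters_list_file="/path/to/lp_characters",
)
```

Printing `spec_text` shows two data_sources blocks followed by the characters list path and one validation_data_sources block.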

Training the Model#

Training LPRNet uses the experiment specification file described above. Multi-GPU training is supported.

Evaluating the Model#

The evaluation metric of LPRNet is recognition accuracy. A recognition is regarded as accurate if all the characters in the license plate are correct.
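The metric amounts to an exact string match per plate. A minimal sketch follows; the function name `plate_accuracy` is hypothetical, and this is not the toolkit's evaluation code.

```python
def plate_accuracy(predictions, ground_truths):
    """Fraction of plates whose predicted string matches the label exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not ground_truths:
        return 0.0
    # A plate counts as correct only if every character matches.
    correct = sum(pred == truth for pred, truth in zip(predictions, ground_truths))
    return correct / len(ground_truths)


accuracy = plate_accuracy(["ABC123", "XYZ789"], ["ABC123", "XYZ780"])
```

Here the second prediction differs from its label in a single character, so it counts as wrong and the accuracy is 0.5.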

Running Inference on the LPRNet Model#

Inference may be run on LPRNet against a directory of license plate images using either a trained checkpoint or a TensorRT engine.

Exporting the Model#

LPRNet can be exported to ONNX format for deployment. The export workflow supports FP32 and FP16 precisions.

Deploying the Model#

The deep learning and computer vision models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs.

DeepStream SDK is a streaming analytics toolkit that accelerates building AI-based video analytics applications. TAO is integrated with DeepStream SDK, so models trained with TAO work out of the box with DeepStream.

Deploying LPRNet in the DeepStream Sample#

Once you have the TensorRT engine of LPRNet, you can deploy it in DeepStream’s LPDR sample. This sample is a complete solution that includes car detection, license plate detection, and license plate recognition. A configurable CTC decoder is provided in this sample.
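For orientation, greedy CTC decoding turns the per-timestep class indices predicted by the network into a plate string by collapsing consecutive repeats and dropping blanks. The sketch below illustrates that general technique and is not the decoder shipped with the sample; the function name `ctc_greedy_decode` is hypothetical, and it assumes the blank class is the index immediately after the last entry of the characters list.

```python
def ctc_greedy_decode(frame_indices, characters, blank_index=None):
    """Collapse repeated indices and remove blanks to form the plate string."""
    if blank_index is None:
        # Assumption: the blank class follows the last real character.
        blank_index = len(characters)
    decoded = []
    previous = None
    for index in frame_indices:
        # Emit a character only when the index changes and is not blank.
        if index != previous and index != blank_index:
            decoded.append(characters[index])
        previous = index
    return "".join(decoded)


characters = ["A", "B", "1"]
plate = ctc_greedy_decode([0, 0, 3, 0, 1, 1, 3, 2], characters)
```

In this example the blank (index 3) between the two runs of index 0 keeps them as two separate "A" characters, so `plate` is "AAB1".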