LPRNet
==================

.. _LPRNet:

LPRNet (License Plate Recognition Net) takes images as network input and predicts sequences of license plate characters.

Preparing the Dataset
---------------------------

The dataset for LPRNet contains cropped license plate images and corresponding label files.

The data structure must be in the following format:

.. code::

   /Dataset_01
      /images
         0000.jpg
         0001.jpg
         0002.jpg
         ...
         ...
         ...
         N.jpg
      /labels
         0000.txt
         0001.txt
         0002.txt
         ...
         ...
         ...
         N.txt
      /characters_list.txt

Each cropped license plate image has a corresponding label text file that contains one line with the characters of that license plate. The :code:`characters_list.txt` file lists all the characters found in the license plate dataset, one character per line.

Creating an Experiment Spec File
--------------------------------

The spec file for LPRNet includes the :code:`random_seed`, :code:`lpr_config`, :code:`training_config`, :code:`eval_config`, :code:`augmentation_config`, and :code:`dataset_config` parameters. Here is an example for training on the NVIDIA license plate dataset:

.. 
code:: random_seed: 42 lpr_config { hidden_units: 512 max_label_length: 8 arch: "baseline" nlayers: 10 } training_config { batch_size_per_gpu: 32 num_epochs: 100 learning_rate { soft_start_annealing_schedule { min_learning_rate: 1e-6 max_learning_rate: 1e-4 soft_start: 0.001 annealing: 0.7 } } regularizer { type: L2 weight: 5e-4 } } eval_config { validation_period_during_training: 5 batch_size: 1 } augmentation_config { output_width: 96 output_height: 48 output_channel: 3 max_rotate_degree: 5 rotate_prob: 0.5 gaussian_kernel_size: 5 gaussian_kernel_size: 7 gaussian_kernel_size: 15 blur_prob: 0.5 reverse_color_prob: 0.5 keep_original_prob: 0.3 } dataset_config { data_sources: { label_directory_path: "/path/to/train/labels" image_directory_path: "/path/to/train/images" } characters_list_file: "/path/to/lp_characters" validation_data_sources: { label_directory_path: "/path/to/test/labels" image_directory_path: "/path/to/test/images" } } +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | **Parameter** | **Data Type** | **Default** | **Description** | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`random_seed` | Unsigned int | 42 | The random seed for the experiment | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`lpr_config` | proto message | -- | The configuration of the model architecture | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`training_config` | proto message | -- | The configuration of the training process 
| +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`eval_config` | proto message | -- | The configuration of the evaluation process | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`augmentation_config` | proto message | -- | The configuration for data augmentation | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ | :code:`dataset_config` | proto message | -- | The configuration for the dataset | +-----------------------------+-----------------+-------------------+--------------------------------------------------------------------------------------------------------------+ lpr_config ^^^^^^^^^^ The :code:`lpr_config` parameter provides options to change the LPRNet architecture. .. 
code:: lpr_config { hidden_units: 512 max_label_length: 8 arch: "baseline" nlayers: 10 } +-------------------------+------------------+-------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------+ | **Parameter** | **Datatype** | **Default** | **Description** | **Supported Values** | +=========================+==================+=============+==========================================================================================================================+==============================+ | :code:`hidden_units` | Unsigned int | 512 | The number of hidden units in the LSTM layers of LPRNet | | +-------------------------+------------------+-------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------+ | :code:`max_label_length`| Unsigned int | 8 | The maximum length of license plates in the dataset | | +-------------------------+------------------+-------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------+ | :code:`arch` | String | baseline | The architecture of LPRNet | baseline | +-------------------------+------------------+-------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------+ | :code:`nlayers` | Unsigned int | 10 | The number of convolution layers in LPRNet | 10, 18 | +-------------------------+------------------+-------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------+ training_config ^^^^^^^^^^^^^^^ The :code:`training_config` parameter defines the hyperparameters of the training process. .. 
code:: training_config { batch_size_per_gpu: 32 num_epochs: 100 learning_rate { soft_start_annealing_schedule { min_learning_rate: 1e-6 max_learning_rate: 1e-4 soft_start: 0.001 annealing: 0.7 } } regularizer { type: L2 weight: 5e-4 } } +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ | **Parameter** | **Datatype** | **Default** | **Description** | **Supported Values** | +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ | :code:`batch_size_per_gpu`| int | 32 | The number of images per batch per GPU | >1 | +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ | :code:`num_epochs` | int | 120 | The total number of epochs to run the experiment | | +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ | :code:`learning_rate` | learning rate scheduler proto | soft_start | The learning rate schedule for the trainer. 
Currently, | annealing: 0.0-1.0 and greater than soft_start -- Soft_start: 0.0 - 1.0 | | | | _annealing | LPRNet only supports the soft start annealing learning rate schedule. It may be | | | | | _schedule | configured using the following parameters: | A sample lr plot for a soft start of 0.3 and annealing of 0.1 is shown | | | | | | in the figure below. | | | | | | | | | | | | | | | | | | * :code:`soft_start` (float): The time to ramp up the learning rate from the minimum learning rate to the maximum learning rate | | | | | | * :code:`annealing` (float): The time to cool down the learning rate from the maximum learning rate to the minimum learning rate | | | | | | * :code:`min_learning_rate` (float): The minimum learning rate in the learning rate schedule. | | | | | | * :code:`max_learning_rate` (float): The maximum learning rate in the learning rate schedule. | | +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ | :code:`regularizer` | regularizer proto config | | This parameter configures the type and the weight of the regularizer to be used during training. The two parameters | The supported values for type are: | | | | | include: | | | | | | | | | | | | | | | | | | | | | * NO_REG | | | | | * type: The type of the regularizer being used. | * L1 | | | | | * weight: The floating point weight of the regularizer. 
| * L2 | +---------------------------+-------------------------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+ eval_config ^^^^^^^^^^^ The :code:`eval_config` parameter defines the hyperparameters of the evaluation process. The metric for evaluation is the license plate recognition accuracy. A recognition is regarded as correct if all the characters in a license plate are classified correctly. .. code:: eval_config { validation_period_during_training: 5 batch_size: 1 } +-------------------------------------------+------------------+-----------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------+ | **Parameter** | **Datatype** | **Default/Suggested value** | **Description** | **Supported Values** | +-------------------------------------------+------------------+-----------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------+ | :code:`validation_period_during_training` | int | 5 | The interval at which evaluation is run during training. The evaluation is | 1 - total number of epochs | | | | | run at this interval starting from the value of the first validation epoch | | | | | | parameter as specified below. 
| | +-------------------------------------------+------------------+-----------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------+ | :code:`batch_size` | int | 1 | The number of samples to do a single inference on | >0 | +-------------------------------------------+------------------+-----------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------+ augmentation_config ^^^^^^^^^^^^^^^^^^^ The :code:`augmentation_config` parameter contains the hyperparameters for augmentation during training. It also defines the spatial size of the network input. .. code:: augmentation_config { output_width: 96 output_height: 48 output_channel: 3 max_rotate_degree: 5 rotate_prob: 0.5 gaussian_kernel_size: 5 gaussian_kernel_size: 7 gaussian_kernel_size: 15 blur_prob: 0.5 reverse_color_prob: 0.5 keep_original_prob: 0.3 } +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | **Parameter** | **Datatype** | **Default/Suggested value** | **Description** | **Supported Values** | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`output_width` | unsigned int | 96 | The width of preprocessed images. 
The width of network input | >0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`output_height` | unsigned int | 48 | The height of preprocessed images. The height of network input | >0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`output_channel` | unsigned int | 3 | The channel of preprocessed images | 1, 3 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`keep_original_prob` | float | 0.3 | The probability for keeping original images. 
| 0.0 ~ 1.0 | | | | | Only resizing is applied to an image with this probability | | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`max_rotate_degree` | unsigned int | 5 | The maximum rotation angle for augmentation | 0.0 ~ 90.0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`rotate_prob` | float | 0.5 | The probability for rotating the image | 0.0 ~ 1.0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`gaussian_kernel_size` | unsigned int | 5, 7, 15 | The kernel size of the Gaussian blur | >0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`blur_prob` | float | 0.5 | The probability for blurring the image | 0.0 ~ 1.0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ | :code:`reverse_color_prob` | float | 0.5 | The probability for reversing the color of the image | 0.0 ~ 1.0 | +-------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+----------------------------------------------------+ 
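
The way :code:`keep_original_prob` interacts with the other probabilities can be pictured with the short Python sketch below. It is an illustration under stated assumptions, not the LPRNet implementation: the function name :code:`choose_augmentations`, the dictionary layout, the order of the steps, and the independent sampling of each step are all hypothetical. Only the parameter names and values come from the example spec.

```python
import random

# Values copied from the example spec; the dictionary layout is an assumption.
AUGMENTATION_CONFIG = {
    "output_width": 96,
    "output_height": 48,
    "max_rotate_degree": 5,
    "rotate_prob": 0.5,
    "gaussian_kernel_size": [5, 7, 15],
    "blur_prob": 0.5,
    "reverse_color_prob": 0.5,
    "keep_original_prob": 0.3,
}


def choose_augmentations(cfg, rng):
    """Return the names of the steps applied to one training image.

    Hypothetical illustration only; the real pipeline may order or
    combine these steps differently.
    """
    steps = []
    # With probability keep_original_prob, every augmentation is skipped.
    keep_original = rng.random() < cfg["keep_original_prob"]
    if not keep_original:
        if rng.random() < cfg["rotate_prob"]:
            steps.append("rotate(max=%d)" % cfg["max_rotate_degree"])
        if rng.random() < cfg["blur_prob"]:
            # One of the configured kernel sizes is picked at random.
            kernel = rng.choice(cfg["gaussian_kernel_size"])
            steps.append("blur(k=%d)" % kernel)
        if rng.random() < cfg["reverse_color_prob"]:
            steps.append("reverse_color")
    # Every image is resized to the network input size.
    steps.append("resize(%dx%d)" % (cfg["output_width"], cfg["output_height"]))
    return steps


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        print(choose_augmentations(AUGMENTATION_CONFIG, rng))
```

Setting :code:`keep_original_prob` to 1.0 in this sketch leaves only the resize step, which matches the table's description of that parameter.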
dataset_config ^^^^^^^^^^^^^^ The :code:`dataset_config` parameter defines the path to training dataset, validation dataset, and characters list file. .. code:: dataset_config { data_sources: { label_directory_path: "/path/to/train/labels" image_directory_path: "/path/to/train/images" } characters_list_file: "/path/to/lp_characters" validation_data_sources: { label_directory_path: "/path/to/test/labels" image_directory_path: "/path/to/test/images" } } +---------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+------------------------------------------------------------------+ | **Parameter** | **Datatype** | **Default/Suggested value** | **Description** | **Supported Values** | +---------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+------------------------------------------------------------------+ | :code:`data_sources` | dataset proto | | The path to the training dataset images and labels: | | | | | | | | | | | | | | | | | | | | | | | | | * :code:`label_directory_path`: The path to the label directory | | | | | | * :code:`image_directory_path`: The path to the image directory | | +---------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+------------------------------------------------------------------+ | :code:`validation_data_sources` | dataset proto | | The path to the validation dataset images and labels | | +---------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+------------------------------------------------------------------+ | :code:`characters_list_file` | string 
| | The path to the characters list file | The characters in this file should be in :code:`unicode` format. |
+---------------------------------+------------------+-----------------------------+-------------------------------------------------------------------------------------------------+------------------------------------------------------------------+

.. Note::
   :code:`data_sources` and :code:`validation_data_sources` are both repeated fields. Multiple datasets can be added to sources.

Training the Model
---------------------------

Use the following command to run LPRNet training:

.. code::

   tlt lprnet train -e -r -k [--gpus ] [--gpu_index ] [--use_amp] [--log_file ] [-m ] [--initial_epoch ]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: The path to the experiment spec file.
* :code:`-r, --results_dir`: The path to a folder where the experiment outputs should be written.
* :code:`-k, --key`: The user-specific encoding key to save or load a :code:`.tlt` model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`--gpus`: The number of GPUs to be used in the training in a multi-GPU scenario (default: 1).
* :code:`--gpu_index`: The GPU indices used to run the training. You can specify these indices when the machine has multiple GPUs installed.
* :code:`--use_amp`: A flag to enable AMP training.
* :code:`--log_file`: The path to the log file. Defaults to :code:`stdout`.
* :code:`-m, --resume_model_weights`: The path to a pretrained model or a model to continue training.
* :code:`--initial_epoch`: The epoch at which to start training.

Here's an example of using the LPRNet training command:

.. code::

   tlt lprnet train --gpu_index=0 -e $DEFAULT_SPEC -r $RESULTS_DIR -k $YOUR_KEY

.. Note::
   To resume training from a :code:`.tlt` model, set :code:`-m` to the model and set :code:`--initial_epoch` to the starting epoch number.

Evaluating the Model
---------------------------

The evaluation metric of LPRNet is recognition accuracy.
A recognition is regarded as accurate if all the characters in the license plate are correct.

Use the following command to run LPRNet evaluation:

.. code::

   tlt lprnet evaluate -m -e [-k ] [--gpu_index ] [--log_file ] [--trt]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --model`: The :code:`.tlt` model or :code:`TRT` engine to be evaluated.
* :code:`-e, --experiment_spec_file`: The experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.
* :code:`-k, --key`: The encoding key for the :code:`.tlt` model.
* :code:`--gpu_index`: The GPU index used to run the evaluation. You can specify the GPU index used to run evaluation when the machine has multiple GPUs installed. Note that evaluation can only run on a single GPU.
* :code:`--log_file`: The path to the log file. Defaults to :code:`stdout`.
* :code:`--trt`: Evaluate the TRT engine.

Here's an example of using the LPRNet evaluation command:

.. code::

   tlt lprnet evaluate --gpu_index=0 -m $TRAINED_TLT_MODEL -e $DEFAULT_SPEC -k $YOUR_KEY

Running Inference on the LPRNet Model
--------------------------------------

Use the following command to run inference on LPRNet with a :code:`.tlt` model or TensorRT engine:

.. code::

   tlt lprnet inference -m -i -e [-k ] [--gpu_index ] [--log_file ] [--trt]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --model`: The :code:`.tlt` model or TensorRT engine to do inference with.
* :code:`-i, --in_image_path`: The path to the license plate images to do inference with.
* :code:`-e, --experiment_spec`: The experiment spec file to set up inference. Can be the same as the training spec.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.
* :code:`-k, --key`: The encoding key for the :code:`.tlt` model.
* :code:`--gpu_index`: The GPU index used to run the inference.
  You can specify the GPU index used to run inference when the machine has multiple GPUs installed. Note that inference can only run on a single GPU.
* :code:`--log_file`: The path to the log file. Defaults to :code:`stdout`.
* :code:`--trt`: Run inference with the TRT engine.

Here's an example of using the LPRNet inference command:

.. code::

   tlt lprnet inference --gpu_index=0 -m $SAVED_TRT_ENGINE -i $PATH_TO_TEST_IMAGES -e $DEFAULT_SPEC --trt

Exporting the Model
---------------------------

Use the following command to export LPRNet to :code:`.etlt` format for deployment:

.. code::

   tlt lprnet export -m -k -e [--gpu_index ] [--log_file ] [-o ] [--data_type {fp32,fp16}] [--max_workspace_size ] [--max_batch_size ] [--engine_file ] [-v]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --model`: The :code:`.tlt` model to be exported.
* :code:`-k, --key`: The encoding key of the :code:`.tlt` model.
* :code:`-e, --experiment_spec`: The experiment spec file to set up export. Can be the same as the training spec.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`--gpu_index`: The index of the (discrete) GPU used for exporting the model. You can specify the GPU index to run export if the machine has multiple GPUs installed. Note that export can only run on a single GPU.
* :code:`--log_file`: The path to the log file. Defaults to :code:`stdout`.
* :code:`-o, --output_file`: The path to save the exported model to. The default is :code:`./.etlt`.
* :code:`--data_type`: The desired engine data type to generate the calibration cache if in INT8 mode. The options are :code:`fp32` or :code:`fp16`. The default value is :code:`fp32`.

You can use the following optional arguments to save the TRT engine that is generated to verify export:

* :code:`--max_batch_size`: The maximum batch size of the TensorRT engine. The default value is :code:`16`.
* :code:`--max_workspace_size`: The maximum workspace size of the TensorRT engine. The default value is :code:`1073741824(1<<30)`.
* :code:`--engine_file`: The path to the serialized TensorRT engine file. This file is useful for quickly testing your model accuracy using TensorRT on the host. Because the TensorRT engine file is hardware specific and cannot be generalized across GPUs, you cannot use it for deployment unless the deployment GPU is identical to the training GPU.

Here's an example of using the LPRNet export command:

.. code::

   tlt lprnet export --gpu_index=0 -m $TRAINED_TLT_MODEL -e $DEFAULT_SPEC -k $YOUR_KEY

Deploying the Model
---------------------------

.. _deploying_to_deepstream_lprnet:

The deep learning and computer vision models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs. `DeepStream SDK`_ is a streaming analytic toolkit to accelerate building AI-based video analytic applications. TLT is integrated with DeepStream SDK, so models trained with TLT will work out of the box with DeepStream.

.. _Deepstream SDK: https://developer.nvidia.com/deepstream-sdk

.. Note::
   LPRNet :code:`.etlt` cannot be parsed by DeepStream directly. You should use :code:`tlt-converter` to convert the :code:`.etlt` model to an optimized TensorRT engine and then integrate the engine into the DeepStream pipeline.

Using :code:`tlt-converter`
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :code:`tlt-converter` is a tool that is provided with the Transfer Learning Toolkit to facilitate the deployment of TLT trained models on TensorRT and/or DeepStream.

For deployment platforms with an x86 based CPU and discrete GPUs, the :code:`tlt-converter` is distributed within the TLT docker. Therefore, it is suggested to use the docker to generate the engine. However, this requires that the user adhere to the same minor version of TensorRT as distributed with the docker. The TLT docker includes TensorRT version 7.1.
In order to use the engine with a different minor version of TensorRT, copy the converter from :code:`/opt/nvidia/tools/tlt-converter` to the target machine and follow the instructions for x86 to run it and generate a TensorRT engine.

For the aarch64 platform, the :code:`tlt-converter` is available to download in the `dev zone`_.

.. _dev zone: https://developer.nvidia.com/tlt-getting-started

Here is a sample command to generate an LPRNet engine through :code:`tlt-converter`:

.. code::

   tlt-converter -k -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 -e

This command generates an optimized TensorRT engine with a dynamic input shape: min_shape=[1x3x48x96], opt_shape=[4x3x48x96], max_shape=[16x3x48x96]. The shape format is NCHW.

Deploying the LPRNet in the DeepStream sample
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Once you have the TensorRT engine of LPRNet, you can deploy it into DeepStream's `LPDR sample`_. This sample is a complete solution including car detection, license plate detection, and license plate recognition. A configurable `CTC decoder`_ is provided in this sample.

.. _LPDR sample: https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app
.. _CTC decoder: https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app/blob/master/nvinfer_custom_lpr_parser/nvinfer_custom_lpr_parser.cpp
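
To give a feel for what a CTC decoder does with the network output, here is a minimal Python sketch of generic greedy CTC decoding. It is not the code of the DeepStream sample, which is written in C++. The function name :code:`ctc_greedy_decode`, the assumption that the input is already one predicted class id per time step, and the assumption that the blank class is the index right after the last entry of the characters list are all hypothetical choices made for illustration.

```python
def ctc_greedy_decode(class_ids, characters, blank_id=None):
    """Collapse repeated ids, then drop blanks (generic greedy CTC decoding).

    class_ids: one predicted class id per time step.
    characters: the entries of the characters list file, in order.
    blank_id: id of the CTC blank; assumed here to follow the last character.
    """
    if blank_id is None:
        blank_id = len(characters)  # assumption, not confirmed by the source
    decoded = []
    previous = None
    for class_id in class_ids:
        # Emit a character only when the id changes and is not the blank.
        if class_id != previous and class_id != blank_id:
            decoded.append(characters[class_id])
        previous = class_id
    return "".join(decoded)


if __name__ == "__main__":
    characters = list("0123456789ABC")
    blank = len(characters)
    # "A", repeated "A", blank, "A" again, "1", repeated "1", blanks, "2"
    ids = [10, 10, blank, 10, 1, 1, blank, blank, 2]
    print(ctc_greedy_decode(ids, characters))
```

The blank between the two runs of :code:`A` is what lets the decoder keep a genuinely doubled character instead of collapsing it.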