
LPDNet

The LPDNet models detect one or more license plate objects in an image and return a box around each object, as well as an lpd label for each object.

TAO Toolkit provides two kinds of pretrained LPD models: one is based on the DetectNet_v2 network; the other is based on the YOLOv4-tiny network.

The DetectNet_v2 variant comes as two pretrained LPD models: one is trained on an NVIDIA-owned US license-plate dataset; the other is trained on the public Chinese City Parking Dataset (CCPD). Both are based on the NVIDIA DetectNet_v2 detector with ResNet18 as the feature extractor. This architecture, also known as GridBox object detection, uses bounding-box regression on a uniform grid over the input image. The GridBox system divides the input image into a grid, and each grid cell predicts four normalized bounding-box parameters (xc, yc, w, h) and a confidence value per output class.
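
The per-cell prediction described above can be illustrated with a minimal Python sketch. It is not the TAO Toolkit implementation: the function name decode_grid, the cell layout (xc, yc, w, h, confidence), the 0.5 confidence threshold, and the assumption that all four box parameters are normalized to the full image size are hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2, confidence) in pixel coordinates
Box = Tuple[float, float, float, float, float]


def decode_grid(
    cells: Sequence[Sequence[Tuple[float, float, float, float, float]]],
    image_w: int,
    image_h: int,
    conf_threshold: float = 0.5,
) -> List[Box]:
    """Turn per-cell (xc, yc, w, h, conf) predictions into pixel boxes.

    Assumption: every value is normalized to [0, 1] relative to the whole
    image. The real network output layout may differ.
    """
    boxes: List[Box] = []
    for row in cells:
        for xc, yc, w, h, conf in row:
            if conf < conf_threshold:
                continue  # discard low-confidence cells
            # Convert center/size form to corner form and scale to pixels.
            x1 = (xc - w / 2.0) * image_w
            y1 = (yc - h / 2.0) * image_h
            x2 = (xc + w / 2.0) * image_w
            y2 = (yc + h / 2.0) * image_h
            # Clamp to the image bounds.
            x1, x2 = max(0.0, x1), min(float(image_w), x2)
            y1, y2 = max(0.0, y1), min(float(image_h), y2)
            boxes.append((x1, y1, x2, y2, conf))
    return boxes
```

Each surviving cell yields one raw box for the single lpd class, so one license plate that covers several cells typically produces several overlapping raw boxes.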

The raw normalized bounding-box and confidence detections need to be post-processed by a clustering algorithm such as DBSCAN or NMS to produce the final bounding-box coordinates and category labels.
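
As an illustration of this post-processing step, the following is a minimal sketch of greedy non-maximum suppression (NMS) in plain Python. It is a generic textbook version, not the clustering code shipped with TAO Toolkit; the box layout (x1, y1, x2, y2, confidence) and the 0.5 IoU threshold are assumptions made for the example.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, confidence) in pixel coordinates
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Keep the highest-confidence box of each overlapping group."""
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every box that overlaps the kept box too strongly.
        remaining = [b for b in remaining if iou(best, b) <= iou_threshold]
    return kept
```

With two heavily overlapping detections of one plate and a third detection elsewhere in the image, nms returns two boxes: the higher-confidence box of the overlapping pair and the separate third box.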

The YOLOv4-tiny variant provides models that are trained on an NVIDIA-owned US license plate dataset. They are based on the YOLOv4-tiny detector with cspdarknet_tiny as the feature extractor.

Training Algorithm

The DetectNet_v2 LPD model is based on DetectNet_v2 from TAO Toolkit. The training algorithm optimizes the network to minimize localization and confidence loss for objects. Training is carried out in two phases. In the first phase, the network is trained with regularization to facilitate pruning. The network is then pruned by removing channels whose kernel norms are below the pruning threshold. In the second phase, the pruned network is retrained without regularization.
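
The pruning criterion can be sketched in a few lines of Python. This is only an illustration of norm-based channel selection, not the TAO Toolkit pruning tool: the helper names, the use of the L2 norm, the flattened per-channel kernel representation, and the comparison of raw (unnormalized) norms against the threshold are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def kernel_l2_norm(kernel: Sequence[float]) -> float:
    """L2 norm of one output channel's flattened kernel weights."""
    return math.sqrt(sum(w * w for w in kernel))


def prune_layer(
    kernels: Sequence[Sequence[float]], threshold: float
) -> Tuple[List[Sequence[float]], List[int]]:
    """Remove channels whose kernel norm is below the threshold.

    Returns the retained kernels and the indices of the channels kept,
    so that the next layer's input channels can be sliced to match.
    """
    keep = [i for i, k in enumerate(kernels) if kernel_l2_norm(k) >= threshold]
    return [kernels[i] for i in keep], keep


# Hypothetical layer with three output channels of four weights each.
layer = [
    [0.5, -0.5, 0.5, 0.5],     # norm 1.0, kept
    [0.01, 0.0, -0.02, 0.01],  # norm about 0.024, removed
    [1.0, 0.0, 0.0, 0.0],      # norm 1.0, kept
]
pruned, kept_indices = prune_layer(layer, threshold=0.1)
```

Regularization in the first phase pushes the weights of unimportant channels toward zero, which is what makes a norm threshold a usable selection criterion; the retraining phase then recovers accuracy lost by removing them.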

The YOLOv4-tiny LPD model is based on YOLOv4-tiny from TAO Toolkit. It is trained with cspdarknet_tiny as the backbone. The training algorithm optimizes the network to minimize localization and confidence loss for objects.

Intended Use Case

The primary use case for these models is detecting license plates in a cropped color (RGB) image containing automobiles. The models can also be used to detect license plates in photos and videos, given appropriate video or image decoding and pre-processing.

For more details on the two kinds of LPDNet model, refer to the model cards hosted on NGC.