1.1. Overview
The objective of LPR is to recognize the vehicle registration numbers from images and identify the owner or the origin of the vehicle. The acquired information can be used for stolen vehicle tracking and traffic management, such as ETC (Electronic Toll Collection), access control, traffic law enforcement, and parking lot management. A typical LPR system consists of license plate detection, segmentation, and recognition as shown in Fig. 1-1.
Figure 1-1. Illustration of a typical LPR system
With the growth of surveillance cameras in Taiwan, theft rates have declined steadily, e.g., declines of 82.94% and 77.22% in the motorcycle and car theft rates, respectively (Peng 2015). However, tracking stolen vehicles through video is among the most labor-intensive and time-consuming tasks, which makes an LPR system an indispensable part of the whole process. Consequently, various LPR systems have been developed.
Due to variations in luminance, contrast, noise, and geometry of license plates captured in natural scenes, traditional machine vision methods cannot solve the problem completely. In recent years, many computer vision tasks have achieved state-of-the-art results by leveraging deep learning, in particular CNNs (Convolutional Neural Networks), which perform exceptionally well at image feature extraction. By combining low-level features through a deep network structure, the network can detect complicated high-level features and cope with diverse environments. Despite the remarkable performance of deep learning, some challenges remain.
To begin with, training deep networks with insufficient data can cause the networks to memorize the training data, so massive amounts of data are required to prevent overfitting. A network is said to be overfitted when it is much more accurate in predicting known data than new data. Overfitting may occur here because some license plates are rare, such as the old style with a green background and white characters in Fig. 1-2, and because some characters are seldom used, such as “I” and “4” (the latter has the same pronunciation as death in Mandarin Chinese).
Figure 1-2. License plate with green background and white characters
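The definition of overfitting given above can be made concrete as the gap between accuracy on known (training) data and accuracy on new (validation) data. The following is a minimal sketch in plain Python. The function names `accuracy` and `generalization_gap` and the character labels are hypothetical and used for illustration only; they are not taken from the LPR system described in this chapter.

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the ground-truth targets."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


def generalization_gap(train_acc, val_acc):
    """Training accuracy minus validation accuracy; a large positive gap suggests overfitting."""
    return train_acc - val_acc


# Hypothetical character predictions on known (training) data: all correct.
train_targets = ["A", "B", "C", "D"]
train_predictions = ["A", "B", "C", "D"]

# Hypothetical predictions on new (validation) data: the rare letters
# "I" and "O" are mistaken for the digits "1" and "0".
val_targets = ["A", "I", "O", "D"]
val_predictions = ["A", "1", "0", "D"]

train_acc = accuracy(train_predictions, train_targets)
val_acc = accuracy(val_predictions, val_targets)
gap = generalization_gap(train_acc, val_acc)
print(f"train={train_acc:.2f} val={val_acc:.2f} gap={gap:.2f}")
```

How large a gap counts as overfitting depends on the task; no threshold is implied here.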
Furthermore, class imbalance is a common problem in classification, arising when the training data contain a disproportionate ratio of observations in each class. Unfortunately, the character distribution of Taiwanese license plates is severely biased, whereas most machine learning algorithms work best when the training samples are evenly distributed across classes.
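One common way to quantify and counteract such imbalance is to weight each class inversely to its frequency. The sketch below illustrates that general idea and is not necessarily the technique adopted in this chapter. The function name `inverse_frequency_weights` and the label counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Uses the heuristic n_samples / (n_classes * class_count), so a class
    with the average count receives a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical character labels: "A" and "B" are frequent, "I" and "4" are rare.
labels = ["A"] * 8 + ["B"] * 6 + ["I"] + ["4"]
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

With these weights, each class contributes equally to a weighted loss in total, so rare characters are not drowned out by frequent ones.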
Finally, because of the high computational complexity of deep learning, high-end GPUs (Graphics Processing Units) and similar processing units are used to improve efficiency and reduce processing time. These processing units are delicate, costly, and, above all, power-consuming.
Since the deep learning-based LPR solution is to be deployed on moving vehicles, scooters in particular, power consumption is a critical problem. Without access to household power, it is impractical to supply adequate power for general high-end GPUs on scooters. However, most state-of-the-art deep networks are too complex to run in real time without high-end GPUs, so there is a three-way trade-off between inference speed, power consumption, and accuracy.
This chapter focuses on recognizing license plates from cameras mounted on moving scooters. Accordingly, the main difficulties faced by the LPR system are: varying illumination, especially low light at night, which leads to long exposure and thus blur; motion blur; scaling due to varying distances between the camera and the license plate; skewing due to viewing angle and perspective projection; and translations from the continuously moving viewpoint. Lightweight neural networks and low-power VPUs (Vision Processing Units) are chosen to reduce computational intensity and power consumption while maintaining competitive accuracy and real-time performance. This chapter also aims to solve the overfitting and data imbalance problems caused by the shortage of some Taiwanese license plate characters, such as the letter “I” (easily confused with the number “1”), the letter “O” (easily confused with the number “0”), and “4” (same pronunciation as death in Mandarin Chinese), through AI techniques. Accuracies with and without these techniques are compared and tabulated.