A Learning Framework for Target Detection and Human Face Recognition in Real Time

Jiaxing Huang (University of Glasgow, Glasgow, UK), Zhengnan Yuan (University of Glasgow, Glasgow, UK), and Xuan Zhou (University of Glasgow, Glasgow, UK)

Source Title: International Journal of Technology and Human Interaction (IJTHI) 15(3)

DOI: 10.4018/IJTHI.2019070105

Abstract

Inspired by the function, mechanism and efficiency of the human visual nervous system, a detection and recognition method named YOLO is presented to provide an accurate, stable and fast algorithm for a variety of targets, such as target detection for unmanned vehicles, car license plate recognition and surveillance optimization. The traditional approach to object detection repurposes a classifier to perform detection; in contrast, YOLOv2 treats detection as a regression problem over spatially separated bounding boxes and their associated class probabilities. However, as the cost of this algorithm's stable and fast response, YOLOv2 may detect inaccurately when the target object is tiny (e.g., face recognition in surveillance). In this article, the authors provide a new method that further improves the performance of YOLOv2: it retains the accurate, stable and fast properties of YOLOv2, modifies the original YOLOv2 code to eliminate the inaccuracy in tiny object detection, and is implemented on an embedded system.

1. Introduction

Target detection and face recognition, among the most challenging problems in computer vision, aim to find the correct position of targets and determine the categories they belong to (and to recognize the face if one is detected). Machine learning, which plays an important role in intelligent monitoring, behavior recognition and flow detection, has become a hot topic, especially the theory and application of deep learning. Compared with traditional target detection methods, using deep learning in new target detection models has gradually become the main research trend.

The traditional target detection method is generally divided into three stages: candidate regions are first selected on the image, features are then extracted from those regions, and classification is finally completed by a trained classifier. In the feature extraction stage, the features that describe the target usually have to be designed by hand before the classifier can learn from them (Carpenter, Grossberg, & Reynolds, 1991); examples include SIFT (scale-invariant feature transform) (Lowe, 2004) and HOG (histogram of oriented gradients) (Dalal & Triggs, 2005). However, limitations also exist. On the one hand, hand-crafted methods depend heavily on the specific detection task. For different targets, or for the same target in different forms, designers must devise suitable feature extraction methods, and the final recognition performance of the model also depends on their experience (Ouyang & Wang, 2013). On the other hand, the traditional detection model separates feature extraction from classifier training (Sermanet, 2013). As a result, if the manually extracted features are not fine-grained enough to describe the target, the missing information can never be recovered during classifier training. These shortcomings prevent the traditional detection model from producing a description that is more consistent with the target's characteristics.
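The traditional pipeline described above can be illustrated with a minimal sketch, given here only as an assumption-laden toy and not as the method of any cited work. The descriptor is a heavily simplified, HOG-style orientation histogram (no cells, blocks or interpolation), the regions come from a plain sliding window, and the linear classifier weights are hypothetical values chosen by hand instead of being trained.

```python
import math

def orientation_histogram(patch, bins=9):
    """Simplified HOG-style descriptor: histogram of unsigned gradient
    orientations (0..180 degrees), weighted by gradient magnitude."""
    height, width = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def detect(image, window, stride, weights, bias):
    """Stage 1: slide a square window; stage 2: extract hand-crafted
    features; stage 3: score them with a fixed linear classifier."""
    hits = []
    for top in range(0, len(image) - window + 1, stride):
        for left in range(0, len(image[0]) - window + 1, stride):
            patch = [row[left:left + window] for row in image[top:top + window]]
            feats = orientation_histogram(patch)
            score = sum(w * f for w, f in zip(weights, feats)) + bias
            if score > 0:
                hits.append((top, left, score))
    return hits

# Toy 8x8 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1, 1, 1, 1] for _ in range(8)]
# Hypothetical classifier that responds only to horizontal gradients (bin 0).
weights = [1.0] + [0.0] * 8
hits = detect(image, window=4, stride=4, weights=weights, bias=-0.5)
print([(top, left) for top, left, _ in hits])
```

Because the descriptor is fixed before the classifier ever sees the data, nothing in the classification stage can recover information that the histogram discards, which is the separation problem the paragraph describes.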

Before convolutional neural networks were introduced, DPM (Deformable Parts Model) (Felzenszwalb, Girshick, Mcallester, & Ramanan, 2010) was a classical algorithm for target detection, whose core idea is its combination with an SVM (Zhu, Zheng, & Savvides, n.d.). DPM achieves good results on some detection tasks. The CNN (Convolutional Neural Network) in deep learning combines artificial neural networks with convolution operations to recognize a wide variety of target models, and it is robust to a certain degree of distortion and deformation. In addition, it uses sparse connections and weight sharing, which greatly reduce the number of parameters compared with a traditional neural network. This is also why many detection methods and models based on convolutional neural networks have achieved significant results. In 2014, the R-CNN algorithm (Girshick et al., 2014), proposed by Girshick, introduced CNNs to the field of target detection for the first time, with results far better than the traditional approach. After R-CNN, other classical methods followed, including SPPNet (He, Zhang, Ren, & Sun, 2015), Faster R-CNN (Ren et al., 2015), R-FCN (Dai, Li, He, & Sun, 2016), FPN (Lin et al., 2016), Mask R-CNN (He et al., 2017), YOLO (Redmon, Divvala, Girshick, & Farhadi, 2016), SSD (Liu et al., 2016) and YOLOv2 (Redmon & Farhadi, 2016). These algorithms can be divided into two categories: deep learning target detection algorithms based on candidate regions and deep learning target detection algorithms based on regression.
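The effect of sparse connections and weight sharing on the parameter count can be made concrete with a small calculation. The layer sizes below (a 32x32x3 input, 16 output channels, a 3x3 kernel) are illustrative assumptions and are not taken from the article; the sketch only compares a fully connected layer with a convolutional layer that produces an output of the same spatial size.

```python
def dense_params(in_h, in_w, in_c, out_units):
    """Fully connected layer: every output unit has its own weight
    for every input value, plus one bias per output unit."""
    return in_h * in_w * in_c * out_units + out_units

def conv_params(kernel_h, kernel_w, in_c, out_c):
    """Convolutional layer: each output channel owns one small kernel
    that is reused at every spatial position, plus one bias."""
    return kernel_h * kernel_w * in_c * out_c + out_c

# Assumed example: 32x32 RGB input mapped to a 32x32x16 output.
dense = dense_params(32, 32, 3, 32 * 32 * 16)
conv = conv_params(3, 3, 3, 16)
print(dense, conv, dense // conv)
```

Under these assumptions the fully connected layer needs 50,348,032 parameters while the convolutional layer needs 448, because each kernel sees only a 3x3 neighbourhood (sparse connection) and the same kernel is applied everywhere (weight sharing).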

The deep learning target detection algorithm based on candidate regions solves two problems of traditional target detection. First, the region selection strategy based on a sliding window is untargeted, has high time complexity and produces redundant windows. Second, hand-designed features are not robust to variations in the diversity of targets. Specifically, because the possible positions of the targets are pre-selected according to texture and edge information in the image, the method can achieve a high recall rate with fewer windows. This improvement reduces the time complexity of the subsequent operations and yields candidate windows of higher quality than those of a sliding window.
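The trade-off between window count and recall can be sketched as follows. This is a rough illustration under stated assumptions: the image size, window sizes, stride and the 0.5 IoU threshold are example values, and the proposal boxes are hypothetical hand-written coordinates, not the output of any real region proposal method.

```python
def count_sliding_windows(img_w, img_h, window_sizes, stride):
    """Number of windows an exhaustive sliding-window search evaluates."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w <= img_w and win_h <= img_h:
            total += ((img_w - win_w) // stride + 1) * ((img_h - win_h) // stride + 1)
    return total

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def recall(ground_truth, proposals, threshold=0.5):
    """Fraction of ground-truth boxes covered by at least one proposal."""
    if not ground_truth:
        return 0.0
    covered = sum(
        1 for gt in ground_truth
        if any(iou(gt, p) >= threshold for p in proposals)
    )
    return covered / len(ground_truth)

# Assumed 640x480 image, three square window sizes, stride of 8 pixels.
n_windows = count_sliding_windows(640, 480, [(64, 64), (128, 128), (256, 256)], 8)
print(n_windows)

# Hypothetical ground truth and proposals: one target is covered, one is missed.
ground_truth = [(0, 0, 10, 10), (50, 50, 60, 60)]
proposals = [(1, 1, 10, 10), (100, 100, 110, 110)]
print(recall(ground_truth, proposals))
```

With these example settings the exhaustive search evaluates 8,215 windows per image, whereas a candidate-region method is judged by how high `recall` stays when only a small set of proposals is passed on to the later stages.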
