Introduction
Picard (2000) predicted that affective computing would be an important direction for future artificial intelligence research. Human facial expression recognition is an important task in affective computing. The American psychologists Ekman and Friesen (1971) defined seven categories of basic facial expression: Happy, Sad, Angry, Fear, Surprise, Disgust and Neutral. Pentland and Mase (1991) made the first attempt to use the optical flow method to determine the direction of movement of facial muscles. They then extracted feature vectors to automatically recognize four expressions (Happy, Angry, Disgust and Surprise) and reached an accuracy of nearly 80%.
Hinton and Salakhutdinov (2006) published an article in “Science” that opened the door to the deep learning era. Hinton suggested that a neural network with multiple hidden layers has a strong ability to learn features: by obtaining representations of the original data at different levels of abstraction, it can improve the accuracy of prediction and classification. So far, deep learning has achieved good performance in speech recognition, collaborative filtering, handwriting recognition, computer vision and many other fields (Chen & Lin, 2014).
The concept of the Convolutional Neural Network (CNN) was presented by Yann LeCun (1989) as a neural network architecture composed of two kinds of basic layers: convolutional layers (C layers) and subsampling layers (S layers). For many years afterwards, however, CNNs saw no major breakthrough. One of the main reasons was that CNNs could not achieve ideal results on large images. This changed in 2012, when Hinton and his students used a deeper CNN to achieve the best results reported at the time on ImageNet. Since then, CNN-based image recognition has received increasing attention.
In this paper, we present a method for facial expression recognition based on a deep CNN. First, we implement face detection using Haar-like features and histogram equalization. Then we construct a four-layer CNN architecture consisting of two convolutional layers and two subsampling layers (C-S-C-S). Finally, a Softmax classifier is used for multi-class classification. The rest of the paper is organized as follows. Section 2 introduces the whole CNN-based system, including the input module, the image pre-processing module, the recognition algorithm module and the output module. In Section 3, we simulate and evaluate the recognition performance of the proposed system under different factors such as network structure, learning rate and pre-processing. Finally, a conclusion is drawn.
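To make the C-S-C-S idea concrete, the following is a minimal single-channel sketch in plain Python, not the implementation used in this paper. The 32x32 input size, the 5x5 kernels, the 2x2 mean-pooling window, the tanh activation and the random weights are all assumptions chosen for illustration; only the layer order, the Softmax output and the seven expression labels come from the text above.

```python
import math
import random

# Seven expression categories named in the Introduction.
EXPRESSIONS = ["Happy", "Sad", "Angry", "Fear", "Surprise", "Disgust", "Neutral"]

def conv_layer(image, kernel):
    """C layer: valid 2-D correlation followed by tanh (activation is an assumption)."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            math.tanh(sum(image[r + i][c + j] * kernel[i][j]
                          for i in range(k) for j in range(k)))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

def subsample(fmap, size=2):
    """S layer: non-overlapping mean pooling over size x size windows."""
    out_rows = len(fmap) // size
    out_cols = len(fmap[0]) // size
    return [
        [
            sum(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size)) / (size * size)
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel1, kernel2, weights):
    """Run C-S-C-S, flatten, apply a linear layer, and return class probabilities."""
    c1 = conv_layer(image, kernel1)
    s1 = subsample(c1)
    c2 = conv_layer(s1, kernel2)
    s2 = subsample(c2)
    features = [value for row in s2 for value in row]
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    shapes = [(len(m), len(m[0])) for m in (c1, s1, c2, s2)]
    return softmax(scores), shapes

# Hypothetical input and untrained random parameters, for shape checking only.
rng = random.Random(0)
image = [[rng.random() for _ in range(32)] for _ in range(32)]
kernel1 = [[rng.uniform(-0.1, 0.1) for _ in range(5)] for _ in range(5)]
kernel2 = [[rng.uniform(-0.1, 0.1) for _ in range(5)] for _ in range(5)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(25)] for _ in range(len(EXPRESSIONS))]

probs, shapes = forward(image, kernel1, kernel2, weights)
# For a 32x32 input the feature maps are 28x28, 14x14, 10x10 and 5x5.
print(shapes)
print(EXPRESSIONS[probs.index(max(probs))])
```

Because the weights are random and untrained, the predicted label is meaningless; the sketch only shows how each C layer shrinks the feature map by the kernel size minus one, how each S layer halves it, and how Softmax turns the final scores into a probability distribution over the seven expressions.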
Facial Expression Recognition System Based on a CNN Structure
In this section, we introduce the whole system based on a CNN structure and describe the important details of its modules, including face detection, image pre-processing and the recognition algorithm.