Bi-Model Engagement Emotion Recognition Based on Facial and Upper-Body Landmarks and Machine Learning Approaches

Haifa F. Alhasson, Ghada M. Alsaheel, Noura S. Alharbi, Alhatoon A. Alsalamah, Joud M. Alhujilan, Shuaa S. Alharbi
Copyright: © 2023 |Pages: 13
DOI: 10.4018/IJESMA.330756

Abstract

Customer satisfaction can be measured using facial expression recognition. The current generation of artificial intelligence systems depends heavily on facial features such as the eyebrows, eyes, and forehead. This dependence is a limitation, as people generally prefer to conceal their genuine emotions. Because body gestures are difficult to conceal and can convey a more detailed and accurate emotional state, the authors incorporate upper-body gestures as an additional feature that improves the accuracy of the predicted emotion. This work uses an ensemble machine-learning model that integrates support vector machines, random forest classifiers, and logistic regression classifiers. The proposed method detects emotions from facial expressions and upper-body movements; experimental evaluation shows it to be effective, with an accuracy of 97% on the EMOTIC dataset and 99% on the MELD dataset.
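The abstract's ensemble of an SVM, a random forest, and a logistic regression classifier can be sketched as a soft-voting ensemble. This is a minimal illustration, not the authors' code: the synthetic features standing in for facial and upper-body landmark vectors, the hyperparameters, and the soft-voting scheme are all assumptions.

```python
# Hypothetical sketch of the described ensemble; features, hyperparameters,
# and the voting scheme are assumptions, not the paper's implementation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for facial + upper-body landmark feature vectors.
X, y = make_classification(n_samples=300, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",  # average the three models' predicted class probabilities
)
ensemble.fit(X_tr, y_tr)
print(round(ensemble.score(X_te, y_te), 2))
```

Soft voting averages class probabilities rather than counting hard votes, which lets a confident minority model outweigh two uncertain ones.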

Assessing emotions within the context of experience satisfaction is crucial, given that leisure satisfaction encompasses diverse cognitive, physical, and emotional experiences, making it a multidimensional construct (Mansfield et al., 2020). Emotions are reactions to personally relevant stimuli and can be expressed at three levels: phenomenology (the subjective experience of the emotion), behavior (the actions associated with the emotion), and physiology (the bodily changes that occur during the emotional response). Emotions can therefore be measured through behavioral observation to capture actions and through physiological measurement to capture bodily changes (Ekman, 2016).

Bi-Modal Emotion Recognition

While advancements in emotion recognition have been substantially propelled by machine learning, a key hurdle remains unresolved. Most techniques employed in computer vision for detecting human emotions predominantly rely on facial expressions. It is common for individuals to mask their genuine emotions; however, body gestures, which are more challenging to hide, can provide a more comprehensive and precise depiction of emotional states.

BlazePose (Bazarevsky et al., 2020) is a lightweight convolutional neural network architecture for human pose estimation, adapted for real-time inference on mobile devices and running at over 30 frames per second. It sidesteps a limitation of Non-Maximum Suppression (NMS) reported in the literature, in which multiple ambiguous candidate boxes for the same person all satisfy the intersection-over-union (IoU) threshold. Instead, its detector predicts a bounding box around a relatively rigid body part. The authors build on BlazeFace (Bazarevsky et al., 2019), BlazeHand (Bazarevsky et al., 2020), and CoCo (Kreiss et al., 2019).
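The NMS failure mode described above hinges on the IoU score between candidate boxes. A minimal sketch of the standard IoU computation (the box format and threshold are generic conventions, not specifics of BlazePose):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two heavily overlapping person proposals: their high IoU would cause NMS
# to suppress one of them, which is ambiguous when both are plausible.
print(iou((0, 0, 10, 10), (2, 2, 12, 12)))
```

When two detections of the same person exceed the IoU threshold, NMS keeps only the higher-scoring one; anchoring the detector on a rigid part such as the face reduces this ambiguity because its box is more repeatable across frames.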
