
Extraction of Emotion From Spectrograms: Approaches Based on CNN and LSTM

Cecile Simo Tala

Source Title: Global Perspectives on the Applications of Computer Vision in Cybersecurity

DOI: 10.4018/978-1-6684-8127-1.ch005


Abstract

Speech is the main means of communication between humans and an efficient way to exchange information around the world. Speech emotion recognition (SER) is an active research field that plays a crucial role in many applications. SER is used in several areas of life, notably in the security field for the detection of fraudulent conversations. A pre-processing step was applied to the audio recordings to reduce noise and remove silence. The authors applied two deep learning approaches, LSTM and CNN, to determine which is better suited to the problem. They transformed the pre-processed audio into spectrograms as input to the CNN. They then applied singular value decomposition (SVD) to these images to extract the feature matrices used as input to the LSTM. The proposed models were trained on these data and then tested on emotion prediction. Two databases, RAVDESS and EMO-DB, were used to evaluate the approaches. The experimental results demonstrated the effectiveness of the model.
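The pipeline described in the abstract (silence removal, spectrogram, SVD features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the energy threshold, the frame and FFT sizes, the rank, and the choice to scale the time-direction singular vectors by the singular values are all assumptions made for illustration, and the noise reduction step is omitted. The input is a synthetic tone, not a recording from RAVDESS or EMO-DB.

```python
import numpy as np

def trim_silence(signal, frame_len=400, threshold_ratio=0.1):
    """Drop frames whose RMS energy is below a fraction of the loudest frame's RMS.
    frame_len and threshold_ratio are illustrative values, not from the chapter."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    keep = rms >= threshold_ratio * rms.max()
    return frames[keep].reshape(-1)

def log_spectrogram(signal, n_fft=512, hop=128):
    """Log-magnitude short-time Fourier transform, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T

def svd_features(spectrogram, rank=8):
    """Assumed feature layout: one row per time frame, one column per singular
    component, so the result can be read as a sequence by a recurrent model."""
    u, s, vt = np.linalg.svd(spectrogram, full_matrices=False)
    return vt[:rank].T * s[:rank]

# Synthetic input: 0.5 s near-silence, 1 s of a 440 Hz tone, 0.5 s near-silence.
sample_rate = 16000
rng = np.random.default_rng(0)
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
silence = np.zeros(sample_rate // 2)
audio = np.concatenate([silence, tone, silence]) + rng.normal(0, 0.001, 2 * sample_rate)

trimmed = trim_silence(audio)    # only the tone segment survives
spec = log_spectrogram(trimmed)  # image-like input for a CNN
feats = svd_features(spec)       # sequence-like input for an LSTM

print(trimmed.shape, spec.shape, feats.shape)
```

In this sketch the spectrogram would be passed to the CNN as a 2-D image, while the lower-dimensional SVD matrix would be passed to the LSTM as a sequence over time frames. How the chapter actually arranges the SVD output for the LSTM is not stated in the abstract.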


Introduction

Societies are based on communication that follows a set of rules allowing everyone to understand and be understood. This communication can take the form of voice, text, or gesture. These forms are intended to express thought: to represent it using words chosen from a lexicon, gestures rooted in a culture, and sounds articulated to form syllables, words, and sentences. This work is situated in the context of emotion recognition during telephone conversations between two people. In this type of interaction, the telephone is the only channel of communication, so the voice is the only channel of expression and the conversation itself becomes the object of study. The voice carries a great deal of information about the speaker, such as emotions, age, identity, and gender, as well as physiological disorders experienced during oral expression. The extraction of this information has given rise to several areas of speech research, in particular the recognition of emotions from the voice. Speech is the most widespread means of exchanging information between human beings all over the world (Kwon, 2021), and attention should be paid to it. However, the most significant factor in human speech is emotion (Nardelli et al., 2015), which can be analyzed for judgments about humans and other expressions.

Emotions, or emotional states, are fundamental for humans insofar as they permeate the most varied areas of life, consciously and unconsciously. They influence our perceptions, our behaviors, our mental states, and our daily activities such as communication, learning, and decision-making. The importance of emotions in the learning process has been known for a long time (Kerkeni et al., 2020). Nowadays, the recognition of emotions in a speech signal is one of the fastest-emerging areas of research and plays an important role in real-time applications, where researchers have developed methods to detect emotions from a voice signal (Kwon, 2021; Mustaqeem and Kwon, 2019; Anvarjon et al., 2020). It paves the way for human-computer interaction (HCI) and plays an important role in many services, such as call centers that track customer emotions to provide better service (Gupta et al., 2007). In the medical field, speech-based diagnostic systems are developed to assess the extent of depression and distress (Rana et al., 2019), and some emotion recognition systems are designed for healthcare centers to monitor the speaker's depressive state in bipolar patients (Badshah et al., 2019; Wang et al., 2015). There are many other applications, such as multimedia search systems (Roberts et al., 2012), forensic science (Vögel et al., 2018), and smart car systems, which aim to improve their performance by using an effective emotion recognition system. People are increasingly dependent on machines. There are several approaches to detecting an emotion from an audio, video, or text file. This raises the question: can emotion recognition models be adapted to the audio of telephone conversations to detect a set of emotions?
