A Probabilistic Deep Learning Approach for Twitter Sentiment Analysis

Mostefai Abdelkader
Copyright: © 2020 | Pages: 14
DOI: 10.4018/IJDAI.2020070102

Abstract

In recent years, increasing attention has been paid to sentiment analysis on microblogging platforms such as Twitter. Sentiment analysis refers to the task of detecting whether a textual item (e.g., a tweet) contains an opinion about a topic. This paper proposes a probabilistic deep learning approach for sentiment analysis. The deep learning model used is a convolutional neural network (CNN). The main contribution of this approach is a new probabilistic representation of the text to be fed as input to the CNN. This representation is a matrix that stores, for each word in the message, the probability that it belongs to the positive class and the probability that it belongs to the negative class. The proposed approach is evaluated on four well-known datasets: HCR, OMD, STS-gold, and a dataset provided by the SemEval-2017 Workshop. The results of the experiments show that the proposed approach competes with state-of-the-art sentiment analyzers and has the potential to detect sentiments in textual data effectively.

1. Introduction

It is very important for businesses and organizations to know the opinions, thoughts, emotions, or sentiments that people express in texts on microblogging platforms such as Twitter, since this knowledge supports better decision making (Giachanou et al. 2016). This activity is commonly referred to as sentiment analysis (or opinion mining), and it has been an active area of research in recent years.

Sentiment analysis is a research field in the area of text mining and natural language processing that aims at analyzing texts expressing people's opinions, sentiments, and emotions about a topic (e.g., services, individuals, products, issues) (Giachanou et al. 2016; Abbasi et al. 2008; Nasukawa & Yi, 2003). If the analyzed text holds a sentiment, it is viewed as polar (positive or negative); otherwise, it is viewed as neutral (Zimbra, 2018). The analysis process consists of automatically predicting this polarity.

Nowadays, large amounts of textual data expressing people's opinions and thoughts on products, services, or any other topic can be found on social media platforms such as Twitter, a widely known microblogging platform. Thus, it is no longer necessary to collect such opinions through traditional, costly approaches such as surveys.

This situation gives organizations a valuable opportunity to learn about the quality of their services and products by analyzing users' texts to infer their opinions (Zimbra et al. 2015; Forman et al. 2008).

For organizations and businesses, tweets are viewed as a medium that contains evaluations of the workings of business and society (Jansen et al. 2009; Gleason, 2013). For example, organizations conduct sentiment analysis to study product sales (Rui et al. 2013), stock market movements (Bollen et al. 2011), and so on.

Unfortunately, while it is very easy to collect textual data from Twitter that express users' opinions on a topic, it is still very difficult to analyze them to extract information about people's sentiments. The accuracies achieved by the proposed techniques have been found to be under 70%, which limits their applicability in real situations (Zimbra, 2018; Minghui et al. 2020; Kiritchenko et al. 2014; Santos et al. 2014; Hassan et al. 2017). The sources of difficulty are tweet length, the use of abbreviations, spelling errors, and the presence of special characters (Hassan et al. 2017; Kiritchenko et al. 2014).

So far, many approaches have been proposed to infer sentiments from textual data. These approaches fall into three classes: machine learning approaches (which include deep learning ones), lexicon-based approaches, and hybrid approaches (Gupta & Joshi, 2019; Musto et al. 2014; Zimbra, 2018; Giachanou et al. 2016; Turney, 2002; Kim & Hovy, 2004; Tang et al. 2015).

Deep learning approaches, a class of neural networks with multiple hidden layers such as the convolutional neural network (CNN), have been found effective in many areas, such as computer vision and natural language processing (Graves et al. 2013; Kalchbrenner et al. 2014). Following this success, many deep learning approaches have been proposed for the sentiment analysis domain (Zhang et al. 2018). The performance of these deep learning models depends on many hyperparameters, an important one being the choice of the word embedding model, which provides a vector representation of the data (Bengio et al. 2003; Zhang et al. 2018; Kim, 2014).

This paper proposes a probabilistic deep learning approach for sentiment analysis. The deep learning model used is a convolutional neural network (CNN). The main contribution of this work is a new probabilistic representation of the textual data (a tweet) to be fed to the CNN. This representation is a matrix composed of two vectors. The first vector stores, for each word in the message, the probability that it belongs to the positive class. The second stores, for each word in the message, the probability that it belongs to the negative class. The system is evaluated on four well-known datasets: HCR, OMD, STS-gold, and a dataset provided by the SemEval-2017 Workshop (subtask B). The results of the experiments show that the proposed approach can compete with state-of-the-art tools for sentiment analysis.
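
The two-vector representation described above can be illustrated with a minimal Python sketch. The preview does not say how the per-word probabilities are estimated, so everything specific here is an assumption for illustration only: the function names `word_class_probabilities` and `tweet_to_matrix`, whitespace tokenization, the smoothed frequency estimate with parameter `alpha`, the neutral value 0.5 for unseen words, and zero-padding to a fixed length `max_len`. It is not the author's implementation.

```python
from collections import Counter


def word_class_probabilities(tweets, labels, alpha=1.0):
    """Estimate (P(positive | word), P(negative | word)) from labeled tweets.

    Assumption: a smoothed relative-frequency estimate; the paper's actual
    estimator is not given in the preview.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in zip(tweets, labels):
        tokens = text.lower().split()  # assumed tokenizer: whitespace split
        if label == "positive":
            pos_counts.update(tokens)
        else:
            neg_counts.update(tokens)

    probs = {}
    for word in set(pos_counts) | set(neg_counts):
        p = pos_counts[word] + alpha
        n = neg_counts[word] + alpha
        probs[word] = (p / (p + n), n / (p + n))
    return probs


def tweet_to_matrix(text, probs, max_len=8):
    """Build a 2 x max_len matrix: row 0 positive, row 1 negative probabilities.

    Assumptions: unseen words get (0.5, 0.5); short tweets are zero-padded
    and long tweets are truncated to max_len.
    """
    tokens = text.lower().split()[:max_len]
    pos_row = [probs.get(t, (0.5, 0.5))[0] for t in tokens]
    neg_row = [probs.get(t, (0.5, 0.5))[1] for t in tokens]
    pad = max_len - len(tokens)
    return [pos_row + [0.0] * pad, neg_row + [0.0] * pad]


# Toy training data, invented purely for illustration.
train_tweets = ["great phone love it", "terrible service hate it"]
train_labels = ["positive", "negative"]

probs = word_class_probabilities(train_tweets, train_labels)
matrix = tweet_to_matrix("love it now", probs, max_len=4)
```

Under these assumptions, each column of `matrix` corresponds to one word position in the tweet, and the fixed width `max_len` gives every tweet the same input shape, which a CNN requires.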
