Machine Learning in Sentiment Analysis Over Twitter: Synthesis and Comparative Study

Kadda Zerrouki
DOI: 10.4018/978-1-6684-6303-1.ch048

Abstract

Social networks are a primary resource for gathering information about people's opinions and sentiments on different topics, since people spend hours on social media every day and share their opinions there. Twitter is a platform widely used to express opinions and display sentiments on different occasions. The task of sentiment analysis (SA) is to label the opinions expressed in a given piece of text with categories such as positive and negative. Another task is to decide whether a given text is subjective, expressing the writer's opinions, or objective. These tasks have been performed at different levels of analysis, ranging from the document level to the sentence and phrase level. A further task is aspect extraction, which originated from aspect-based sentiment analysis at the phrase level. All these tasks fall under the umbrella of SA. In recent years, a large number of methods, techniques, and enhancements have been proposed for SA across these tasks and levels. Sentiment analysis is an approach to analyzing data and retrieving the sentiment it embodies. Twitter sentiment analysis applies sentiment analysis to data from Twitter (tweets) in order to extract the sentiments conveyed by the user. Over the past decades, research in this field has grown consistently. The reason is the challenging format of tweets, which makes processing difficult. Tweets are very short, which introduces a whole new dimension of problems such as the use of slang and abbreviations. The chapter discusses in detail three supervised machine learning algorithms (naïve Bayes, k-nearest neighbor (KNN), and decision tree) and compares their overall accuracy, precision, recall, F-measure, number of tweets correctly classified, number of tweets incorrectly classified, and execution time.
Chapter Preview

One of the first studies on the classification of polarity in tweets was conducted by Alec Go, Richa Bhayani, and Lei Huang (Go, 2009). The authors conducted a supervised classification study on tweets in English, using emoticons (e.g. “:)”, “:(”, etc.) as markers of positive and negative tweets.
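
The idea of using emoticons as noisy labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the cited study: the emoticon sets and the function name `label_by_emoticon` are hypothetical, and tweets with no emoticon or with mixed emoticons are simply left unlabeled.

```python
from typing import Optional

# Assumed emoticon sets for illustration; the text above only names ":)" and ":(".
POSITIVE_EMOTICONS = (":)", ":-)", ":D")
NEGATIVE_EMOTICONS = (":(", ":-(")


def label_by_emoticon(tweet: str) -> Optional[str]:
    """Return 'positive' or 'negative' based on emoticons, or None if unclear."""
    has_pos = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_neg = any(e in tweet for e in NEGATIVE_EMOTICONS)
    # No emoticon at all, or both kinds present: the tweet carries no usable label.
    if has_pos == has_neg:
        return None
    return "positive" if has_pos else "negative"


tweets = ["great day :)", "missed my bus :(", "no emoticon here", "mixed :) :("]
labeled = [(t, label_by_emoticon(t)) for t in tweets]
```

Labels obtained this way are noisy, since an emoticon does not always match the sentiment of the text, but they allow a training corpus to be built without manual annotation.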

Read (2005) employed this method to generate a corpus of positive tweets, containing positive emoticons such as “:)”, and negative tweets, containing negative emoticons such as “:(”. Subsequently, the authors applied different supervised approaches (SVM, Naïve Bayes, and Maximum Entropy) to various sets of features and concluded that the simple use of unigrams leads to good results, which can be slightly improved by combining unigrams and bigrams.
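
To make the feature sets concrete, the following minimal Python sketch extracts unigram counts and, optionally, bigram counts from a tweet. The whitespace tokenizer and the names `tokenize` and `ngram_features` are assumptions made for illustration; the studies cited above may have tokenized and weighted features differently.

```python
from collections import Counter
from typing import List


def tokenize(tweet: str) -> List[str]:
    # Simplistic lowercase whitespace tokenizer (an assumption for illustration).
    return tweet.lower().split()


def ngram_features(tweet: str, use_bigrams: bool = True) -> Counter:
    """Count unigrams and, optionally, bigrams of adjacent tokens."""
    tokens = tokenize(tweet)
    features = Counter(tokens)  # unigram counts
    if use_bigrams:
        # Pair each token with its successor to form bigrams such as "not good".
        features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


unigrams_only = ngram_features("not good at all", use_bigrams=False)
combined = ngram_features("not good at all")
```

With unigrams alone, "not good at all" contributes the feature "good", which on its own looks positive. The combined set also contains the bigram "not good", which is one plausible reason why adding bigrams can help a classifier.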
