Aspect Based Sentiment Analysis of Unlabeled Reviews Using Linguistic Rule Based LDA

Nikhlesh Pathik, Pragya Shukla
Copyright: © 2022 |Pages: 19
DOI: 10.4018/JCIT.20220701.oa3

Abstract

In this digital era, people are keen to share their feedback about products, services, and current issues on social networks and other platforms. A fine-grained analysis of this feedback can give a clear picture of what people think about a particular topic. This work proposes an almost unsupervised Aspect Based Sentiment Analysis approach for textual reviews. Latent Dirichlet Allocation, along with linguistic rules, is used for aspect extraction. Aspects are ranked based on their probability distribution values and then clustered into predefined categories using frequent terms with domain knowledge. The SentiWordNet lexicon is used for sentiment scoring and classification. Experiments with two popular datasets show the superiority of our strategy compared to existing methods. It achieves 85% average accuracy when tested on manually labeled data.

Introduction

Nowadays, people are very expressive on the web. Due to the exponential growth in user feedback data, it has become necessary for every product and service provider to mine this feedback. People regularly share their views on current activities on Twitter or similar platforms. A fine-grained analysis of these tweets or reviews can give a clear picture of what people think about a particular topic. That is why aspect based sentiment analysis has gained popularity, and a lot of work has been done in this area in the last decade. Still, it is an active research area, and unsupervised approaches in particular require improvement (Yue et al., 2019; Do et al., 2019).

The primary difference between sentiment analysis and aspect specific sentiment analysis is that the former only detects the sentiment of an overall text. The latter investigates each sentence of the text to find the various aspects and then determines the emotion associated with each of them. In other words, instead of evaluating the overall sentiment of a text, an aspect based approach allows us to associate specific opinions with various aspects or features of a product or service. Aspect based analysis looks more closely at the information behind a text, which is why its results are more detailed and accurate.
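The contrast between the two levels of analysis can be illustrated with a minimal sketch. The lexicon values, the aspect terms, and the function names below are hypothetical and chosen only for illustration; they are not taken from SentiWordNet or from the method proposed in this paper.

```python
# Toy polarity lexicon (hypothetical values, not taken from SentiWordNet).
LEXICON = {"great": 1.0, "excellent": 1.0, "good": 0.5,
           "poor": -1.0, "terrible": -1.0, "slow": -0.5}

# Hypothetical aspect terms for a phone review.
ASPECTS = {"battery", "screen", "camera"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def overall_sentiment(review):
    """Document-level score: sum of lexicon values over all tokens."""
    return sum(LEXICON.get(tok, 0.0) for tok in tokenize(review))


def aspect_sentiment(review):
    """Aspect-level scores: a sentence's score is credited to each aspect it mentions."""
    scores = {}
    for sentence in review.split("."):
        tokens = tokenize(sentence)
        score = sum(LEXICON.get(tok, 0.0) for tok in tokens)
        for aspect in ASPECTS.intersection(tokens):
            scores[aspect] = scores.get(aspect, 0.0) + score
    return scores


review = "The camera is excellent. The battery is terrible."
print(overall_sentiment(review))  # 0.0
print(aspect_sentiment(review))   # {'camera': 1.0, 'battery': -1.0}
```

In this toy review the positive and negative opinions cancel at the document level, while the aspect-level scores keep them apart.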

Consider the example of COVID public sentiment analysis based on social network data. We need to analyze the various issues or aspects related to COVID and the public sentiment about each of them. Here, overall polarity may not be a good indicator; we require the sentiment about a particular issue. The same analysis is required for business and service related feedback or opinions. Because massive feedback data is generated regularly, unsupervised and semi-supervised approaches are gaining popularity.

Topic modeling is an unsupervised NLP technique that represents a group of text documents with several topics that best explain the underlying information in each document. It is similar to clustering, with one difference: instead of numerical features, it works with a collection of words. These words need to be grouped so that each group represents a topic in a document. Latent Dirichlet Allocation (LDA) is the most well-known method for modeling thematic information, i.e., topics, from a document collection. It is an unsupervised learning approach that views documents as bags of words. LDA is used on large collections of documents to identify topics (Blei et al., 2003). It is helpful for Search Engine Optimization, automation of customer service, and any other instance where knowing the theme of documents is essential. Its role is to describe the topics that best represent a collection of documents. These topics emerge during the topic modeling process and are therefore called latent.
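To make the idea of grouping words into latent topics concrete, the following is a minimal sketch of LDA fitted with collapsed Gibbs sampling, using only the Python standard library. It is an illustration under stated assumptions, not the optimized LDA configuration used in this work: the function name `lda_gibbs`, the hyperparameter values `alpha` and `beta`, the iteration count, and the toy documents are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return (theta, phi, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]        # n(d, k)
    topic_word = [[0] * V for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                      # n(k)

    # Randomly assign a topic to every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Convert counts to smoothed probability distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + V * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi, vocab


# Hypothetical, already tokenized reviews (bag-of-words view).
docs = [
    ["battery", "life", "battery", "charge"],
    ["screen", "display", "bright", "screen"],
    ["battery", "charge", "life"],
    ["display", "screen", "bright"],
]
theta, phi, vocab = lda_gibbs(docs, num_topics=2)
for k, dist in enumerate(phi):
    # Rank words within each topic by their probability.
    top = sorted(zip(dist, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

Here `theta` holds the per-document topic distributions and `phi` the per-topic word distributions. Sorting the words of a topic by probability is one simple way to rank candidate terms.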

The main contributions to this work include the following:

  1. An unsupervised aspect extraction approach using optimized LDA configuration and Parts of Speech (POS) rule for unlabeled reviews.
  2. Categorization of aspects, using very few domain words.
  3. Aspect specific analysis of sentiment using SentiWordNet (SWN).

The remainder of the paper is structured as follows: Section 2 sheds light on the latest work in the field. The background and intuition of LDA and SWN are described in Section 3. The methodology and proposed algorithms are explained in Section 4. Section 5 presents experimental details and results. The paper concludes with a summary and future directions in Section 6. In this paper, the words sentiment and opinion are used interchangeably, as are the words aspect and feature.

For this study, topic modeling based approaches to sentiment analysis are mainly considered. Some hybrid models based on deep neural networks and LSTM are also discussed. We focus on recent work in this field from the last 3-4 years.

A survey describes the present state of the art in sentiment analysis research, mostly for online reviews and social media data (Yue et al., 2019). Different deep learning based approaches are analyzed in detail, along with their performance issues (Do et al., 2019). LDA was presented by Blei et al. (2003), and even after almost two decades, its popularity in unsupervised topic extraction is still increasing.
