A Topic Modeling-Guided Framework for Aspect-Oriented Sentiment Analysis on Social Media

Social media platforms now reach more than half of the world's population, making social media one of the most data-rich domains. The sentiments expressed by social media users are significant for many purposes, such as identifying public opinion on a product or on a governmental policy. Companies in many domains use social media sentiment to gather customer feedback and improve their products and services. However, only a few attempts at aspect-based sentiment analysis have been reported in the sentiment analysis and opinion mining literature. This chapter proposes a framework for aspect-based sentiment analysis on social media using a topic modeling-powered approach. Experiments conducted on real-world datasets show that the proposed framework outperforms some existing works on aspect-oriented sentiment analysis.
Sentiment analysis techniques are now used in business and social domains to identify and understand human behaviour. The need to identify sentiments expressed on social media platforms such as social networks and discussion forums is also increasing. Sentiment analysis uses natural language processing to identify the opinion or view expressed about a topic or event. These opinions are subjective expressions, not facts. Text sentiment analysis has recently focussed on the vast amount of unstructured social media data available through online platforms. The application areas of sentiment analysis include product reviews, political opinions, law-making, and psychology (Alessia et al., 2015). For instance, product review data is now available on almost all e-commerce sites and improves the user experience by giving customers feedback on the available products. Continuous tracking of opinions on their products gives companies real-time feedback on market performance, which in turn helps them plan product and brand improvements. Sentiment analysis can also give companies actionable facts and figures that are essential for improving their online image. As online purchasing grows enormously, organizations in the e-commerce space compete to offer better customer experiences, and their sites provide innovative solutions such as product recommendations and product comparisons.

The data mined from social media platforms such as Twitter can provide valuable insights into global events like the COVID-19 pandemic (Boon-Itt & Skunkan, 2020). Moreover, these platforms enable feedback regarding user experience and opinions. Product-level feedback may not capture opinions on individual product features accurately, because of the diverse nature of reviews. Aspect-level sentiment analysis can be used in this scenario, as it analyzes sentiment at the level of individual aspects (Pontiki et al., 2016). Twitter data can be used to track healthcare-related issues such as the spread of diseases and the general public's awareness of health advisories. Topic modeling and sentiment analysis of tweets can be used to identify common discussion themes related to the pandemic. Sentiment can be classified at several levels: word, sentence, document, or aspect. Specific terms and phrases can be treated as polarity keywords that identify the sentiment they convey. Data-driven methods can find these keywords by learning the relationship between words and reviews. Sentiment analysis on social media such as Twitter is usually done to understand the polarity of users' opinions on various topics. For example, analysis of a currently trending topic, such as a newly released movie or album, can help to understand the overall sentiment towards it (Yue et al., 2019).
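The idea of polarity keywords applied at the aspect level can be illustrated with a minimal sketch. It is not the framework proposed in this chapter: the lexicon, the aspect list, and the function name aspect_polarity are hypothetical and chosen only for illustration, and the clause splitting is deliberately naive.

```python
import re

# Hypothetical toy lexicon and aspect list, for illustration only.
POLARITY = {"great": 1, "good": 1, "love": 1, "poor": -1, "bad": -1, "terrible": -1}
ASPECTS = {"battery", "screen", "camera"}

def aspect_polarity(review):
    """Sum keyword polarities per aspect, clause by clause."""
    scores = {}
    # Split on simple clause boundaries so that each polarity keyword
    # is attributed only to aspects mentioned in the same clause.
    for clause in re.split(r"[.,;!?]|\bbut\b", review.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        clause_score = sum(POLARITY.get(t, 0) for t in tokens)
        for t in tokens:
            if t in ASPECTS:
                scores[t] = scores.get(t, 0) + clause_score
    return scores

print(aspect_polarity("The battery is great, but the screen is terrible."))
# prints {'battery': 1, 'screen': -1}
```

A single product-level score for this review would be zero, which hides the fact that the two aspects received opposite opinions; this is the situation aspect-level analysis is meant to handle.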

Similarly, the sentiment or opinion about a product or company can be mined from Twitter data. Some previous works have performed sentiment analysis on tweets obtained by refining Twitter API queries. In such works, a tweet dataset is created by filtering for tweets that contain particular hashtags or come from particular user profiles (Agarwal et al., 2011). The challenge with raw tweets is that the data must be pre-processed before being passed to a sentiment analysis module. Raw tweets are limited by Twitter to 280 characters as of 2021 and usually contain emojis and URLs, which need to be removed. Identifying the semantics of such short texts and extracting opinions from them is a difficult task. Ambiguity is another issue when dealing with short tweets, and it affects the performance of the sentiment analysis model (Saif et al., 2012). Few tweet datasets are publicly available, and those that are were annotated by crowdsourcing: contributors were given the raw tweets and asked to classify each one as positive, negative, or neutral. Such sentiment-labelled tweet datasets can be used to train supervised models.
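The pre-processing step described above, removing URLs and emojis from raw tweets, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name clean_tweet is hypothetical, and the emoji pattern covers only a few common Unicode blocks rather than every emoji.

```python
import re

# Matches http(s) links and bare www. links.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Rough emoji coverage (an assumption, not exhaustive): common pictograph
# blocks, miscellaneous symbols and dingbats, and the variation selector.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")

def clean_tweet(text):
    """Remove URLs and emojis, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()

print(clean_tweet("Loving the new phone 😍 https://t.co/abc123"))
# prints Loving the new phone
```

Whether emojis should be dropped or mapped to sentiment-bearing tokens is a design choice; the sketch simply follows the removal described in the text.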

