A Survey of Automatic Text Classification Based on Thai Social Media Data

Tanatorn Tanantong, Monchai Parnkow
Copyright: © 2022 | Pages: 25
DOI: 10.4018/IJKSS.312578

Abstract

In the digital age, the volume of information on social media platforms such as Facebook, Twitter, and Instagram is increasing rapidly. This growth has prompted research on social media analytics, which aims to extract useful models or knowledge from the data. One of the most interesting topics in social media analytics is text classification on social media data. However, because social media data is diverse and complex in structure, text analysis and classification are challenging tasks that require specialized techniques. The objective of this review paper is to collect and review research on the automatic classification of Thai text on social media, presenting and explaining each stage of the text classification process. These stages include data collection and data sources, the amount of data and data preparation, feature extraction methods, automated text classification modeling methods, performance evaluation and measurement methods, the results of text classification, and a summary of the overall research trend on the topic.

Introduction

Social media networks, the main means of communication between people across the world, can meet the needs of their users in many ways (Wichitboonyarak, 2011; Al-Ibrahim & Alzamil, 2019; Saggu & Sinha, 2020; Verma et al., 2021). For example, people can express opinions, exchange information, or communicate via text, audio, and video (Chandran & Madhu, 2021; Tanantong et al., 2021). In addition, users can access social networks like Facebook, Twitter, and Instagram through tablets, smartphones, and notebooks (“Social Networks: An Introduction,” 2009). People can use social media to share their feelings or opinions through posts, comments, and short messages (Aggarwal & Zhai, 2012; Champihom, 2018). According to DataReportal (2020), 75% of the Thai population spends an average of three hours per day on social media. Thailand has the eighth largest number of Facebook users in the world, with approximately 41 million users (Positioning, 2020a). On Twitter, Thailand ranks 15th in the world, with approximately 5.7 million accounts (Positioning, 2020b). Popular social media platforms in Thailand include Facebook, Instagram, Pantip, YouTube, and a growing number of news sites. The number of users and the amount of information on social networks are increasing at a rapid pace (Ikonomakis et al., n.d.; Tampakas, 2005). Social media analytics can identify patterns or extract useful knowledge from the large amount of data on these platforms (AR_Group, 2020).

Text classification is an important and widely used social media analytics technique for understanding and automatically categorizing data. It assigns each message to one of a set of predefined groups or categories according to its content. Text classification on social media messages can be divided into two groups: classification by topic and sentiment analysis (Thongied & Netisopakul, 2017; MonkeyLearn, 2020b). Classification by topic divides text according to topics defined by the researcher. One example is news classification (Viriyavisuthisakul et al., 2015; Jotikabukkana et al., 2016), which sorts news into types such as economics, entertainment, and foreign news. Other research categorizes messages by legal topic, such as defamation and non-defamation (Rao & Spasojevic, 2016; Arreerard & Senivongse, 2018; Yuenyong et al., 2018). Research that classifies messages about goods and services can identify groups who will or will not buy or use a product (Chumwatana, 2015; Thetmueang & Chirawichitchai, 2017; Apichai et al., 2018). Sentiment analysis focuses on the feelings or intentions of the author and can be divided into two types. The first targets the author’s feelings toward a product, service, person, place, or event, which may be positive, negative, or neutral (Pugsee & Niyomvanich, 2015; Songram, 2016; Kuhmanee et al., 2017; Kunpattanasopon, 2018; Tanantong et al., 2020). The second targets the author’s intended emotion, such as happiness, sadness, loneliness, shock, or fear (Sarakit et al., 2015; Hemtanon & Kittiphattanabawon, 2019; Panawas, 2019).
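
To make the idea of assigning messages to predefined categories concrete, the following is a minimal sketch of a sentiment classifier. It is not the method of any study cited here. The choice of a multinomial Naive Bayes model, the use of character bigrams as features (convenient because Thai is written without spaces between words), the class and function names, and the four training messages are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams (no word segmentation needed)."""
    text = text.strip()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.label_counts = Counter()               # messages per label
        self.feature_counts = defaultdict(Counter)  # n-gram counts per label
        self.vocabulary = set()                     # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for gram in char_ngrams(text, self.n):
                self.feature_counts[label][gram] += 1
                self.vocabulary.add(gram)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(doc_count / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy messages; the labels are the predefined categories.
train_texts = [
    "อร่อยมาก",            # "very tasty"
    "ชอบมาก บริการดี",      # "like it a lot, good service"
    "แย่มาก",              # "very bad"
    "ไม่ชอบ บริการแย่",      # "don't like it, bad service"
]
train_labels = ["positive", "positive", "negative", "negative"]

clf = NaiveBayesTextClassifier(n=2).fit(train_texts, train_labels)
print(clf.predict("บริการแย่มาก"))  # "the service is very bad"
```

Replacing the sentiment labels with topic labels such as news types would turn the same sketch into a classifier by topic; only the predefined categories and the training messages change.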
