A Dynamic Strategy for Classifying Sentiment From Bengali Text by Utilizing Word2vector Model

Mafizur Rahman, Md. Rifayet Azam Talukder, Lima Akter Setu, Amit Kumar Das
DOI: 10.4018/JITR.299919
Today, around 230 million people use the Bengali (Bangla) language to communicate. These speakers are increasingly active on popular micro-blogging and social networking sites, where they share opinions and thoughts, and the vast majority of the articles there are written in Bengali. Thus, Bengali speakers express their emotions in Bangla through reviews, comments, and recommendations. Sentiment analysis helps determine the emotions people express on social media and other online platforms. Therefore, this study focuses on extracting emotion from Bengali text by utilizing the Word2vector Skip-Gram and Continuous Bag of Words (CBOW) models with a new Word to Index model, targeting three individual classes: happy, angry, and excited. The authors achieved the highest accuracy of 75% by utilizing the skip-gram model to classify those three types of emotions. This study also outperformed other existing works based on LSTM and CNN models on existing datasets.
Sentiment analysis is an active area of research in text mining that analyzes the attitudes expressed in text. It is a broad area of natural language processing (NLP) that determines the level or polarity of comments or opinions made by individuals or by a specific group of people (Al-Amin, Islam & Uzzal, 2017) (Tuhin, Paul, Nawrine, Akter & Das, 2019). It applies NLP, text analytics, and computational linguistics to identify and extract subjective information from source materials. Its main benefit today is that it supports well-informed decisions based on customer product reviews, ratings, and comments. The task is also known by several names, each with a slightly different focus: opinion extraction, subjectivity analysis, sentiment mining, emotion analysis, affect analysis, review mining, etc. (Tripto & Ali, 2018). Almost 250 million people communicate in Bangla globally, and it ranks sixth among the world's languages. It is one of the most important Indo-Iranian languages (Nabi, Altaf & Ismail, 2016) (Biswas, Das, 2019). It is also the second most widely spoken language in India. Around 160 million people live in Bangladesh, and Bangla is their first language (Soron, 2016) (Islam, Mubassira, Islam & Das, 2019). Of those 160 million citizens, 63.3 million use the Internet, and 26 million are very active on social media (Rahman, 2020). This number is growing rapidly. Such a large user base produces a fast-growing volume of recommendations, conversations, reviews, comments, ratings, and other social media content, most of it in Bengali, Banglish (a mixture of Bengali and English), or Romanized Bengali (Sumit, Hossan, Al Muntasir & Sourov, 2018). These users express their opinions on social media in the Bangla language.
Therefore, sentiment analysis is of great interest to researchers for analyzing behavior on social media platforms such as Facebook, Twitter, and Google+ (Hasan, Islam Mashrur-E-Elahi & Izhar, 2013) (Mumu, Munni & Das, 2021).

