1. Introduction
Social networking sites are internet-based applications that support communication for social and business purposes. These sites enable individual users to interact with others and to share personal interests, ideas, thoughts, or activities efficiently. One feature common to existing social networking sites is that user-generated content, in forms such as photos, videos, blogs, emoticons, or text posts, is openly shared. Text posts such as comments on or reviews of a target product contain sentiment words that can be extracted for further analysis to support purchase decisions (Goldsmith & Horowitz, 2006). Analysing opinion strength is very useful for product reviews because these comments come directly from consumers (Hu and Liu, 2004; Kim and Moon, 2011; Yoo et al., 2018) and can be utilised to support product design evaluations. As the volume of user-generated content on social networking sites increases drastically, sentiment analysis is emerging as a research topic for capturing and summarising text posts (Cambria et al., 2013).
Sentiment analysis, which processes text to identify opinionated information, can handle large volumes of text posts (Mali et al., 2016). It can be used to determine contextual polarity and to measure opinion strength by searching for sentiment words in a set of text posts. Sentiment word analysis has been applied successfully to summarise customer text posts for different product categories, including digital cameras, laptops, cell phones, books, and health care products (Hu and Liu, 2004; Bucur, 2015; Kim et al., 2018).
SentiWordNet (Guerini et al., 2013) is one of the commonly used lexical resources for determining polarity and opinion strength. This is done by counting the number of sentiment words or by summing the sentiment scores. However, merely counting sentiment words or summing sentiment scores may not be sufficient to classify a comment as positive or negative. Thus, an algorithm for categorising comments into different polarities to support decision making is needed.
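The two lexicon-based measures mentioned above, counting sentiment words and summing sentiment scores, can be illustrated with a minimal sketch. The lexicon, its score values, and the function names `count_polarity` and `sum_scores` are hypothetical and chosen for illustration only; they are not taken from SentiWordNet or from the method of this paper.

```python
# Toy lexicon: word -> (positive score, negative score).
# Illustrative values only, NOT real SentiWordNet entries.
TOY_LEXICON = {
    "good": (0.5, 0.0),
    "great": (0.75, 0.0),
    "nice": (0.5, 0.0),
    "poor": (0.0, 0.5),
    "terrible": (0.0, 1.0),
}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    return [w.strip(".,!?;:").lower() for w in text.split()]


def count_polarity(text, lexicon=TOY_LEXICON):
    """Return (number of positive words, number of negative words)."""
    pos = neg = 0
    for word in tokenize(text):
        if word in lexicon:
            p, n = lexicon[word]
            if p > n:
                pos += 1
            elif n > p:
                neg += 1
    return pos, neg


def sum_scores(text, lexicon=TOY_LEXICON):
    """Return (total positive score, total negative score)."""
    pos = neg = 0.0
    for word in tokenize(text):
        p, n = lexicon.get(word, (0.0, 0.0))
        pos += p
        neg += n
    return pos, neg


post = "Nice design, good price, but terrible battery."
print(count_polarity(post))  # (2, 1)
print(sum_scores(post))      # (1.0, 1.0)
```

In this toy case the word count suggests a positive post, whereas the summed scores are tied, which shows how the two measures can disagree on the same comment.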
K-means (MacQueen, 1967) is a simple approach to cluster analysis of multidimensional data. It aims to partition a set of data points into k clusters. Because it groups unlabelled data efficiently, k-means is proposed here for clustering text posts. Using sentiment analysis with the k-means algorithm, the text posts can be classified into three groups: positive, negative, and objective. K-means can also be employed to classify comments according to their corresponding design criteria.
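As a rough illustration of how k-means could group text posts into three clusters, the sketch below runs the standard assignment and update steps on two-dimensional points. It assumes that each post has already been reduced to a (positive score, negative score) pair; the data values, the choice of initial centroids, and the function names are hypothetical and are not taken from the procedure described later in the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centroids, max_iter=100):
    """Plain k-means with fixed initial centroids.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to points[i].
    """
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


# Hypothetical (positive score, negative score) pairs, one per text post.
posts = [
    (0.9, 0.1), (0.8, 0.0),    # mostly positive wording
    (0.1, 0.9), (0.0, 0.8),    # mostly negative wording
    (0.05, 0.05), (0.1, 0.0),  # little sentiment in either direction
]

# Assumption: seed one centroid in each region to keep the run deterministic.
labels, centroids = kmeans(posts, init_centroids=[posts[0], posts[2], posts[4]])
print(labels)  # [0, 0, 1, 1, 2, 2]
```

K-means itself only returns cluster indices. Deciding which cluster corresponds to the positive, negative, or objective group would require inspecting the centroids, for example by comparing their positive and negative coordinates.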
The proposed approach has two distinct features. First, it offers an immediately applicable instrument for evaluating sentiment scores and presenting the results of sentiment analysis. Second, it helps to identify the critical design criteria and opinion strengths from user-generated content without reading all the text posts. It also offers a practical and prompt means of collecting feedback from the customers' perspective. The results are valuable for decision-makers performing product analysis, especially when generating new design alternatives or revised models at the initial product development stage. The remainder of the paper is organised as follows. Section 2 describes related work on sentiment analysis and k-means for product evaluations. Section 3 outlines the procedure of the approach. Section 4 demonstrates the applicability of the approach using a case application. Section 5 presents the results and conclusion.