1. Introduction
Sentiment analysis is a computational method for identifying and categorizing opinions expressed in text, and it is one of the most active fields of research (Manning et al., 2008). Text obtained from different sources, such as user reviews and microblogs, expresses the user’s view of or attitude towards a particular product, event, etc. Sentiment analysis of short texts is challenging because they are contextually limited: decisions must be made on the basis of the small number of words the user provides. We treat sentiment analysis as a supervised learning problem in which each data element (a text review) is labeled either ‘positive’ or ‘negative’ (Pang, Lee, & Vaithyanathan, 2002). Machine learning models are trained with word embeddings on these datasets and are evaluated by their classification accuracy.
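To make the supervised setup concrete, the following is a minimal sketch: reviews labeled ‘positive’ or ‘negative’, a toy word-count classifier trained on those labels, and accuracy measured on held-out reviews. The tiny dataset and the classifier here are illustrative assumptions, not the paper’s models or data.

```python
from collections import Counter

# Hypothetical labeled reviews: (text, label) pairs, as in the supervised setting.
train = [("great movie, loved it", "positive"),
         ("terrible plot, boring", "negative"),
         ("wonderful acting", "positive"),
         ("awful and dull", "negative")]
test = [("loved the wonderful cast", "positive"),
        ("boring and awful", "negative")]

def tokens(text):
    return text.replace(",", " ").lower().split()

# "Training": count how often each word appears under each label.
counts = {"positive": Counter(), "negative": Counter()}
for text, label in train:
    counts[label].update(tokens(text))

def predict(text):
    # Score by which label's training reviews used these words more often.
    score = sum(counts["positive"][w] - counts["negative"][w] for w in tokens(text))
    return "positive" if score >= 0 else "negative"

# Evaluation: accuracy on the held-out reviews.
accuracy = sum(predict(t) == y for t, y in test) / len(test)
```

A real system replaces the word counts with learned word embeddings and a neural classifier, but the train/label/evaluate loop is the same.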
An artificial neural network, a computational model inspired by the structure and function of biological neural networks, has achieved great success over other machine learning techniques in sentiment analysis (Yoon, 2014; Socher, Pennington, Huang, Ng, & Manning, 2011; Xiong, Zhong, & Socher, 2002). Deep neural networks (DNNs) have recently achieved significant performance gains in a variety of NLP tasks, such as language modeling (Bengio, Ducharme, Vincent, & Jauvin, 2003), sentiment analysis (Socher et al., 2013), syntactic parsing (Collobert & Weston, 2008), and machine translation (Lee, Cho, & Hofmann, 2016). A recurrent neural network (RNN) is a special type of neural network in which connections between units form a directed cycle, allowing the model to exhibit dynamic temporal behavior. An RNN has an input layer, a variable number of hidden layers, and one output layer. Basic RNNs are networks of neuron-like nodes, each with a directed (one-way) connection to every other node, where each connection (synapse) has a modifiable real-valued weight. These weights are updated through successive training iterations. RNNs are widely used in handwriting recognition and speech recognition, and for text classification they have proven more effective in practice than other variations of neural networks.
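The recurrent computation described above can be sketched as a single step function: the new hidden state is produced from the current input and the previous hidden state through weighted connections. This is a minimal NumPy sketch with hypothetical dimensions (4-dimensional word embeddings, 3-dimensional hidden state), not the paper’s trained model.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: the hidden state depends on the input
    x_t and on the previous hidden state h_prev (the directed cycle)."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Hypothetical sizes: 4-dim input embeddings, 3-dim hidden state.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(3, 3)) * 0.1   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(3)

# Unroll over a sequence of 5 embedded tokens, carrying the state forward.
h = np.zeros(3)
for x_t in rng.normal(size=(5, 4)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

In training, the weights `W_xh` and `W_hh` are the modifiable real-valued synapses that gradient descent updates at each iteration.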
A special variant of the RNN, the long short-term memory (LSTM) network, is discussed here. LSTMs have shown striking accuracy in language modeling and speech recognition, and we evaluate different forms of LSTM for our text classification task. An LSTM network contains LSTM units along with the input and output network layer units. An LSTM unit is capable of remembering values for either long or short time periods (Hochreiter & Schmidhuber, 1997), and it uses no activation function within its recurrent components: the stored value is not iteratively squashed over time, which mitigates the vanishing gradient problem. LSTM blocks contain three or four “gates” that control information flow, implemented using the logistic function so that each gate computes a value between 0 and 1. An “input” gate controls the extent to which a new value flows into the memory, a “forget” gate controls the extent to which a value remains in memory, and an “output” gate controls the extent to which the value in memory is used to compute the output activation of the block.
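The gating mechanism just described can be sketched as one LSTM step: each gate is a logistic (sigmoid) value in (0, 1), and the cell state is updated additively rather than being squashed at every step. The dimensions and weight initialization below are illustrative assumptions, not the paper’s configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b each hold four parameter sets, in order:
    input gate (i), forget gate (f), output gate (o), candidate value (g)."""
    (W_i, W_f, W_o, W_g), (U_i, U_f, U_o, U_g), (b_i, b_f, b_o, b_g) = W, U, b
    i = sigmoid(x_t @ W_i + h_prev @ U_i + b_i)  # how much of the new value enters memory
    f = sigmoid(x_t @ W_f + h_prev @ U_f + b_f)  # how much of the old memory remains
    o = sigmoid(x_t @ W_o + h_prev @ U_o + b_o)  # how much memory feeds the output
    g = np.tanh(x_t @ W_g + h_prev @ U_g + b_g)  # candidate value for the memory
    c = f * c_prev + i * g      # additive cell update: no repeated squashing over time
    h = o * np.tanh(c)          # output activation of the block
    return h, c

# Hypothetical sizes: 4-dim input embeddings, 3-dim hidden/cell state.
rng = np.random.default_rng(1)
d_in, d_h = 4, 3
W = [rng.normal(size=(d_in, d_h)) * 0.1 for _ in range(4)]
U = [rng.normal(size=(d_h, d_h)) * 0.1 for _ in range(4)]
b = [np.zeros(d_h) for _ in range(4)]

h, c = np.zeros(d_h), np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Because the cell state `c` is carried forward by the additive line `c = f * c_prev + i * g`, gradients along it are not repeatedly multiplied through a squashing nonlinearity, which is the property that mitigates vanishing gradients.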