Introduction
This paper proposes an automatic approach to categorizing text using deep learning. Text categorization is the process of assigning documents written in natural language to predefined classes (categories or labels) using natural language processing (NLP). Many researchers have applied deep learning architectures to text classification because they achieve high precision with less need for feature engineering. The key aspect of deep learning is that the resulting layers of features are not designed by human engineers but are learned from data using a general-purpose learning procedure.
In particular, the recurrent neural network (RNN) is a very powerful dynamic system and an important implementation mechanism of deep learning. An RNN can capture dependencies within a time series, which makes its memory of past inputs more effective. Its recurrent memory extracts valuable information from historical data through memory cells and other control mechanisms. Long short-term memory (LSTM) and the gated recurrent unit (GRU) are two special kinds of RNN memory cells that employ different gating mechanisms. LSTM and GRU networks use special hidden units whose natural function is to remember inputs for a long time (Hochreiter & Schmidhuber, 1997). For example, for power load data with pronounced time-series and cyclic characteristics, load forecasting can take advantage of historical information via LSTM and GRU cells (Zhang, Wu et al., 2018).
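To make the gating mechanism concrete, the following is a minimal sketch of a single GRU update in plain Python. It is an illustration only, not the implementation used in this paper: the parameter names (`Wz`, `Uz`, `bz`, and so on), the random initialization, and the helper functions are assumptions introduced for the example, and libraries differ in which side of the interpolation the update gate multiplies.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Compute W.x + U.h + b, one output row at a time."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bi
        for w_row, u_row, bi in zip(W, U, b)
    ]


def gru_step(x, h, p):
    """One GRU update: x is the input vector, h the previous hidden state."""
    # Update gate z and reset gate r, each component in (0, 1).
    z = [sigmoid(a) for a in affine(p["Wz"], x, p["Uz"], h, p["bz"])]
    r = [sigmoid(a) for a in affine(p["Wr"], x, p["Ur"], h, p["br"])]
    # The candidate state only sees the reset-gated part of the old state.
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a) for a in affine(p["Wh"], x, p["Uh"], gated_h, p["bh"])]
    # Interpolate between the old state and the candidate.
    # (Some formulations swap the roles of z and 1 - z.)
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical initializer: small uniform weights, zero biases."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = mat(hidden_dim, input_dim)
        p["U" + gate] = mat(hidden_dim, hidden_dim)
        p["b" + gate] = [0.0] * hidden_dim
    return p


def run_gru(sequence, p, hidden_dim):
    """Feed a sequence of input vectors through the cell; return the last state."""
    h = [0.0] * hidden_dim
    for x in sequence:
        h = gru_step(x, h, p)
    return h
```

Because each new state is a convex combination of the previous state and a `tanh` output, every component of the hidden state stays within [-1, 1], which is one reason gated cells remain numerically stable over long sequences.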
This paper examines automatic multiclass classification with a hybrid approach that integrates a convolutional neural network (CNN) with a bidirectional fast gated recurrent unit (BiFaGRU), termed CNN-BiFaGRU. CNN-BiFaGRU is a supervised text classification model tested on several text datasets (Reuters8, Reuters52, WebKB, 20NewsGroup, and AG NEWS) using the GloVe word embedding proposed by Pennington et al. (2014). The model is evaluated using accuracy, precision, recall, F1-score, and the confusion matrix. The results are detailed in the results and discussion sections.
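The evaluation metrics named above can all be derived from the confusion matrix. The sketch below shows one way to compute them for a multiclass problem in plain Python. The function names and the toy labels are hypothetical, and macro averaging is an assumption made for the example; the averaging scheme used in the experiments is not stated here.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """m[t][p] counts samples of true class t predicted as class p."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in m)
    correct = sum(m[k][k] for k in range(len(m)))
    return correct / total if total else 0.0


def per_class_metrics(m):
    """Return a (precision, recall, f1) tuple for each class."""
    n = len(m)
    results = []
    for k in range(n):
        tp = m[k][k]
        fp = sum(m[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(m[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def macro_average(per_class):
    """Unweighted mean of each metric over classes (an assumed averaging scheme)."""
    n = len(per_class)
    return tuple(sum(c[i] for c in per_class) / n for i in range(3))


# Toy labels for three classes, invented for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
scores = per_class_metrics(cm)
```

Macro averaging weights every class equally, which matters on imbalanced collections; a weighted or micro average would give different numbers on the same predictions.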
The main contributions of this paper are summarized as follows:
- A new model is implemented in which a 1D CNN followed by a single bidirectional CuDNNGRU layer performs the classification.
- The combined use of the CNN and the Bi-CuDNNGRU maximizes the potential of the text representation, with the capability to generate complex content sequences with minimal storage requirements.
- Experiments on five commonly used datasets demonstrate that the proposed model achieves strong computing time and precision with low loss relative to state-of-the-art methods.
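The data flow of a 1D convolution followed by a bidirectional recurrent pass can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: a toy `tanh` recurrence stands in for the GRU cell, no cuDNN kernel is involved, and all names, weights, and dimensions are hypothetical.

```python
import math


def conv1d_relu(seq, kernels, biases):
    """Valid 1D convolution along the time axis, followed by ReLU.

    seq:     list of T embedding vectors, each of size d
    kernels: list of F filters, each a list of k weight vectors of size d
    Returns T - k + 1 feature vectors, each of size F.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        feats = []
        for filt, b in zip(kernels, biases):
            s = b + sum(w * x for wv, xv in zip(filt, window) for w, x in zip(wv, xv))
            feats.append(max(0.0, s))
        out.append(feats)
    return out


def make_tanh_step(w_in, w_rec):
    """Toy recurrence standing in for a GRU cell (assumption for illustration)."""
    def step(x, h):
        drive = w_in * sum(x)
        return [math.tanh(drive + w_rec * hi) for hi in h]
    return step


def bidirectional(seq, step_fwd, step_bwd, hidden_dim):
    """Run one recurrence left to right and another right to left,
    then concatenate the two final hidden states."""
    def run(items, step):
        h = [0.0] * hidden_dim
        for x in items:
            h = step(x, h)
        return h
    return run(seq, step_fwd) + run(list(reversed(seq)), step_bwd)


# Five tokens with 2-dimensional embeddings (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
# Two filters of width 3: one sums the window, one negates it.
kernels = [[[1.0, 1.0]] * 3, [[-1.0, -1.0]] * 3]
feature_maps = conv1d_relu(tokens, kernels, biases=[0.0, 0.0])
encoded = bidirectional(
    feature_maps,
    make_tanh_step(0.1, 0.5),
    make_tanh_step(0.1, 0.5),
    hidden_dim=2,
)
```

The convolution summarizes local n-gram windows, and the two recurrent passes read those summaries in opposite directions, so the concatenated vector carries context from both the start and the end of the text. In a trained model, a dense softmax layer on top of this vector would typically produce the class probabilities.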
The rest of the paper is organized as follows. The second section presents the related work. The third section defines the proposed model, explains how it works and how the authors implemented it, and describes the four other models implemented for comparison with the best model. The fourth section focuses on the problem statement, the description of the datasets, and the exploratory data analysis and settings used to solve the problem; it also explains in detail how the data were prepared and represented, which word embeddings were used, how the datasets were divided for training and testing, and which evaluation criteria were used to assess the performance of the proposed model. The fifth section presents the results obtained with various word embeddings, the sixth section discusses these results, and the seventh section concludes the paper.
State Of The Art
Many approaches have been proposed in the past few years. Johnson and Zhang's (2015b) ConvNets model, or char-level CNN, operates only on characters. It can work across different languages (Johnson & Zhang, 2015a). Kim's (2014) TextCNN model uses Word2vec (Mikolov et al., 2013). This architecture is a variant of Collobert et al.'s (2011) CNN architecture and is trained only on labeled data.