1. Introduction
Performance degradation, rounding errors, incorrect responses to requests, and even abrupt downtime have been observed in web servers (Yan, 2006), operating systems, communication systems, Android systems, and cloud computing systems. These phenomena are collectively known as software aging, and a technique called rejuvenation has been proposed to counter them. By clearing abnormal states and restarting parts of the running software system, rejuvenation makes the system more robust. Software aging is induced by Mandelbugs (Grottke, 2007), which are difficult to detect and remove during development and testing; consequently, fault-tolerance techniques such as multiple-replica methods cannot solve software aging problems and may even lead to more serious consequences. To rejuvenate a software system at the right moment, several methods have been proposed, such as the semi-Markov reward model (Vaidyanathan, 2005) and Petri nets (Volovoi, 2004). These methods fall into two categories: model-based and measurement-based. Because of the extra cost of model-based methods, a number of measurement-based methods have been proposed and implemented, especially time series algorithms and machine learning algorithms. Measurement-based methods use aging indicators, such as physical memory, swap space, and CPU utilization, to signal the onset of software aging. To detect software aging in advance, prediction methods are used to forecast resource consumption and performance parameters.
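As a rough illustration of the measurement-based idea, the sketch below fits a straight-line trend to sampled available memory and extrapolates when it would cross a low-memory threshold. It is a minimal sketch, not a method taken from the works cited above: the function names, the sample values, and the 200 MB threshold are all hypothetical.

```python
from statistics import mean


def fit_linear_trend(times, values):
    """Ordinary least-squares fit of values = slope * t + intercept."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_bar - slope * t_bar


def time_to_threshold(times, values, threshold):
    """Time at which the fitted trend reaches `threshold`.

    Returns None when the trend is flat or increasing, i.e. when the
    resource is not being depleted according to the fitted line.
    """
    slope, intercept = fit_linear_trend(times, values)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope


# Hypothetical samples: available memory (MB) measured once per hour.
hours = [0, 1, 2, 3, 4, 5]
free_mb = [1000, 985, 972, 951, 940, 925]

# Estimated hour at which free memory would fall to 200 MB.
eta = time_to_threshold(hours, free_mb, threshold=200)
print(eta)
```

A straight line is only the simplest possible predictor; real resource series are noisy and non-stationary, which is why more capable time series and machine learning forecasters are used in practice.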
Forecasting resource consumption has attracted attention from both academia and industry. Because resource consumption series, such as available memory at the operating system level, are noisy and non-stationary, predicting resource consumption accurately is a challenging task. Methods for forecasting the resource consumption of systems affected by software aging can be divided into linear and nonlinear methods. Linear methods include autoregressive integrated moving average (ARIMA), exponential smoothing, and stochastic volatility models. ARIMA, one of the most popular methods, has been used for four decades in many areas. It makes several assumptions about the data series: the data must be stationary (at least after differencing), with no missing values and little disturbance. If these assumptions are violated, the performance of ARIMA may degrade. Nonlinear methods include artificial neural networks (ANNs), support vector machines, genetic algorithms, and ant colony algorithms. When a sequence has nonlinear characteristics, nonlinear methods can achieve better prediction performance than linear methods (Chen, 2013). ANNs (Hornik, 1989) have several merits that make them more widely used than other methods. First, ANNs can approximate any continuous function to a desired accuracy. Second, as nonparametric, data-driven models, ANNs impose few assumptions on the underlying data distribution, which makes them more robust to disturbances than other methods. Third, ANNs can generalize even on non-stationary data series. Despite these advantages, ANNs behave inconsistently in some scenarios. For instance, when periodic data are used for training and testing (Hann, 1996), ANNs perform about the same as other methods. And when the data have mainly linear features with little nonlinearity, ANNs perform worse than linear methods (Zhang, 1998).
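The stationarity assumption of ARIMA can be made concrete with a small example. The "integrated" part of ARIMA differences the series before the autoregressive and moving-average terms are fitted, which removes a trend such as steadily shrinking available memory. The sketch below uses a hypothetical memory series and compares the means of the two halves of the series before and after differencing; this comparison is only a crude indicator chosen for illustration, not a formal stationarity test.

```python
from statistics import mean


def difference(series, lag=1):
    """Differenced series: x[t] - x[t - lag] for every valid t."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def half_mean_gap(series):
    """Absolute gap between the means of the first and second half.

    A large gap relative to the spread of the data hints at a drifting
    mean, which violates the stationarity assumption.
    """
    mid = len(series) // 2
    return abs(mean(series[:mid]) - mean(series[mid:]))


# Hypothetical samples: available memory (MB) with a downward trend.
free_mb = [1000, 985, 972, 951, 940, 925, 912, 893]
deltas = difference(free_mb)  # hour-to-hour change in MB

raw_gap = half_mean_gap(free_mb)   # mean drifts strongly over time
diff_gap = half_mean_gap(deltas)   # mean of the changes is nearly constant
print(raw_gap, diff_gap)
```

In this example the raw series has a clearly drifting mean, while the differenced series fluctuates around a roughly constant level, which is the form the autoregressive and moving-average terms expect.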