1. Introduction
Research trends in Machine Learning include investigating which algorithm is most promising for a given dataset. Most prediction tasks can be implemented using a diverse set of algorithms, which can be grouped by their predictive task. The Decision Tree, Random Forest, Naïve Bayes, Artificial Neural Network, Linear Regression, Logistic Regression, Support Vector Machine and K-Nearest Neighbor algorithms are a few of those used to perform classification, clustering, regression, association rule mining, etc. A substantial research effort has been devoted to these algorithms to support better decisions about the choice of algorithm (Li Congcong et al., 2013).
In practice, researchers analyze the performance of candidate algorithms on a test dataset and select the one that significantly outperforms the others (P.K. Douglas et al., 2016; Ladds et al., 2017). However, there is inherent uncertainty as to whether the chosen algorithm will be the most suitable for all real-world datasets. As expressed in the "No Free Lunch" theorem, the computational cost of finding a solution, averaged over all problems in the class, is the same for any solution method (Wolpert David & Macready William, 1996). Classifier combination strategies such as boosting and bagging have outperformed the single best classifier on many real-world datasets (Syarif, Iwan et al., 2012). Hence, when none of the classification algorithms fundamentally outperforms the others, it is pragmatic to choose a few algorithms and determine the best one at runtime (Dietterich T.G., 2000).
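The runtime comparison described above can be sketched with scikit-learn: a single decision tree is evaluated against bagging and boosting ensembles of trees via cross-validation, and the best-scoring model is selected. The synthetic dataset and all parameter settings here are illustrative assumptions, not values from this study.

```python
# Illustrative sketch: compare a single classifier against bagging and
# boosting ensembles by cross-validated accuracy, then pick the best.
# The dataset and hyperparameters are arbitrary choices for demonstration.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real test dataset
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(random_state=0),
                                 n_estimators=50, random_state=0),
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Estimate each model's accuracy with 5-fold cross-validation
results = {name: cross_val_score(m, X, y, cv=5).mean()
           for name, m in models.items()}
best = max(results, key=results.get)
for name, acc in results.items():
    print(f"{name}: mean accuracy = {acc:.3f}")
print(f"selected at runtime: {best}")
```

Selecting among a handful of candidates this way sidesteps committing to one algorithm in advance, at the cost of the extra cross-validation passes.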
From a mathematical perspective, a classification algorithm is a sophisticated fit to a non-linear function, and a single machine learning model may fit one dataset well yet overfit or underfit others. Thus, the prediction accuracy of a single model may reach an upper limit even with ideal parameters. One technique to overcome this limitation is to combine several algorithms so as to break through the upper limit of a single learning algorithm; such a combination is called an ensemble. Bagging, boosting and stacking are three types of ensembles. Stacking, or stacked generalization, is an ensemble strategy that uses a higher-level model to combine lower-level sub-models to achieve higher prediction accuracy. Unlike bagging and boosting, which combine classifiers of the same kind, stacked generalization can combine diverse algorithms through a meta-learning model to increase accuracy (Ting, K. M., 1999). The resulting ensemble model can yield better predictive performance than any of its constituent lower-level sub-models.
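The two-level structure of stacked generalization can be sketched with scikit-learn's `StackingClassifier`: heterogeneous level-0 sub-models are combined by a level-1 meta-learner trained on their cross-validated predictions. The particular base learners, dataset, and meta-learner below are assumed for illustration only.

```python
# Illustrative sketch of stacked generalization: diverse level-0
# classifiers are combined by a logistic-regression meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level-0 sub-models of different kinds, as stacking permits
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
]

# Level-1 meta-learner is fit on cross-validated level-0 predictions,
# which guards against the meta-learner overfitting its inputs
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(), cv=5)
stack.fit(X_train, y_train)
print(f"stacked accuracy: {stack.score(X_test, y_test):.3f}")
```

Because the meta-learner sees out-of-fold predictions rather than raw features, it learns how much to trust each sub-model, which is what allows the ensemble to exceed its best constituent.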