1. Introduction
Feature Subset Selection (FSS) is the process of selecting the input variables in a supervised classification problem. It removes irrelevant, redundant, and noisy features and enhances classifier performance by retaining only significant features. In addition, it reduces the number of features and thereby mitigates the time and space complexity limitations of classification algorithms. FSS is particularly desirable for high-dimensional datasets, such as microarray data with thousands of features; otherwise, an enormous number of instances would be needed to build reliable models, which is computationally quite expensive. A computationally cheaper alternative is feature ranking, which ranks features using a selection metric, such as gain ratio or symmetric uncertainty, and selects the top-ranked features by a pre-defined threshold value. Its drawback is that it does not deal with redundant features. Widely used algorithms that do handle redundancy include Correlation-based Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), and Conditional Mutual Information Maximization (CMIM) (Song et al., 2013).
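To make the feature-ranking idea concrete, the symmetric uncertainty metric mentioned above can be computed for discrete features in a few lines of Python. This is an illustrative sketch only; the toy dataset and the `rank_features` helper are assumptions for demonstration, not code from any cited method.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a discrete sequence, in bits."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), normalized to [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    joint = entropy(list(zip(x, y)))
    mi = hx + hy - joint            # I(X; Y) = H(X) + H(Y) - H(X, Y)
    denom = hx + hy
    return 2 * mi / denom if denom > 0 else 0.0

def rank_features(features, labels):
    """Return feature indices sorted by SU with the class, best first."""
    scores = [symmetric_uncertainty(col, labels) for col in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)

# Toy dataset: feature 0 copies the class label, feature 1 is noise.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
f0 = [0, 0, 1, 1, 0, 1, 0, 1]       # perfectly relevant
f1 = [1, 0, 1, 0, 0, 1, 1, 0]       # irrelevant
ranking = rank_features([f0, f1], labels)   # → [0, 1]
```

Note that ranking each feature against the class in isolation, as above, is exactly why redundancy goes undetected: two identical relevant features receive identical (high) scores.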
Essentially, FSS combines two sub-processes: (i) a search strategy for generating candidate subsets, and (ii) an evaluation of those subsets. The search strategy may be complete/exhaustive or approximate. In theory, exhaustive search guarantees an optimal solution, but in practice it is intractable for combinatorial optimization (CO) problems, which are NP-hard (Feige & Reichman, 2004). Branch and bound (Wang, Yan & Li, 2003), greedy sequential search (Dash & Patra, 2014), and best-first search are examples of search strategies that fail to find optimal feature subsets for CO problems. Approximate search strategies, in contrast, generate near-optimal solutions with no guarantee of finding a global optimum (Talbi, 2009).
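The two sub-processes above can be sketched with a greedy sequential forward search: the search strategy enumerates one-feature extensions of the current subset, and a scoring function evaluates each candidate. The `evaluate` callable and the `toy_score` objective below are hypothetical stand-ins for whatever subset-evaluation measure is chosen; this is a minimal sketch, not a definitive implementation of any cited method.

```python
def greedy_forward_selection(n_features, evaluate):
    """Greedy sequential forward search: start from the empty subset and
    repeatedly add the single feature that most improves the score; stop
    when no candidate improves it. Approximate -- no optimality guarantee."""
    selected = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    while remaining:
        # Evaluate every one-feature extension of the current subset.
        candidate, score = max(
            ((f, evaluate(selected + [f])) for f in remaining),
            key=lambda fs: fs[1],
        )
        if score <= best_score:
            break                      # no improvement: local optimum reached
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = score
    return selected, best_score

# Toy evaluation measure: subset {0, 2} is the (hypothetical) optimum,
# and extra features are penalized.
def toy_score(subset):
    target = {0, 2}
    s = set(subset)
    return len(s & target) - 0.5 * len(s - target)

subset, score = greedy_forward_selection(4, toy_score)   # → [0, 2]
```

The early stopping on the first non-improving step is what makes this search approximate: it can halt at a local optimum when features are only jointly (not individually) informative.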
For the past few decades, nature-inspired meta-heuristic algorithms have attracted considerable attention from researchers as search strategies that find acceptable solutions to CO problems within a reasonable time (Dash & Patra, 2013; Ke et al., 2008; Wang et al., 2007; Dash, 2014), and they are considered reliable approximation methods. Metaheuristics are broadly categorized into two types: neighborhood-based local search (Hoos & Stutzle, 2004) and population-based global search (Back et al., 1997). Representative neighborhood-based local search algorithms include Simulated Annealing (SA) and Tabu Search (TS); representative population-based algorithms include the Firefly Algorithm (FA) (Yang, 2010; Yang, 2009), Ant Colony Optimization (ACO) (Dorigo & Birattari, 2010), the Genetic Algorithm (GA) (Deb et al., 2002), and Particle Swarm Optimization (PSO) (Du et al., 2015). These methods find a near-optimal solution by generating a comparatively small number of feature subsets for evaluation, and the candidate subsets are then evaluated using a pre-determined evaluation measure. The performance of such an algorithm depends on the proper setting of its parameters and the number of iterations. Applications of FSS abound in fields such as bioinformatics, text categorization, natural language processing, and web page classification.
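As one concrete instance of a neighborhood-based local search applied to FSS, simulated annealing over bit-mask subsets can be sketched as follows. The neighborhood move (flipping one feature in or out), the `toy_eval` objective, and all parameter values (`iters`, `t0`, `cooling`) are illustrative assumptions, and the parameter-sensitivity point above applies: results depend on the cooling schedule and iteration budget.

```python
import math
import random

def simulated_annealing_fss(n_features, evaluate, iters=500,
                            t0=1.0, cooling=0.99, seed=0):
    """Simulated annealing over boolean feature masks: flip one random
    bit per step and accept worse moves with probability exp(delta / T)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    best = current[:]
    cur_score = best_score = evaluate(current)
    t = t0
    for _ in range(iters):
        neighbor = current[:]
        i = rng.randrange(n_features)
        neighbor[i] = not neighbor[i]          # flip one feature in/out
        score = evaluate(neighbor)
        delta = score - cur_score
        if delta > 0 or rng.random() < math.exp(delta / t):
            current, cur_score = neighbor, score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        t *= cooling                            # geometric cooling schedule
    return best, best_score

# Toy objective: reward agreement with a hypothetical "ideal" mask.
target = [True, False, True, False, True]
def toy_eval(mask):
    return sum(m == t for m, t in zip(mask, target))

mask, score = simulated_annealing_fss(5, toy_eval)
```

Early in the run the high temperature lets the search accept worse subsets and escape local optima; as the temperature decays the behavior approaches pure hill climbing, illustrating why the number of iterations and cooling rate must be tuned per problem.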