Article Preview
TopIntroduction
Inductive logic programming (ILP) is a hot research field of machine learning. ILP systems can automatically construct first-order rules from background knowledge and training examples (Kavurucu, Senkul, &Toroslu, 2011). First-order predicate logic representation is used in ILP, so ILP algorithms are more useful and efficient than classical machine learning algorithms. ILP systems have successfully been applied to many fields, such as identification of phenotyping (Peissig et al., 2014), the classification of webpage (Li & Guo, 2011), extracting associations from pathology excerpts (Beéck et al., 2015).
The size of the hypotheses space is very important to ILP. In fact, as long as the rules allow the appearance of function symbols, the hypotheses space will be infinite, even if the rules have up to l literals. Even if the clauses are restricted to Horn clauses in which function symbols are not allowed, the size of hypotheses space will increase exponentially with l. So, it is intractable for algorithms that execute an exhaustive search of the entire candidate rules space (Serrurier & Prade 2008). Deterministic search is adopted by most ILP systems (Li & Guo, 2012) to search the rules space, and various heuristics and syntactic biases are employed to solve the search complexity. The exploration power of the search is limited by using the biases, may easily lead to local optima. Genetic algorithm based ILP (GILP) algorithms (Chien & Chen, 2010) adopt GA to search the candidate clauses space, and the approach of stochastic search can solve the weakness of local optima of deterministic search. So, GILP systems become one of the most popular approaches of ILP.
Recently, Karaboga (Karaboga & Akay, 2009) proposed a new evolution technique named artificial bee colony (ABC) to optimize multi-modal and multi-variable continuous functions. The ABC system has a competitive performance compared with other swarm intelligent system (Akay, 2013; Ozturk, Hancer, & Karaboga, 2015; Kiran, 2015), such as ACO, PSO and GA.
ABC algorithm has fewer control parameters, it has been used to solve practical optimization problems (Cui & Gu, 2015; Fei & He, 2015; Develi, Kabalci, & Basturk, 2015; Klein, Bittencourt, & Coelho, 2015). In the investigation, a novel discrete ABC system is developed to search the hypotheses space of ILP, and propose a new ABC based ILP algorithm, name ABCILP. ABCILP employs ABC to explore candidate clauses space, and the weakness of local optima of deterministic search is overcome by the method of stochastic search. First, to make ABC suitable for searching the candidate hypotheses space of ILP, we regard each first-order predicate rule as a food source and adopt discrete operations, thus the neighborhood food sources of different bees are generated. Second, some discrete operations and algorithm, such as insert, delete, and mutate are proposed to produce new food source for the onlookers and employed bees. Third, a new fitness is proposed and adaptive strategy is adopted to determine the parameter of the new fitness. Within our knowledge, there is no paper to discuss ABC based ILP algorithms. In this sense, our work can be viewed as a start point for researchers to develop ABC-based ILP algorithms. Experimental results on four benchmark datasets show that: 1) the proposed new fitness function in this paper can more precisely measure the quality of hypothesis and can avoid generating a rule which is over-specific; 2) Compared with other algorithms, such as Aleph, MFOIL, NFOIL, KFOIL, 1BC2, the performance of ABCILP is better.
The reminder of the paper is organized as follows. In section 2, ABC system is briefly introduced. In section 3, the basic components of ABCILP is described. Section 4 discusses the experimental results. Conclusion is given in section 5.