1. Introduction
Breast cancer (BC) is a common type of cancer that affects women. It strikes almost 10% of women at some stage of their lives. The causes of breast cancer are not fully known (Konneworleans, n.d.). However, researchers have identified a number of factors (called potential risks) that raise or lower the likelihood of developing breast cancer. Although this form of cancer is one of the leading causes of cancer death in women, the survival rate is considerably high. With early detection, more than 97 percent of women may live for more than 5 years (Kharya, 2012). Over the last two decades, because of the strong research focus on cancer, new and unconventional methods for early identification and investigation have emerged that help to reduce the death rate associated with the disease. In medical prognosis, survival analysis is the application of various procedures to the available data in order to estimate the survival of a patient affected by a disease over a length of time. With the growing use of technology supported by computerised and automated tools (Delen et al., 2005), massive amounts of medical information are now being collected and made accessible to a wide range of medical research communities, which use them to build many kinds of prediction models that improve the long-term effectiveness of medical research. As a result, emerging research directions such as knowledge discovery in databases (KDD), which uses data mining algorithms (Delen et al., 2005), have become well-known tools for medical researchers who want to find and exploit the patterns and relationships among a wide range of variables in order to predict the outcome of a type of cancer using stored databases.
In recent years, data mining has become a valuable tool for extracting and manipulating data and for identifying patterns that support effective decision-making. Data mining is the process of selecting, exploring, and modelling a large amount of data to discover previously unknown regularities or relationships, in order to obtain clear and useful results from the database (Kataria & Sharma, 2013). In other words, data mining refers to the automated analysis of very large databases to find patterns that are valid, novel, useful and understandable. Data mining has thus emerged as an aid for solving analysts’ problems.
Commonly used data mining approaches include decision trees (Teli & Kanikar, 2015), logistic regression (Komarek, 2004), support vector machines (Wang, 2005), k-NN (Cai et al., 2010) and artificial neural networks (Arockiaraj, 2013).
Automatic (machine) identification, interpretation, categorisation, and clustering of patterns are important tasks with applications in a wide range of engineering and scientific fields, including biology, psychology, medicine, marketing, computer vision, artificial intelligence, and remote sensing, among many others (Jain et al., 2000). A common characteristic of many of these applications is that the features are not supplied by domain experts but are extracted from the data acquired through research. The availability of greater computing power and faster processing of huge data sets has made it possible to diversify the techniques used for data analysis and classification. In many emerging applications, several approaches are combined to achieve optimal classification. For this reason, integrating multiple sensing approaches and classifiers is a frequently used practice in pattern recognition (Jain et al., 2000).
Commonly used pattern recognition techniques include Decision Trees (Patel & Rana, 2014), Logistic Regression (Turkov et al., 2012), SVM (Wang, 2005), k-NN (Suguna & Thanushkodi, 2010), Random Forests (Ghosal, 2009) and ERCF (Moosmann et al., 2008).