Introduction
The term ‘malware’ (Kramer & Bradfield, 2010), a contraction of ‘malicious software’, refers to software containing unwanted and malicious code, content, or scripts that are intentionally designed to gather information, compromise the user’s privacy, exploit the system, or gain unauthorized access to it. Infection usually occurs when users unwittingly install software or packages from untrusted sites or unknown sources, or click on unknown URLs that lead to malware being installed on their system. Malware continues to increase day by day regardless of the user platform (Windows, Macintosh, Android, or any other) and threatens network security, so it is imperative to identify malware according to its type. Malware is a broad term that covers many kinds of malicious scripts, code, and programs, and it is classified into several categories (Gupta, 2013), such as adware, bots, bugs, rootkits, spyware, Trojan horses, viruses, and worms; these categories are presented in the background section. Figure 1 (Lueg, 2017) shows the increase in the volume of new Android malware samples from 2012 to the first quarter of 2017.
Figure 1. Increase in the volume of new Android malware samples from 2012 to the first quarter of 2017
The report shows that malware for the Android platform is growing rapidly, so adequate detection and analysis of malware must be carried out to stay secure. Data mining (Silwattananusarn & Tuwamsuk, 2012), meanwhile, is a technique that applies computational intelligence to large data sets or databases to find relationships between attributes; these relationships can be used for classification tasks or to make predictions. Malware is generally detected and analyzed using two approaches: (a) static malware analysis (Uppal, Mehra & Verma, 2009) and (b) dynamic malware analysis (Egele, Scholte, Kirda & Kruegel, 2012), both of which are described in this paper. In this work, the authors use a dynamic malware detection approach together with classification-tree-based data mining techniques, namely Decision Stump (Kohavi, 1995), J48 Decision Tree (Quinlan, 1993), and Random Tree (Breiman, 2001), to distinguish malware from benign programs running on the system based on certain features and properties.
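To make the simplest of the three classifiers concrete, the following is a minimal sketch of a decision stump, that is, a one-level decision tree that splits on a single feature and threshold. It is not the authors' implementation. The feature names (`sms_sent`, `network_connections`), the toy values, and the helper names `train_stump` and `predict` are all hypothetical, chosen only to illustrate how run-time features might separate malware from benign programs.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy data (invented for illustration, not from the paper):
# each row is (sms_sent, network_connections) observed while an app runs.
SAMPLES: List[Tuple[int, int]] = [
    (0, 3), (1, 5), (0, 2), (2, 4),      # benign apps
    (9, 6), (12, 4), (7, 8), (15, 5),    # malware apps
]
LABELS: List[int] = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = benign, 1 = malware

Stump = Tuple[int, float, int]  # (feature_index, threshold, label_above)


def train_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stump:
    """Pick the single feature/threshold split with the fewest training errors."""
    best: Stump = (0, 0.0, 1)
    best_errors = len(y) + 1
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[f] > t else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if errors < best_errors:
                    best_errors = errors
                    best = (f, t, label_above)
    return best


def predict(stump: Stump, row: Sequence[float]) -> int:
    """Classify one sample with the learned single split."""
    f, t, label_above = stump
    return label_above if row[f] > t else 1 - label_above


stump = train_stump(SAMPLES, LABELS)
print("feature index, threshold, label above threshold:", stump)
print("prediction for (10, 3):", predict(stump, (10, 3)))
```

A stump considers only one attribute, which makes it a useful baseline; deeper trees such as J48 or Random Tree extend the same idea by applying further splits recursively.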