1. Introduction
Data mining is the discovery of potentially useful hidden knowledge in large amounts of data. Frequent itemset mining (FIM) is a major area of data mining that plays an important role in extracting meaningful information. Its goal is to find subsets that appear frequently within a database of sets. Important application areas include machine learning, web log mining, information retrieval, and business intelligence. As a result, frequent itemset mining over data streams has become one of the most actively studied problems in data mining research.
Datasets have grown tremendously in recent years as faster processing and communication have greatly improved data-processing capability in all areas. Consequently, identifying important and meaningful information has become much more complex than before. One of the more challenging problems in data mining is discovering association rules in large databases of transactions, where each transaction consists of a set of items. Association rule mining (Agrawal et al., 1993; 1994) determines relations among itemsets in a database. The effectiveness of this technique depends on finding interesting correlations between items in large databases quickly and correctly. Because of its significance in many applications, numerous revised algorithms have been introduced, yet association rule mining still needs more research. The mining of association rules includes two sub-procedures: (1) generating candidate itemsets and (2) finding all frequent itemsets, that is, those whose support meets a minimum support threshold. Applying the results of data mining to a company's strategic planning could effectively increase profit and reduce risk.
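To make the two sub-procedures concrete, the following is a minimal sketch of level-wise mining in the style commonly associated with Apriori. It is an illustration under stated assumptions, not the method proposed in this article: the function name `frequent_itemsets`, the sample transactions, and the choice to express support as an absolute count that must be at least `min_support` are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Illustrative sketch only; names and the absolute-count threshold are assumptions.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level-1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Sub-procedure (2): count each candidate and keep those meeting the threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Sub-procedure (1): join frequent k-itemsets into (k+1)-item candidates,
        # keeping only those whose k-item subsets are all frequent.
        k += 1
        candidates = {
            a | b
            for a in level
            for b in level
            if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        }
    return frequent


if __name__ == "__main__":
    # Hypothetical market-basket data.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, count in sorted(
        frequent_itemsets(baskets, 2).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
    ):
        print(sorted(itemset), count)
```

Each pass rescans every transaction for every candidate, which is the kind of sequential cost that becomes prohibitive on large datasets.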
Computer hardware architecture has been transformed by larger main memories and by the evolution of processors from single-core to multi-core and many-core designs, and even to cloud systems (Grossman et al., 2008; Hu, 2012; Meenakshi et al., 2010; Suneetha et al., 2011; Zhou et al., 2010). Traditional sequential data mining algorithms (Fakhrahmad et al., 2011; Jin, 2009; Prakash et al., 2010; Yu et al., 2010; Yun et al., 2005) take a tremendous amount of time to handle large datasets. These algorithms have not kept pace with the latest computer architectures, and relatively little effort has been devoted to mapping them to high-performance platforms.