Sliding Window-Based High Utility Item-Sets Mining Over Data Stream Using Extended Global Utility Item-Sets Tree

P. Amaranatha Reddy, M. H. M. Krishna Prasad
Copyright: © 2022 |Pages: 16
DOI: 10.4018/IJSI.303579

Abstract

High utility item-sets mining (HUIM) is a special topic in frequent item-sets mining (FIM). It gives better insights for business growth by focusing on the utility of items in a transaction. HUIM is evolving as a powerful research area due to its vast applications in many fields. Data stream processing, meanwhile, is an interesting and challenging problem, since processing a huge amount of rapidly generated data with limited resources demands high-performance algorithms. This paper presents an approach to extracting high utility item-sets (HUIs) from a dynamic data stream by applying sliding window control. Although algorithms exist that solve the same problem, they allow redundant processing or reprocessing of data. To overcome this, the proposed algorithm uses a tree-like structure called the extended global utility item-sets tree (EGUI-tree), which stores and retrieves previously mined information instead of reprocessing it. An experimental study on real-world datasets showed that the EGUI-tree algorithm is faster than the state-of-the-art algorithms.

1. Introduction

Data mining techniques help extract, from the huge amounts of available data, the knowledge needed to make decisions. Frequent item-sets mining (FIM) is one of the primitive data mining tasks (Agrawal, Imielinski, et al., (1993)), (Rakesh Agrawal and Ramakrishnan Srikant (1994)). It extracts from a transactional database the item-sets whose frequency of occurrence exceeds a user-specified cut-off. Most basic FIM algorithms are designed to work on binary transactional databases. They assume that each item in a transaction has a quantity of one unit and that all items are equally profitable. However, in real life, each item generates a different profit (also called external utility), and purchase quantities (also called internal utility) often differ as well. For example, the quantity sold and the profit generated by items such as a rice packet and a gold ring are far apart (Dawar, Siddharth, et al., (2017)). Hence, treating all items alike may not produce valuable results. Utility item-sets mining (UIM) (Fournier-Viger, Chun-Wei Lin, et al., (2019)) came into the picture to address these issues. Even though utility item-sets are more useful than frequent item-sets, UIM is hard and intractable. The reason is that FIM methods rely on the anti-monotone property (or downward closure property) over the frequency of item-sets, which prunes the search space effectively, whereas the utility of item-sets does not satisfy this property (Krishnamoorthy, Srikumar, et al., (2015)). The property states that any superset of an infrequent item-set cannot be frequent (Tin Truong, Hai Duong, et al., (2019)).
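
As a rough illustration of the difference between frequency and utility, the following Python sketch computes both measures on a toy database. The item names, profits, quantities, and the helper functions `support` and `utility` are hypothetical and chosen only for this example; they are not taken from the paper.

```python
# Hypothetical per-unit profit of each item (external utility).
PROFIT = {"rice": 1, "gold_ring": 100, "milk": 2}

# Hypothetical transactions: item -> purchased quantity (internal utility).
TRANSACTIONS = [
    {"rice": 10, "milk": 2},
    {"rice": 5, "gold_ring": 1},
    {"rice": 8},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the item-set."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset, transactions, profit):
    """Sum of quantity * profit over all transactions containing the item-set."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


subset = ("rice",)
superset = ("rice", "gold_ring")

# Support can only shrink when an item-set grows (anti-monotone).
print(support(subset, TRANSACTIONS), support(superset, TRANSACTIONS))

# Utility may grow when an item-set grows, so a low-utility item-set
# cannot be used to prune its supersets.
print(utility(subset, TRANSACTIONS, PROFIT), utility(superset, TRANSACTIONS, PROFIT))
```

With a utility cut-off of 50 in this toy data, the single item `rice` falls below the cut-off while its superset containing `gold_ring` exceeds it, which is why frequency-style pruning cannot be applied directly to utility.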

Utility item-sets mining has created space for new applications that address challenges in market basket analysis, web click stream analysis, wireless sensor networks, bio-medical data analysis, and stock market prediction (Liu, Junqiang, et al., (2016))(Reddy, Prasad, et al., (2017)). It has also been combined with other mining techniques such as sequential pattern mining, episode pattern mining, stream mining, top-k high utility item-sets mining, high utility item-sets mining, high utility rare pattern mining, and high average utility item-sets mining (Fournier Viger, Philippe & Lin, et al., (2017)), (Wensheng Gan, Jerry Chun-Wei Lin, et al., (2018)), (Truong-Chi, Fournier-Viger, (2019)), (Zhang, Chongsheng, et al., (2018)). This study aims to develop a high-performance algorithm for retrieving high utility item-sets over a data stream, that is, item-sets whose utility exceeds a user-specified cut-off.
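
To make the problem statement concrete, here is a minimal brute-force sketch of high utility item-set extraction over a sliding window. It is not the EGUI-tree algorithm: it re-enumerates every item-set of every window from scratch, which is the kind of reprocessing a dedicated structure is meant to avoid. The function names, the toy stream, the window size, and the cut-off are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def mine_window(window, profit, min_util):
    """Return {item-set: utility} for item-sets whose utility exceeds min_util."""
    utilities = {}
    for t in window:
        items = sorted(t)
        # Enumerate every non-empty item-set contained in this transaction.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                u = sum(t[i] * profit[i] for i in combo)
                utilities[combo] = utilities.get(combo, 0) + u
    return {s: u for s, u in utilities.items() if u > min_util}


def sliding_window_huis(stream, profit, min_util, window_size):
    """Yield the high utility item-sets of each full window of the stream."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    for t in stream:
        window.append(t)
        if len(window) == window_size:
            yield mine_window(window, profit, min_util)


# Hypothetical toy stream: item -> quantity per transaction.
PROFIT = {"a": 5, "b": 2, "c": 1}
STREAM = [{"a": 1, "b": 2}, {"b": 3, "c": 4}, {"a": 2, "c": 1}]

results = list(sliding_window_huis(STREAM, PROFIT, min_util=8, window_size=2))
for number, huis in enumerate(results, start=1):
    print(number, huis)
```

Because each transaction contributes an exponential number of item-sets, this sketch is only practical for very small examples; it is meant to show what the output of the task looks like, not how to compute it efficiently.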
