Abstract
The purpose of mining frequent itemsets is to identify the items in groups that always appear together and exceed the user-specified threshold of a transaction database. However, numerous frequent itemsets may exist in a transaction database, hindering decision making. Recently, the mining of frequent closed itemsets has become a major research issue because sets of frequent closed itemsets are condensed yet complete representations of frequent itemsets. Therefore, all frequent itemsets can be derived from a group of frequent closed itemsets. Nonetheless, the number of transactions in a transaction database can increase rapidly in a short time period, and a number of the transactions may be outdated. Thus, frequent closed itemsets may be changed with the addition of new transactions or the deletion of old transactions from the transaction database. Updating previously closed itemsets when transactions are added or removed from the transaction database is challenging.
This study proposes an efficient algorithm for incrementally mining frequent closed itemsets without scanning the original database. The proposed algorithm updates closed itemsets by performing several operations on the previously closed itemsets and added/deleted transactions without searching the previously closed itemsets. The experimental results show that the proposed algorithm significantly outperforms previous methods, which require a substantial length of time to search previously closed itemsets.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Adnan M, Alhajj R (2009) DRFP-tree: disk resident frequent pattern tree. Appl Intell 30(2):84–97
Agrawal R, Srikant R (1994) Fast algorithm for mining association rules. In: Proc of international conference on very large data bases, pp 487–499
Cheng J, Ke Y, Ng W (2008) A survey on algorithms for mining frequent itemsets over data streams. Knowl Inf Syst 16(1):1–27
Chi Y, Wang H, Yu PS, Muntz RR (2004) Moment: maintaining closed itemsets over a stream sliding window. In: Proc of 2004 IEEE international conference on data mining, pp 59–66
Donga J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl-Based Syst 20(4):329–335
Giannella C, Han J, Pei J (2003) Mining frequent patterns in data streams at multiple time granularities. In: Proceedings of next generation data mining. MIT Press, Cambridge, pp 91–212
Han J, Mao R, Pei J, Yin Y (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8:53–87
Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: current status and future directions. Data Min Knowl Discov 15(1):55–86
Hou W, Yang B, Zhou Z, Wu C (2008) An adaptive frequent itemset mining algorithm for data stream with concept drifts. In: Proc of CSSE 2008, pp 382–385
Jiang N, Gruenwald L (2006) CFI-stream: mining closed itemsets in data streams. In: Proc of 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 592–597
Jin R, Agrawal G (2005) An algorithm for in-core frequent itemset mining on streaming data. In: Proc of 5th IEEE international conference on data mining, pp 210–217
Koh JL, Shieh SF (2004) An efficient approach for maintaining association rules based on adjusting FP-tree structures. In: Proceedings of the 12th international conference on database systems for advanced applications (DASFAA), pp 417–424
Li H-F, Lee S-Y (2009) Mining frequent itemsets over data streams using efficient window sliding techniques. Expert Syst Appl, Part 1 36(2):1466–1477
Liu B, Ma Y, Wong CK, Yu PS (2003) Scoring the data using association rules. Appl Intell 18(2):119–135
Manku GS, Motwani R (2002) Approximate frequency counts over data streams. In: Proc of 28th international conference on very large data bases, pp 346–357
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proc of 7th international conference on database theory, pp 398–416
Raïssi C, Poncelet P, Teisseire M (2007) Towards a new approach for mining frequent itemsets on data stream. J Intell Inf Syst 28(1):23–36
Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proc of 9th ACM SIGKDD international conference on knowledge discovery and data mining, pp 236–245
Yen SJ, Lee YS, Gu JY (2012) An efficient approach for updating the structure for mining frequent patterns. In: Proc of IEEE international conference on industrial engineering and engineering management, December 2012
Yen SJ, Wang CK, Ouyang LY (2012) A search space algorithm for mining frequent patterns. J Inf Sci Eng 28(1):177–191
Zaki MJ, Hsiao CJ (2002) CHARM: an efficient algorithm for closed itemset mining. In: Proc of SIAM international conference on data mining, pp 99–104
Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36
IBM Almaden. Quest synthetic data generation code. http://www.almaden.ibm.com/cs/quest/syndata.html
Microsoft Developer Network (MSDN). http://msdn.microsoft.com/en-us/library/aa217032(v=sql.80).asp
Bache K, Lichman M (2013) UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Yen, SJ., Lee, YS. & Wang, CK. An efficient algorithm for incrementally mining frequent closed itemsets. Appl Intell 40, 649–668 (2014). https://doi.org/10.1007/s10489-013-0487-8
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10489-013-0487-8