Abstract
A key stage in the discovery of Association Rules in binary databases involves the identification of the “frequent sets”, i.e. those sets of attributes that occur together often enough to invite further attention. This stage is also the most computationally demanding, because the search space grows exponentially with the number of attributes. Particular difficulty is encountered in dealing with very densely-populated data. A special case arises in, for example, demographic or epidemiological data, which include some attributes with very frequent instances: large numbers of sets involving these attributes will then need to be considered. In this paper we describe methods to address this problem, using heuristics applied to a previously-presented generic algorithm, Apriori-TFP. The results we present demonstrate significant performance improvements over the original Apriori-TFP in datasets which include subsets of very frequently-occurring attributes.
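To make the notion of a “frequent set” concrete, the following is a minimal sketch of a plain level-wise (Apriori-style) search over a toy binary dataset. It is not Apriori-TFP and does not reproduce any method from the paper; the function name `frequent_sets`, the parameter `min_count` and the sample rows are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_sets(rows, min_count):
    """Return {attribute set: support count} for every set of attributes
    that occurs in at least `min_count` rows (hypothetical helper)."""
    items = sorted({a for row in rows for a in row})
    frequent = {}
    # Level 1 candidates: every single attribute.
    current = [frozenset([a]) for a in items]
    while current:
        # Count how many rows contain each candidate set.
        counts = {c: sum(1 for row in rows if c <= row) for c in current}
        kept = {c: k for c, k in counts.items() if k >= min_count}
        frequent.update(kept)
        # Join step: merge frequent k-sets into (k+1)-set candidates.
        size = len(next(iter(kept))) + 1 if kept else 0
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, size - 1))
        ]
    return frequent


# Toy data (an assumption): attribute "a" is present in every row.
rows = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"a", "b", "c"},
    {"a", "c"},
]
result = frequent_sets(rows, min_count=3)
for attrs, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(attrs), count)
```

In this toy data the attribute `a` occurs in every row, so it combines with each of the other frequent attributes to form a further frequent set. This gives a small-scale picture of why very frequent attributes enlarge the number of sets that must be counted.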
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Coenen, F., Leng, P. (2002). Finding Association Rules with Some Very Frequent Attributes. In: Elomaa, T., Mannila, H., Toivonen, H. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2002. Lecture Notes in Computer Science, vol 2431. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45681-3_9
DOI: https://doi.org/10.1007/3-540-45681-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44037-6
Online ISBN: 978-3-540-45681-0
eBook Packages: Springer Book Archive