Abstract
For many data mining problems in order to solve them it is required to discover frequent patterns. Frequent itemsets are useful e.g. in the discovery of association and episode rules, sequential patterns and clusters. Nevertheless, the number of frequent itemsets is usually huge. Therefore, a number of lossless representations of frequent itemsets have recently been proposed. Two of such representations, namely the closed itemsets and the generators representation, are of particular interest as they can efficiently be applied for the discovery of most interesting non-redundant association and episode rules. On the other hand, it has been proved experimentally that other representations of frequent patterns happen to be more concise and more quickly extractable than these two representations even by several orders of magnitude. Hence, such concise representations seem to be an interesting alternative for materializing and reusing the knowledge of frequent patterns. The problem however arises, how to transform the intermediate representations into the desired ones efficiently and preferably without accessing the database. This article tackles this problem. As a result of investigating the properties of representations of frequent patterns, we offer a set of efficient algorithms for dataless transitioning between them.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., and Verkamo, A.I. (1996). Fast Discovery of Association Rules. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining. AAAI, CA.
Bastide, Y., Taouil, R., Pasquier, N., Stumme, G., and Lakhal, L. (2000). Mining Frequent Patterns with Counting Inference. ACM SIGKDD Explorations, 2(2), 66-75.
Bykowski, A. and Rigotti, C. (2001). A Condensed Representation to Find Frequent Patterns. In Proc. of PODS' 01 (pp. 267-273). Santa Barbara, CA, USA: ACM.
Boulicaut, J.F., Bykowski, A., and Rigotti, C. (2000). Approximation of Frequency Queries by Means of Free-Sets. In Proc. of PKDD 00 (pp. 75-85). Lyon, France: Springer.
Calders, T. and Goethals, B. (2002). Mining All Non-derivable Frequent Itemsets. In Proc. of PKDD' 02 (pp. 74-85). Helsinki, Finland: Springer.
Han, J. and Kamber, M. (2000). Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems, Morgan Kaufmann Publishers.
Harms, S.K., Deogun, J., Saquer, J., and Tadesse, T. (2001). Discovering Representative Episodal Association Rules from Event Sequences Using Frequent Closed Episode Sets and Event Constraints. In Proc. of ICDM' 01 (pp. 603-606). San Jose, CA, USA: IEEE Computer Society.
Kryszkiewicz, M. (2001a). Closed Set Based Discovery of Representative Association Rules. In Proc. of IDA' 01 (pp. 350-359). Cascais, Portugal: Springer.
Kryszkiewicz, M. (2001b). Concise Representation of Frequent Patterns based on Disjunction-free Generators. In Proc. of ICDM' 01 (pp. 305-312). San Jose, CA, USA: IEEE Computer Society.
Kryszkiewicz, M. (2002a). Concise Representations of Association Rules. In Proc. of ESF' 02 (pp. 92-109). London, UK: Springer.
Kryszkiewicz, M. (2002b). Concise Representation of Frequent Patterns And Association Rules, Habilitation Thesis, Publishing House of the Warsaw University of Technology.
Kryszkiewicz, M. (2002c). Converting Generators Representation and Closed Itemset Representations of Frequent Patterns. ICS Research Report, No. 27, Warsaw University of Technology.
Kryszkiewicz, M. and Gajek, M. (2002a). Concise Representation of Frequent Patterns based on Generalized Disjunction-Free Generators. In Proc. of PAKDD' 02 (pp. 159-171). Taipei, Taiwan: Springer.
Kryszkiewicz, M. and Gajek, M. (2002b). Why to Apply Generalized Disjunction-Free Generators Representation of Frequent Patterns? In Proc. of ISMIS' 02 (pp. 383-392). Lyon, France: Springer.
Pasquier, N. (2000). DM: Algorithmes d'extraction et de Réduction des Règles d'association dans les Bases de Données, Ph.D., Univ. B. Pascal-Clermont-Ferrand II.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999a). Discovering Frequent Closed Itemsets for Association Rules. In Proc. of ICDT' 99 (pp. 398-416). Jerusalem, Israel: Springer.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999b). Efficient Mining of Association Rules Using Closed Itemset Lattices. Journal of Information Systems, 24(1), 25-46.
Pei, J., Han, J., and Mao, R. (2000). CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. In Proc. of DMKD'00 (pp. 21-30). Dallas, TX, USA: ACM.
Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., and Lakhal, L. (2000). Fast Computation of Concept Lattices Using Data Mining Techniques. In Proc. of KRDB' 00 (pp. 129-139). Berlin, Germany.
Zaki, M.J. and Hsiao, C.J. (2002). CHARM: An Efficient Algorithm for Closed Itemset Mining. In Proc. of 2nd SIAM Int. Conf. on Data Mining, Arlington.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Kryszkiewicz, M., Rybiński, H. & Gajek, M. Dataless Transitions Between Concise Representations of Frequent Patterns. Journal of Intelligent Information Systems 22, 41–70 (2004). https://doi.org/10.1023/A:1025828729955
Issue Date:
DOI: https://doi.org/10.1023/A:1025828729955