Mining Interesting Patterns Using Estimated Frequencies from Subpatterns and Superpatterns

  • Conference paper
Discovery Science (DS 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2843)

Abstract

In knowledge discovery in databases, the number of discovered patterns is often too large for a human to understand, so less important patterns must be filtered out. For this purpose, a number of interestingness measures have been introduced; conventional measures evaluate a pattern by how much its actual frequency exceeds the frequency predicted from its subpatterns. Such measures may assign high scores not only to a pattern consisting of a set of strongly correlated items but also to its subpatterns, and in many cases it is unnecessary to select all of these subpatterns as interesting. To reduce this redundancy, we propose a new approach to evaluating the interestingness of patterns. We use a measure that evaluates how much the actual frequency of a pattern exceeds the frequency predicted not only from its subpatterns but also from its superpatterns. By adding an estimate from superpatterns, our measure filters out redundant subpatterns more effectively than conventional measures. We discuss the effectiveness of our interestingness measure through a set of experimental results.
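The redundancy described in the abstract can be illustrated with a small sketch. The abstract does not state the measure's formula, so everything below is an assumption made for illustration: the toy transactions, the independence-based predictor for subpatterns, the use of the most frequent one-item extension as the superpattern estimate, and the function names are all hypothetical stand-ins, not the authors' definitions.

```python
# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "c"},
    {"a"}, {"b"}, {"c"}, {"d"}, {"d"},
]
ITEMS = {"a", "b", "c", "d"}


def frequency(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    pattern = set(pattern)
    return sum(pattern <= t for t in transactions) / len(transactions)


def predicted_from_subpatterns(pattern, transactions):
    """Assumed predictor: product of single-item frequencies (independence)."""
    estimate = 1.0
    for item in pattern:
        estimate *= frequency({item}, transactions)
    return estimate


def predicted_from_superpatterns(pattern, transactions, items):
    """Assumed predictor: the highest frequency among one-item extensions.

    Any superpattern's frequency is a lower bound on the pattern's own
    frequency, so a pattern that barely exceeds it adds little information.
    """
    pattern = set(pattern)
    extensions = [pattern | {i} for i in items if i not in pattern]
    return max((frequency(e, transactions) for e in extensions), default=0.0)


def subpattern_score(pattern, transactions):
    """Conventional-style score: actual frequency over the subpattern estimate."""
    return frequency(pattern, transactions) / predicted_from_subpatterns(
        pattern, transactions
    )


def combined_score(pattern, transactions, items):
    """Score against the larger of the two estimates (an assumed combination)."""
    baseline = max(
        predicted_from_subpatterns(pattern, transactions),
        predicted_from_superpatterns(pattern, transactions, items),
    )
    actual = frequency(pattern, transactions)
    return actual / baseline if baseline > 0 else 0.0


for p in ({"a", "b"}, {"a", "b", "c"}):
    print(sorted(p), subpattern_score(p, TRANSACTIONS),
          combined_score(p, TRANSACTIONS, ITEMS))
```

In this toy data the items a, b, and c are strongly correlated. The subpattern-only score rates both {a, b} and {a, b, c} above 1, whereas the combined score leaves {a, b} at 1 because its frequency is fully explained by its superpattern {a, b, c}. This is the kind of redundant subpattern the abstract aims to filter out.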

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yoshida, Y., Ohta, Y., Kobayashi, K., Yugami, N. (2003). Mining Interesting Patterns Using Estimated Frequencies from Subpatterns and Superpatterns. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_51

  • DOI: https://doi.org/10.1007/978-3-540-39644-4_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20293-6

  • Online ISBN: 978-3-540-39644-4

  • eBook Packages: Springer Book Archive
