Hybrid Minimal Spanning Tree and Mixture of Gaussians Based Clustering Algorithm

Vathy-Fogarassy, Agnes; Kiss, Attila; Abonyi, Janos

doi:10.1007/11663881_18

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3861)

Included in the following conference series:

International Symposium on Foundations of Information and Knowledge Systems

Abstract

Clustering is an important tool for exploring the hidden structure of large databases. Several algorithms exist, based on different approaches (hierarchical, partitional, density-based, model-based, etc.). Most of these algorithms have shortcomings: for example, they cannot detect clusters with non-convex shapes, they require the number of clusters to be known a priori, or they suffer from numerical problems such as sensitivity to initialization. In this paper we introduce a new clustering algorithm based on the synergistic combination of hierarchical, graph-theoretic minimal spanning tree based clustering and partitional Gaussian mixture model based clustering. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to reduce the number of heuristically defined parameters, and thereby the influence of the user on the clustering results. As the illustrative examples show, the proposed algorithm can detect clusters of arbitrary shape and does not suffer from the numerical problems of Gaussian mixture based clustering algorithms.
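The general idea of combining the two families of methods can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in the abstract. It assumes a fixed number of clusters `k`, cuts the `k - 1` longest edges of a Euclidean minimal spanning tree to obtain an initial partition, and then estimates one Gaussian per part. The Gaussians use diagonal covariances, which is a further simplification. All function names (`mst_edges`, `cut_longest`, `gaussian_params`, `responsibilities`) are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (dist, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n      # (distance to the tree, nearest tree vertex)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def cut_longest(n, edges, k):
    """Drop the k-1 longest MST edges and label the connected components (assumed rule)."""
    kept = sorted(edges)[: len(edges) - (k - 1)]
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

def gaussian_params(points, labels, k):
    """Weight, mean and diagonal variance of each component, taken from the MST partition."""
    dim = len(points[0])
    params = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        weight = len(members) / len(points)
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        var = [sum((p[d] - mean[d]) ** 2 for p in members) / len(members) + 1e-6
               for d in range(dim)]   # small constant avoids zero variance
        params.append((weight, mean, var))
    return params

def responsibilities(x, params):
    """One E-step: posterior probability of each Gaussian component for point x."""
    logs = []
    for weight, mean, var in params:
        log_p = math.log(weight)
        for xd, m, v in zip(x, mean, var):
            log_p += -0.5 * math.log(2 * math.pi * v) - (xd - m) ** 2 / (2 * v)
        logs.append(log_p)
    top = max(logs)                   # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = cut_longest(len(points), mst_edges(points), k=2)
params = gaussian_params(points, labels, k=2)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(responsibilities((0.5, 0.5), params))
```

Starting the mixture from the tree-based partition means no random initialization is involved, which is one way to read the abstract's claim about avoiding initialization sensitivity. A full implementation would iterate the E- and M-steps and would not need `k` to be fixed in advance.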

Author information

Authors and Affiliations

Department of Mathematics and Computing, University of Veszprem, P.O. Box 158, Veszprém, H-8201, Hungary
Agnes Vathy-Fogarassy
Department of Information Systems, Eötvös Lóránd University, P.O. Box 120, H-1518, Budapest, Hungary
Attila Kiss
Department of Process Engineering, University of Veszprem, P.O. Box 158, Veszprém, H-8201, Hungary
Janos Abonyi

Editor information

Editors and Affiliations

Clausthal University of Technology, Julius-Albert-Str. 4, 38678, Clausthal-Zellerfeld, Germany
Jürgen Dix
Department of Computing Science, Umeå University, SE-901 87, Umeå, Sweden
Stephen J. Hegner

About this paper

Cite this paper

Vathy-Fogarassy, A., Kiss, A., Abonyi, J. (2006). Hybrid Minimal Spanning Tree and Mixture of Gaussians Based Clustering Algorithm. In: Dix, J., Hegner, S.J. (eds) Foundations of Information and Knowledge Systems. FoIKS 2006. Lecture Notes in Computer Science, vol 3861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11663881_18

DOI: https://doi.org/10.1007/11663881_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31782-1
Online ISBN: 978-3-540-31784-5
eBook Packages: Computer Science, Computer Science (R0)
