Abstract
Fast centroid-based clustering algorithms such as k-means usually converge to a local optimum. In this work, we propose a method for constructing a better clustering from two suboptimal clustering solutions, exploiting the fact that each suboptimal clustering typically contains some of the correct clusters. The proposed method, COTCLUS, selects two centroids from one clustering and replaces them with two centroids from the other clustering so that the maximum decrease in the mean square error of the first clustering is achieved. After the centroids are modified, a few iterations of k-means are applied for fine-tuning. In an iterative algorithm, the resulting clustering is further improved using a newly generated clustering. The proposed method can find the optimal clustering in a very small number of iterations for datasets with well-separated clusters. Experiments demonstrate that the proposed method outperforms the selected competing methods.
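To make the idea concrete, the following is a minimal sketch of one centroid-exchange step as the abstract describes it: given two suboptimal centroid sets, try replacing a pair of centroids in the first set with a pair from the second, keep the exchange that most reduces the mean square error, and fine-tune with a few k-means iterations. This is an illustration of the general idea only; the function names (`mse`, `kmeans_refine`, `cotclus_step`) and the exhaustive pair search are assumptions, not the authors' actual implementation.

```python
import numpy as np
from itertools import combinations

def mse(X, centroids):
    # Mean squared distance from each point to its nearest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def kmeans_refine(X, centroids, iters=3):
    # A few Lloyd iterations for fine-tuning the modified centroids.
    c = centroids.copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for j in range(len(c)):
            pts = X[labels == j]
            if len(pts):
                c[j] = pts.mean(axis=0)
    return c

def cotclus_step(X, A, B, refine_iters=3):
    # Replace the pair of centroids in A by the pair from B that yields
    # the largest MSE decrease (exhaustive search over pairs, for clarity),
    # then fine-tune the result with a few k-means iterations.
    best, best_err = A, mse(X, A)
    for i, j in combinations(range(len(A)), 2):      # pair removed from A
        for p, q in combinations(range(len(B)), 2):  # pair taken from B
            cand = A.copy()
            cand[i], cand[j] = B[p], B[q]
            err = mse(X, cand)
            if err < best_err:
                best, best_err = cand, err
    return kmeans_refine(X, best, refine_iters)
```

For well-separated clusters, a single such exchange can recover a cluster that the first solution missed entirely (e.g., when two of its centroids sit inside the same true cluster), which is why only a few iterations are needed.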
Rezaei, M. Improving a Centroid-Based Clustering by Using Suitable Centroids from Another Clustering. J Classif 37, 352–365 (2020). https://doi.org/10.1007/s00357-018-9296-4