Abstract
Rodriguez et al. published an algorithm called clustering by fast search and find of density peaks (DPC) in Science in June 2014. It can quickly search the density peaks and cluster the datasets efficiently. However, there are some drawbacks. First, the local density definition is simple for the datasets with both dense clusters and sparse clusters; the density peaks cannot be found correctly using the two local density definition methods. Second, there is poor assignment fault tolerance, if a point is misallocated, the subsequent assignment will further amplify the error, which will have a serious impact on the clustering results. To solve the problems, a new clustering method, density peak clustering based on cumulative nearest neighbors degree and micro cluster merging, is proposed. The proposed method improves the DPC algorithm in two ways, the one is that the method defines a new local density to solve the defect of the DPC algorithm; the other one is that the graph degree linkage is combined with the DPC to alleviate the problem of distribution errors. The experiments on synthetic and real-world datasets show that the proposed method outperforms DPC, DBSCAN, OPTICS, AP, K-Means and other DPC variant algorithms.
Similar content being viewed by others
Change history
19 November 2019
The Publisher regrets an error on the printed front cover of the October 2019 issue. The issue numbers were incorrectly listed as Volume 91, Nos. 10-12, October 2019. The correct number should be: "Volume 91, No. 10, October 2019"
References
Cui, Z., Zhang, J., Wang, Y., Cao, Y., Cai, X., Zhang, W., & Chen, J. (2019). A pigeon-inspired optimization algorithm for many-objective optimization problems[J]. SCIENCE CHINA Information Sciences, 62(7), 070212. https://doi.org/10.1007/s11432-018-9729-5.
Zhang, M., Wang, H., Cui, Z., et al. (2018). Hybrid multi-objective cuckoo search with dynamical local search[J]. Memetic Computing, 10(2), 199–208.
Zhihua Cui, Bin Sun, Gaige Wang, et al. A novel oriented cuckoo search algorithm to improve DV-hop performance for cyber-physical systems[J]. Journal of Parallel and Distributed Computing, 2017, 103:42–52.
Cai, X., Gao, X.-z., & Xue, Y. (2016). Improved bat algorithm with optimal forage strategy and random disturbance strategy[J]. International Journal of Bio-inspired Computation, 8(4), 205–214.
Gai, K., & Qiu, M. (2018). Blend arithmetic operations on tensor-based fully homomorphic encryption over real numbers[J]. IEEE Transactions on Industrial Informatics, 14(8), 3590–3598.
Gai, K., Choo, K. K. R., Qiu, M., & Zhu, L. (2018). Privacy-preserving content-oriented wireless communication in internet-of-things[J]. IEEE Internet of Things Journal, 5(4), 3059–3067.
Wang, G., Cai, X., Cui, Z., et al. (2017). High performance computing for cyber physical social systems by using evolutionary multi-objective optimization algorithm[J]. IEEE Transactions on Emerging Topics in Computing. https://doi.org/10.1109/TETC.2017.2703784.
Qiu, M., Gai, K., & Xiong, Z. (2018). Privacy-preserving wireless communications using bipartite matching in social big data[J]. Future Generation Computer Systems, 87, 772–781.
Wu, P., Lu, Z., Zhou, Q., Lei, Z., Li, X., Qiu, M., & Hung, P. C. K. (2019). Big data logs analysis based on seq2seq networks for cognitive internet of things[J]. Future Generation Computer Systems, 90, 477–488.
Cui, Z., Xue, F., Cai, X., Cao, Y., Wang, G.-g., & Chen, J. (2018). Detection of malicious code variants based on deep learning[J]. IEEE Transactions on Industrial Informatics, 14(7), 3187–3196.
Cui, Z., Cao, Y., Cai, X., Cai, J., & Chen, J. (2017). Optimal LEACH protocol with modified bat algorithm for big data sensing systems in internet of things. Journal of Parallel and Distributed Computing. https://doi.org/10.1016/j.jpdc.2017.12.014.
Macqueen, J. (1967). Some methods for classification and analysis of multi variate observations[C]. Proceedings of Berkeley symposium on mathematical statistics &probability, 281–297.
Zhang, T., Ramakrishnan, R., & Livny, M. (1997). BIRCH: A new data clustering algorithm and its applications [J]. Data Mining and Knowledge Discovery, 1(2), 141–182.
Ester M. A. (1996). Density-based algorithm for discovering clusters in large spatial databases with noise[C]. Proceedings of the second ACM international conference on knowledge discovery and data mining, 226–231.
Frey, B. J., & Dueck, D. (2007). Clustering by passing messages between data points[J]. Science, 315(5814), 972–976.
Wei, W., Yang, J., & Muntz, R. R. (1997). STING: A statistical information grid approach to spatial data mining[C]. Proceedings of the 23rd international conference on very large data bases, 186–195.
Rodriguez, A., & Laio, A. (2014). Clustering by fast search and find of density peaks[J]. Science, 344(6191), 1492–1496.
Xu, M., Li, Y., Li, R., Zou, F., & Gu, X. (2019). EADP: An extended adaptive density peaks clustering for overlapping community detection in social networks[J]. Neurocomputing, 337, 287–302.
Mehmood, R., El-Ashram, S., Bie, R., et al. (2018). Effective cancer subtyping by employing density peaks clustering by using gene expression microarray[J]. Personal and Ubiquitous Computing, 22(3), 615–619.
Liao, E., & Liu, C. (2018). A hierarchical algorithm based on density peaks clustering and ant Colony optimization for traveling salesman problem[J]. IEEE Access, 6, 38921–38933.
Zhang, W., Wang, X., & Zhao, D., et al. (2012). Graph degree linkage: Agglomerative clustering on a directed graph[C]. Proceedings of the European conference on computer vision, 428–441.
Zhou, Z., Gangquan, S., Yanbin, Z., et al. (2018). Robust clustering by identifying the veins of clusters based on kernel density estimation[J]. Knowledge-Based Systems, 159, 309–320.
Xue, X., Gan, S., Peng, H., et al. (2018). Improved density peaks clustering algorithm combining K-nearest neighbors[J]. Computer Engineering and Applications, 54(7), 36–43.
Du, M., Ding, S., & Xue, Y. (2017). A robust density peaks clustering algorithm using fuzzy neighborhood[J]. International Journal of Machine Learning & Cybernetics, 12, 1–10.
Zang, W., Ren, L., Zhang, W., & Liu, X. (2017). Automatic density peaks clustering using DNA genetic algorithm optimized data field and Gaussian process[J]. International Journal of Pattern Recognition and Artificial Intelligence, 31(8), 1750023.
Qiu, B., & Cheng, L. (2018). A parameter-free clustering algorithm based on Laplace centrality and density peaks. Journal of Computer Applications, 38(9), 2511–2514.
Yaohui, L., Zhengming, M., & Fang, Y. (2017). Adaptive density peak clustering based on K-nearest neighbors with aggregating strategy [J]. Knowledge-Based Systems, 133, 208–220.
Xie, J., Gao, H., Xie, W., Liu, X., & Grant, P. W. (2016). Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K-nearest neighbors[J]. Information Sciences, 354, 19–40.
Ankerst, M., Breunig, M. M., & Kriegel, H.-P., et al. (1999). Optics: Ordering points to identify the clustering structure[C]. Proceedings of the ACM Sigmod Record, 49–60.
Vinh, N., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance [J]. Journal of Machine Learning Research, 11(1), 2837–2854.
Fowlkes, E. B., & Mallows, C. L. (1983). A method for comparing two hierarchical Clusterings [J]. Journal of the American Statistical Association, 78(383), 553–569.
Liu, R., Wang, H., & Yu, X. (2018). Shared-nearest-neighbor-based clustering by fast search and find of density peaks [J]. Information Sciences, 450, 200–226.
Jain, A.K., & Law, M.H. (2005). Data clustering: A user’s dilemma[C]. Proceedings of the international conference on pattern recognition and machine intelligence, : 1–10.
Chang, H., & Yeung, D.-Y. (2008). Robust path-based spectral clustering[J]. Pattern Recognition, 41(1), 191–203.
Gionis, A., Mannila, H., & Tsaparas, P. (2007). Clustering aggregation[J]. ACM Transactions on Knowledge Discovery from Data, 1(1), 1–30.
Fu, L., & Medico, E. (2007). FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data[J]. BMC Bioinformatics, 8(1), 3.
Veenman, C. J., Reinders, M. J. T., & Backer, E. (2002). A maximum variance cluster algorithm[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), 1273–1280.
Franti, P., Virmajoki, O., & Hautamaki, V. (2006). Fast agglomerative clustering using a k-nearest neighbor graph[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11), 1875–1881.
Frnti, P., & Virmajoki, O. (2006). Iterative shrinking method for clustering problems[J]. Pattern Recognition, 39(5), 761–775.
Bache, K., & Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml. Irvine: University of California.
Street, W. N., Wolberg, W. H., & Mangasarian, O. L. (1993). Nuclear feature extraction for breast tumor diagnosis[C]. Proceedings of the IS&T/SPIE International Symposium on Electronic Imaging:Science and Technology, 1905, 861–870.
Charytanowicz, M., Niewczas, J., Kulczycki, P., et al. (2010). Complete gradient clustering algorithm for features analysis of x-ray images [J]. Advances in Intelligent and Soft Computing, 69, 15-24.
Dias, D. B., Madeo, R. C. B., & Rocha T., et al. (2009). Hand movement recognition for brazilian sign language: A study using distance-based neural networks[C]. Proceedings of the international joint on neural networks, 697–704.
Sigillito, V. G., Wing, S. P., Hutton, L. V., et al. (1989). Classification of radar returns from the ionosphere using neural networks[J]. Johns Hopkins APL Technical Digest, 10(3), 262–266.
Breiman, L., Friedman, J., Stone, C. J., et al. (1984). Classification and regression trees[M]. Boca Raton: CRC Press.
Ding, J., He, X., Yuan, J., & Jiang, B. (2018). Automatic clustering based on density peak detection using generalized extreme value distribution[J]. Soft Computing, 22, 2777–2796.
Acknowledgments
This research was supported by the National Natural Science Foundation of China under Grant (Nos. 71433003, 51669014), the Science Fund for Distinguished Young Scholars of Jiangxi Province under Grant (No. 2018ACB21029).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Xu, L., Zhao, J., Yao, Z. et al. Density Peak Clustering Based on Cumulative Nearest Neighbors Degree and Micro Cluster Merging. J Sign Process Syst 91, 1219–1236 (2019). https://doi.org/10.1007/s11265-019-01459-4
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11265-019-01459-4