Abstract
Clustering by fast search and find of density peaks (CFSFDP) clusters data by finding density peaks. CFSFDP rests on two assumptions: a cluster center is a data point of higher density than its surrounding neighbors, and it lies at a large distance from other cluster centers. Based on these assumptions, CFSFDP supports a heuristic approach, known as the decision graph, for manually selecting cluster centers. This manual selection is a major limitation of CFSFDP in intelligent data analysis. In this paper, we propose a fuzzy-CFSFDP method that selects the cluster centers adaptively. It uses fuzzy rules, based on the aforementioned assumptions, to select the cluster centers. We performed a number of experiments on nine synthetic clustering datasets and compared the resulting clusters with those of state-of-the-art methods. The clustering results and the comparisons on synthetic data validate the robustness and effectiveness of the proposed fuzzy-CFSFDP method.
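The two assumptions can be made concrete with a minimal Python sketch. It computes a local density and a distance to the nearest denser point for each data point, then picks centers automatically. The function names, the Gaussian density kernel, the cutoff `d_c`, the min-max memberships, and the `threshold` are all illustrative assumptions. In particular, the selection rule below is a simple stand-in and not the fuzzy rules proposed in the paper, which the abstract does not specify.

```python
import numpy as np

def density_and_distance(points, d_c):
    """Per-point local density (rho) and distance to the nearest denser point (delta)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumption: a Gaussian kernel with cutoff d_c; subtract 1 to drop the self term.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices from densest to sparsest
    delta = np.empty(len(points))
    # The densest point has no denser neighbor, so it gets the largest distance.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, len(order)):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()
    return rho, delta

def select_centers(rho, delta, threshold=0.5):
    """Hypothetical rule: a center must be both 'dense' AND 'far from denser points'."""
    # Min-max scaling serves as a crude membership degree in [0, 1] (an assumption).
    mu_rho = (rho - rho.min()) / (rho.max() - rho.min())
    mu_delta = (delta - delta.min()) / (delta.max() - delta.min())
    score = np.minimum(mu_rho, mu_delta)  # fuzzy AND taken as the minimum
    return np.flatnonzero(score >= threshold)

# Toy data: two 5x5 grids of points, centered at (0, 0) and (10, 0).
g = np.linspace(-1.0, 1.0, 5)
xs, ys = np.meshgrid(g, g)
grid = np.column_stack([xs.ravel(), ys.ravel()])
points = np.vstack([grid, grid + np.array([10.0, 0.0])])

rho, delta = density_and_distance(points, d_c=1.0)
centers = select_centers(rho, delta)
print(points[centers])
```

On this toy data, only the middle point of each grid scores highly on both criteria, so no decision graph has to be inspected by hand. Real datasets would need a more careful choice of `d_c` and of the selection rule.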
Acknowledgments
This research is sponsored by the National Natural Science Foundation of China (Nos. 61171014, 61371185, 61401029, 61472044, 61472403, 61571049), the Fundamental Research Funds for the Central Universities (Nos. 2014KJJCB32, 2013NT57), and SRF for ROCS, SEM.
Cite this article
Bie, R., Mehmood, R., Ruan, S. et al. Adaptive fuzzy clustering by fast search and find of density peaks. Pers Ubiquit Comput 20, 785–793 (2016). https://doi.org/10.1007/s00779-016-0954-4
DOI: https://doi.org/10.1007/s00779-016-0954-4