Abstract
To address the sensitivity of the traditional spectral clustering algorithm to the Gaussian kernel parameter σ, this paper constructs the similarity matrix using a similarity measure based on data density, inspired by the density-sensitive similarity measure. The measure enlarges the distance between pairs of points that lie in different high-density regions and reduces it for pairs of points within the same density region, so that the complex spatial distribution of the data can be captured. Following this idea, we design two similarity measures, neither of which introduces the Gaussian kernel parameter σ; the main difference between them is that the first incorporates a shortest-path distance while the second does not. The second measure shows better overall performance, and experiments confirm that it improves the stability of the whole algorithm. In its final stage, spectral clustering applies k-means (or another traditional clustering algorithm) to the selected eigenvectors, but k-means is sensitive to the initial cluster centers. We therefore design a simple and effective method for optimizing the initial centers, yielding an improved k-means, and apply it within the proposed spectral clustering algorithm. Experimental results on UCI datasets show that the improved k-means makes the clustering even more stable.
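The abstract describes the pipeline only in outline, so the following Python sketch is illustrative rather than the authors' method: it builds a density-sensitive affinity (in the spirit of the density-sensitive measures named as inspiration; the stretching factor rho, the 1/(1+d) distance-to-similarity conversion, and the optional shortest-path step are all assumptions of this sketch), embeds the data with a normalized graph Laplacian, and clusters the eigenvectors with k-means, using k-means++ seeding as a stand-in for the paper's unspecified initial-center optimization.

import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def density_sensitive_affinity(X, rho=2.0, use_shortest_path=True):
    # Stretch Euclidean distances so that segments crossing low-density
    # gaps become long (rho > 1 is a hypothetical stretching factor);
    # optionally replace them with graph shortest paths. No sigma appears.
    d = cdist(X, X)
    stretched = rho ** d - 1.0
    if use_shortest_path:
        stretched = shortest_path(stretched, method="D", directed=False)
    return 1.0 / (1.0 + stretched)      # turn distances into similarities

def spectral_then_kmeans(W, k, seed=0):
    # Normalized spectral embedding, then k-means on the row-normalized
    # eigenvectors; k-means++ seeding stands in for the paper's own
    # initial-center optimization, which the abstract does not specify.
    deg = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return KMeans(n_clusters=k, init="k-means++", n_init=10,
                  random_state=seed).fit_predict(U)

# Example: two half-moon clusters, a shape plain k-means handles poorly.
from sklearn.datasets import make_moons
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
labels = spectral_then_kmeans(density_sensitive_affinity(X), k=2)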
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Yan, J., Cheng, D., Zong, M., Deng, Z. (2014). Improved Spectral Clustering Algorithm Based on Similarity Measure. In: Luo, X., Yu, J.X., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2014. Lecture Notes in Computer Science(), vol 8933. Springer, Cham. https://doi.org/10.1007/978-3-319-14717-8_50
DOI: https://doi.org/10.1007/978-3-319-14717-8_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14716-1
Online ISBN: 978-3-319-14717-8
eBook Packages: Computer Science (R0)