Abstract
Data redundancy is frequently encountered in biological data. Locality preserving projection (LPP) is a dimensionality reduction approach, inspired by biological processes, that mitigates data redundancy while preserving the essential geometry of the data. Its application can benefit fuzzy c-means (FCM) clustering. However, existing locality-preserving FCM clustering methods, which combine LPP with FCM, focus only on local information, which may make them somewhat conservative. A novel FCM clustering method, namely projected fuzzy double c-means clustering using sparse self-representation (PFD SSR), is developed in this paper. The main idea of PFD SSR is three-fold: (1) a sparse self-representation (SSR) method, inspired by biological processes, is employed to capture the global data distribution and thereby enhance clustering performance; (2) LPP is applied to both the raw data and the dictionary matrix obtained by SSR, which greatly reduces the feature dimension while preserving the intrinsic data distribution. In addition, regularization terms for these two projected quantities are introduced into the FCM objective function, which helps reduce the risk of being trapped in local optima during model training; and (3) an alternating direction technique is applied to learn the model. Experimental results on 11 datasets, including 6 biological datasets, demonstrate that the proposed method outperforms state-of-the-art clustering methods. The proposed subspace clustering method handles high-dimensional data well, especially biological data.
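For orientation, the baseline that PFD SSR builds on can be sketched as follows. This is a minimal illustration of standard FCM only, alternating between center and membership updates; it does not include the SSR, LPP, or regularization terms of the proposed method, whose exact objective is not given in the abstract. The function name fuzzy_c_means and its parameters (fuzzifier m, iteration cap, tolerance, seed) are assumptions chosen for this sketch.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM sketch: alternate center and membership updates.

    X is an (n, d) data matrix, c the number of clusters, m > 1 the fuzzifier.
    Returns the (n, c) membership matrix U and the (c, d) center matrix V.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances, shape (n, c); the floor avoids division by zero.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Stop once the memberships no longer change noticeably.
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

Calling fuzzy_c_means(X, c=3) on an (n, d) array returns soft memberships whose rows sum to 1; a hard assignment can be read off with U.argmax(axis=1). In a projected variant, X would be replaced by its low-dimensional projection, but how PFD SSR couples the projection with the clustering objective is left to the full paper.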






Data Availability
The data that support the findings of this study are openly available in a public repository at http://www.ics.uci.edu/ml.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grants (Nos. 62073223, 61873169), the Natural Science Foundation of Shanghai under Grant 22ZR1443400, and the Open Project of Key Laboratory of Aerospace Flight Dynamics and National Defense Science and Technology under Grant 6142210200304.
Ethics declarations
Conflict of Interest
The authors declare no competing interests.
About this article
Cite this article
Tian, X., Sun, C., Sun, Y. et al. A Biologically-Inspired Sparse Self-Representation Approach for Projected Fuzzy Double C-Means Clustering. Cogn Comput 15, 2202–2215 (2023). https://doi.org/10.1007/s12559-023-10185-w
DOI: https://doi.org/10.1007/s12559-023-10185-w