Abstract
Data clustering refers to partition the original data set into some subsets such that every vertex belongs to one or more subsets at the same time. For graph data that composed by attribute information of vertices as well as structural information between vertices, how to make an efficient clustering is not an easy thing. In this paper, we propose a novel method of how to partition graph data into some overlapping subgraph data in aspect of rough set theory. At first, we introduce a detailed description about the global similarity measurement of vertices. After that, an objective-function oriented optimization model is constructed in terms of updating fuzzy membership degree and cluster center that based on the theory of rough set. Obviously, the determined cluster is no longer a fuzzy set, but a rough set, that is to say, the cluster is expressed by the upper approximation set and lower approximation set. Finally, eleven real-world graph data and four synthetic graph data are applied to verify the validity of the proposed fuzzy clustering algorithm. The experimental results show that our algorithm is better than existing clustering approach to some extent.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Without loss of generality, in this paper we adhere to the hypothesis that \(e_{jj^{\prime }}\in \{0,1\}\) for all \(j, j^{\prime }=1,2,\ldots ,n\). What is more, it should be pointed out that \(e_{jj^{\prime }}=e_{j^{\prime }j}\), that is to say, \({\mathscr {G}}=({\mathscr {X}},{\mathscr {E}})\) is an undirected graph data.
For a special case that \(m=0\), then FCM turns into the famous k-means clustering algorithm.
In fact, the parameter \(\beta\) is not fixed in advance. In Sect. 5, we will discuss the problem that which value of \(\beta\) is the best for certain data set.
References
Wen LL, Zhou KL, Yang SL (2019) A shape-based clustering method for pattern recognition of residential electricity consumption. J Clean Prod 212:475–488
Li C, Kulwa F, Zhang JH et al (2021) A review of clustering methods in microorganism image analysis. Inf Technol Biomed 1186:13–25
Kookueva VV, Tsertseil JS (2018) Clustering as a basis for an innovative development strategy. Eur Res Stud J 21:818–830
Pocol CB, Marinescu V, Dabija DC et al (2021) Clustering generation Z university students based on daily fruit and vegetable consumption: empirical research in an emerging market. Br Food J 123:2706–2727
Allahyari M, Pouriyeh S, Assefi M et al (2017) A brief survey of text mining: classification, clustering and extraction techniques. arXiv:1707.02919
Janani R, Vijayarani S (2019) Text document clustering using spectral clustering algorithm with particle swarm optimization. Expert Syst Appl 134:192–200
Steinbach M, Ert\(\ddot{o}\)z L, Kumar V (2004) The challenges of clustering high dimensional data. In: New directions in statistical physics. Springer, Berlin, pp 273–309
Afzali M, Kumar S (2019) Text document clustering: issues and challenges. In: Proceedings of the 2019 international conference on machine learning, big data, cloud and parallel computing, pp 263–268
Bothorel C, Cruz JD, Magnani M et al (2015) Clustering attributed graphs: models, measures and methods. Netw Sci 3:408–444
Li ZT, Liu J, Wu K (2018) A multiobjective evolutionary algorithm based on structural and attribute similarities for community detection in attributed networks. IEEE Trans Cybern 48:1963–1976
Zhou HF, Li J, Li JH et al (2017) A graph clustering method for community detection in complex networks. Phys A 469:551–562
Qin XW, Han XX, Chu JW et al (2021) Density peaks clustering based on Jaccard similarity and label propagation. Cogn Comput 13:1609–1626
Xu J (2011) Graph clustering algorithm based on the degree and the number of vertices. Dalian Maritime University, pp 2–3
Xiong H, Wu JJ, Chen J (2009) K-means clustering versus validation measures: a data-distribution perspective. IEEE Trans Syst Man Cybern Part B (Cybernetics) 39:318–331
Bezdek JC, Ehrlich R, Full W (1984) FCM: the fuzzy c-means clustering algorithm. Comput Geosci 10:191–203
Zhou KL, Yang SL (2016) Exploring the uniform effect of FCM clustering: a data distribution perspective. Knowl Based Syst 96:76–83
Wang C, Pan SR, Hu RQ et al (2019) Attributed graph clustering: a deep attentional embedding approach. arXiv:1906.06532
Fan SH, Wang X, Shi C et al (2020) One2multi graph autoencoder for multi-view graph clustering. In: Proceedings of the web conference, pp 3070–3076
Li XL, Zhang HW, Zhang R et al (2021) Adaptive graph auto-encoder for general data clustering. IEEE Trans Pattern Anal Mach Intell 1–9
Liao RJ, Brockschmidt M, Tarlow D et al (2018) Graph partition neural networks for semi-supervised classification. arXiv:1803.06272
Zhang XT, Liu H, Li QM et al (2019) Attributed graph clustering via adaptive graph convolution. arXiv:1906.01210
Lee C, Wilkinson DJ (2019) A review of stochastic block models and extensions for graph clustering. Appl Netw Sci 4:1–50
Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2:718–729
Kulis B, Basu S, Dhillon I et al (2009) Semi-supervised graph clustering: a kernel approach. Mach Learn 74:1–22
Li XC, Yin HZ, Zhou K et al (2020) Semi-supervised clustering with deep metric learning and graph embedding. World Wide Web 23:781–798
Zhang Y, Wu B, Liu Y (2017) A novel community detection method based on rough set k-means. J Electron Inf Technol 39:770–777
Gupta S, Kumar P (2020) An overlapping community detection algorithm based on rough clustering of links. Data Knowl Eng 125:101777
Cai YD, Huang JZ, Yin JF (2022) A new method to build the adaptive k-nearest neighbors similarity graph matrix for spectral clustering. Neurocomputing 493:191–203
Callum S, Wang JB, Li YY (2020) Quantum walk inspired algorithm for graph similarity and isomorphism. Quantum Inf Process 19:280–299
Li JY, Jiang W, Han H et al (2021) ScGSLC: an unsupervised graph similarity learning framework for single-cell RNA-seq data clustering. Comput Biol Chem 90:107415
Fouss F, Pirotte A, Renders JM et al (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans Knowl Data Eng 19:355–369
Cai HY, Zheng VW, Chang KC (2017) A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans Knowl Data Eng 30:1616–1637
Jinarat S, Manaskasemsak B, Rungsawang A (2019) Short text clustering based on word semantic graph with word embedding model. In: Proceedings of the 10th international conference on soft computing and intelligent systems (SCIS), pp 1427–1432
Pawlak Z (1982) Rough sets. Int J Comput Inform Sci 11:341–356
Hashemzadeh M, Oskouei AG, Farajzadeh N (2019) New fuzzy c-means clustering method based on feature-weight and cluster-weight learning. Appl Soft Comput 78:324–345
Knuth DE (1993) The Stanford graphbase: a platform for combinatorial computing. Addison-Wesley, Reading
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:1–22
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 7821–7826
White JG, Southgate E, Thomson JN et al (1986) The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc Lond 314:1–340
Prithviraj S, Galileo N, Mustafa B et al (2008) Collective classification of network data. AI Mag 29:93–106
Zhou WP, Lu L (2010) Link prediction based on local random walk. EPL (Europhysics Letters) 89:58007
Mitra S, Banka H, Pedrycz W (2006) Rough-fuzzy collaborative clustering. IEEE Trans Syst Man Cybern Part B (Cybernetics) 36:795–805
Ding Y, Fu X (2016) Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm. Neurocomputing 188:233–238
Zhan H, Chen P, Zhang XF (2020) Overlapping community partition based on rough fuzzy clustering algorithm. Inf Syst Eng 3:89–94
Acknowledgements
We are hugely grateful to the possible anonymous reviewers for their constructive comments with respect to the original manuscript. What is more, we thank the National Natural Science Foundation of China (Nos. 61966039, 11971065) and the Yunnan Province Education Department Scientific Research Fund Project (No. 2021Y670).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
He, W., Liu, S., Xu, W. et al. On rough set based fuzzy clustering for graph data. Int. J. Mach. Learn. & Cyber. 13, 3463–3490 (2022). https://doi.org/10.1007/s13042-022-01607-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-022-01607-6