Abstract
As the statistical analysis of networks finds application in an increasing number of disciplines, novel methodologies are needed to handle the complexity of network data. Cluster analysis is among the most successful and ubiquitous techniques for data exploration and characterisation. In this work, we focus on how to represent ensembles of networks for fuzzy clustering. We explore three network representations, based on probability distributions, autoencoders and joint embedding. We compare de facto standard fuzzy computational procedures for clustering multiple networks on synthetic data. Finally, we apply this approach to a real-world case study.
Acknowledgements
This work has been partially funded by the BiBiNet project (H35F21000430002) within POR-Lazio FESR 2014–2020.
Ethics declarations
Conflict of interest
The authors declare that they have no competing financial or non-financial interests.
Additional information
Responsible editor: Srinivasan Parthasarathy.
About this article
Cite this article
Bombelli, I., Manipur, I., Guarracino, M.R. et al. Representing ensembles of networks for fuzzy cluster analysis: a case study. Data Min Knowl Disc 38, 725–747 (2024). https://doi.org/10.1007/s10618-023-00977-x
DOI: https://doi.org/10.1007/s10618-023-00977-x