Abstract
Maximizing modularity is currently the most widely-used community detection method in applications. Modularity comes with a parameter that indirectly controls the granularity of the resulting clustering. Moreover, one can choose this parameter in such a way that modularity maximization becomes equivalent to maximizing the likelihood of a stochastic block model. Thus, this method is statistically justified, while at the same time, it is known to have a bias towards fine-grained clusterings. In this work, we introduce a heuristic to correct for this bias. This heuristic is based on prior work where modularity is described in geometric terms. This has led to a broad generalization of modularity-based community detection methods, and the heuristic presented in this paper applies to each of them. We justify the heuristic by describing a relation between several distances that we observe to hold in many instances. We prove that, assuming the validity of this relation, our heuristic leads to a clustering of the same granularity as the ground-truth clustering. We compare our heuristic to likelihood-based community detection methods on several synthetic graphs and show that our method indeed results in clusterings with granularity closer to the granularity of the ground-truth clustering. Moreover, our heuristic often outperforms likelihood maximization in terms of similarity to the ground-truth clustering.
Supported by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation NETWORKS grant no. 024.002.003.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Or, equivalently, the Chung-Lu model.
- 2.
- 3.
We use \(\xi =0.25\), degree bounds 4 and 100, community-size bounds 10 and \(\lfloor n^{3/4}\rfloor \).
References
Arenas, A., Fernandez, A., Gomez, S.: Analysis of the structure of complex networks at different resolution levels. New J. Phys. 10(5), 053039 (2008)
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech: Theory Exp. 2008(10), P10008 (2008)
Fortunato, S., Barthélemy, M.: Resolution limit in community detection. Proc. Natl. Acad. Sci. 104(1), 36–41 (2007). https://doi.org/10.1073/pnas.0605965104. https://www.pnas.org/content/104/1/36
Gösgens, M., van der Hofstad, R., Litvak, N.: The hyperspherical geometry of community detection: modularity as a distance. J. Mach. Learn. Res. 24, 1–33 (2023)
Gösgens, M., Zhiyanov, A., Tikhonov, A., Prokhorenkova, L.: Good classification measures and how to find them. Adv. Neural. Inf. Process. Syst. 34, 17136–17147 (2021)
Gösgens, M.M., Tikhonov, A., Prokhorenkova, L.: Systematic analysis of cluster similarity indices: how to validate validation measures. In: International Conference on Machine Learning, pp. 3799–3808. PMLR (2021)
Hunter, P.R., Gaston, M.A.: Numerical index of the discriminatory ability of typing systems: an application of simpson’s index of diversity. J. Clin. Microbiol. 26(11), 2465–2466 (1988)
Jaccard, P.: The distribution of the flora in the alpine zone. 1. New Phytol. 11(2), 37–50 (1912)
Kamiński, B., Pankratz, B., Prałat, P., Théberge, F.: Modularity of the ABCD random graph model with community structure. J. Complex Netw. 10(6), cnac050 (2022)
Kamiński, B., Prałat, P., Théberge, F.: Artificial benchmark for community detection (ABCD)-fast random graph model with community structure. Netw. Sci. 9(2), 153–178 (2021)
Kumpula, J.M., Saramäki, J., Kaski, K., Kertész, J.: Limited resolution in complex network community detection with Potts model approach. Eur. Phys. J. B 56(1), 41–45 (2007)
Lancichinetti, A., Fortunato, S.: Limits of modularity maximization in community detection. Phys. Rev. E 84(6), 066122 (2011)
Lichev, L., Mitsche, D.: On the modularity of 3-regular random graphs and random graphs with given degree sequences. Random Struct. Algorithms 61(4), 754–802 (2022)
McDiarmid, C., Skerman, F.: Modularity of erdős-rényi random graphs. Random Struct. Algorithms 57(1), 211–243 (2020)
Newman, M.E.: Equivalence between modularity optimization and maximum likelihood methods for community detection. Phys. Rev. E 94(5), 052315 (2016)
Newman, M.E., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)
Peixoto, T.P.: Descriptive vs. inferential community detection: pitfalls, myths and half-truths. arXiv preprint arXiv:2112.00183 (2021)
Prokhorenkova, L.: Using synthetic networks for parameter tuning in community detection. In: Avrachenkov, K., Prałat, P., Ye, N. (eds.) WAW 2019. LNCS, vol. 11631, pp. 1–15. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25070-6_1
Prokhorenkova, L., Tikhonov, A.: Community detection through likelihood optimization: in search of a sound model. In: The World Wide Web Conference, pp. 1498–1508 (2019)
Ostroumova Prokhorenkova, L., Prałat, P., Raigorodskii, A.: Modularity of complex networks models. In: Bonato, A., Graham, F.C., Prałat, P. (eds.) WAW 2016. LNCS, vol. 10088, pp. 115–126. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49787-7_10
Rand, W.M.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846–850 (1971)
Reichardt, J., Bornholdt, S.: Statistical mechanics of community detection. Phys. Rev. E 74(1), 016110 (2006)
Traag, V.A., Van Dooren, P., Nesterov, Y.: Narrow scope for resolution-limit-free community detection. Phys. Rev. E 84(1), 016114 (2011)
Zhang, L., Peixoto, T.P.: Statistical inference of assortative community structures. Phys. Rev. Res. 2(4), 043271 (2020)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gösgens, M., van der Hofstad, R., Litvak, N. (2023). Correcting for Granularity Bias in Modularity-Based Community Detection Methods. In: Dewar, M., Prałat, P., Szufel, P., Théberge, F., Wrzosek, M. (eds) Algorithms and Models for the Web Graph. WAW 2023. Lecture Notes in Computer Science, vol 13894. Springer, Cham. https://doi.org/10.1007/978-3-031-32296-9_1
Download citation
DOI: https://doi.org/10.1007/978-3-031-32296-9_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-32295-2
Online ISBN: 978-3-031-32296-9
eBook Packages: Computer ScienceComputer Science (R0)