Partial and Overlapping Community Detection in Multiplex Social Networks

Afsarmanesh Tehrani, Nazanin; Magnani, Matteo

doi:10.1007/978-3-030-01159-8_2

Nazanin Afsarmanesh Tehrani^16,17 &
Matteo Magnani^16,17

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11186))

Included in the following conference series:

International Conference on Social Informatics

1410 Accesses
4 Citations

Abstract

We extend the popular clique percolation method to multiplex networks. Our extension requires to rethink the basic concepts on which the original clique percolation algorithm is based, including cliques and clique adjacency, to handle the presence of multiple types of ties. We also provide an experimental characterization of the communities that our method can identify.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: JPY 3498; Price includes VAT (Japan)

eBook: JPY 5719; Price includes VAT (Japan)

Softcover Book: JPY 7149; Price includes VAT (Japan)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Pushing the Envelope in Overlapping Communities Detection

An Algorithm for Detecting Communities in Social Networks

Article 19 October 2015

Finding multifaceted communities in multiplex networks

Article Open access 24 June 2024

Notes

1.
Using the generalized Louvain method, every pair (v, l) is included in exactly one community.
2.
https://CRAN.R-project.org/package=multinet.

References

Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)
Article MathSciNet Google Scholar
Coscia, M., Giannotti, F., Pedreschi, D.: A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 4(5), 512–546 (2011)
Article MathSciNet Google Scholar
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)
Article Google Scholar
Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J.P., Moreno, Y., Porter, M.A.: Multiplex Networks. J. Complex Netw. 2(3), 203–271 (2014)
Article Google Scholar
Dickison, M.E., Magnani, M., Rossi, L.: Multiplex Social Networks. Cambridge University Press, Cambridge (2016)
Book Google Scholar
Sun, Y., Han, J.: Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lectures on Data Mining and Knowledge Discovery. Morgan & Claypool Publishers, San Rafael (2012)
Google Scholar
Bott, H.: Observation of play activities in a nursery school. Genet. Psychol. Monogr. 4, 44–88 (1928)
Google Scholar
Moreno, J.L.: Who Shall Survive? A New Approach to the Problem of Human Interrelations. Nervous and Mental Disease Publishing Co., Washington, D. C. (1934)
Book Google Scholar
Xie, J., Kelley, S., Szymanski, B.K.: overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. (CSUR) 45(4) (2013)
Article Google Scholar
Kumpula, J.M., Kivelä, M., Kaski, K., Saramäki, J.: Sequential algorithm for fast clique percolation. Phys. Rev. 78(2), 026109 (2008)
Google Scholar
Yan, B., Gregory, S.: Detecting communities in networks by merging cliques. In: IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2009, vol. 1, pp. 832–836. IEEE (2009)
Google Scholar
Nepusz, T., Petróczi, A., Négyessy, L., Bazsó, F.: Fuzzy communities and the concept of bridgeness in complex networks. Phys. Rev. E 77(1), 016107 (2008)
Article MathSciNet Google Scholar
Zhang, S., Wang, R.-S., Zhang, X.-S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Phys. A Stat. Mech. Its Appl. 374(1), 483–490 (2007)
Article Google Scholar
Ahn, Y.-Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466(7307), 761–764 (2010)
Article Google Scholar
Evans, T.S., Lambiotte, R.: Line graphs, link partitions, and overlapping communities. Phys. Rev. E 80(1), 016105 (2009)
Article Google Scholar
Bothorel, C., Cruz, J.D., Magnani, M., Micenkova, B.: Clustering attributed graphs: models, measures and methods. Netw. Sci. 3(3), 408–444 (2015)
Article Google Scholar
Berlingerio, M., Coscia, M., Giannotti, F.: Finding and characterizing communities in multidimensional networks. In: International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 490–494 (2011)
Google Scholar
Cai, D., Shao, Z., He, X., Yan, X., Han, J.: Community mining from multi-relational networks. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 445–452. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_44
Chapter Google Scholar
Rodriguez, M.A., Shinavier, J.: Exposing multi-relational networks to single-relational network analysis algorithms. J. Inf. 4(1), 29–41 (2010)
Google Scholar
Tang, L., Wang, X., Liu, H.: Community detection via heterogeneous interaction analysis. Data Min. Knowl. Discov. 25(1), 1–33 (2012)
Article MathSciNet Google Scholar
Mucha, P.J., Richardson, T., Macon, K., Porter, M.A., Onnela, J.-P.: Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–8 (2010)
Article MathSciNet Google Scholar
Berlingerio, M., Pinelli, F., Calabrese, F.: ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS. Data Min. Knowl. Discov. 27(3), 294320 (2013)
Article MathSciNet Google Scholar
De Domenico, M., Lancichinetti, A., Arenas, A., Rosvall, M.: Identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems. Phys. Rev. X 5(1), 11027 (2015)
Google Scholar
Afsarmanesh, N., Magnani, M.: Finding overlapping communities in multiplex networks. https://arxiv.org/abs/1602.03746
Magnani, M., Rossi, L.: The ML-model for multi-edge type social networks. In: International Conference on Social Network Analysis and Mining (ASONAM), pp. 5–12. IEEE Computer Society (2011)
Google Scholar
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)
Article Google Scholar
Rossi, L., Magnani, M.: Towards effective visual analytics on multiplex and multilayer networks. Chaos Solitons Fractals 72, 68–76 (2015)
Article MathSciNet Google Scholar
Leskovec, J., Lang, K.J., Mahoney, M.W., Dasgupta, A.: Statistical properties of community structure in large social and information networks. In: Proceeding of the 17th International Conference on World Wide Web - WWW 2008, p. 695 (2008)
Google Scholar

Download references

Acknowledgments

This work was partially funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 732027.

Author information

Authors and Affiliations

Gavagai, Stockholm, Sweden
Nazanin Afsarmanesh Tehrani & Matteo Magnani
InfoLab, Department of Information Technology, Uppsala University, Uppsala, Sweden
Nazanin Afsarmanesh Tehrani & Matteo Magnani

Authors

Nazanin Afsarmanesh Tehrani
View author publications
You can also search for this author in PubMed Google Scholar
Matteo Magnani
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Matteo Magnani .

Editor information

Editors and Affiliations

University of Koblenz, Koblenz, Germany
Steffen Staab
National Research University Higher School of Economics, St. Petersburg, Russia
Olessia Koltsova
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov

A Algorithm

The algorithm is divided into three parts, as in the original method: finding cliques, building the adjacency graph, and extracting communities.

1.1 A.1 Finding Cliques

Our algorithms starts by locating all maximal k-m-AND-cliques. Algorithm 1 is an extension of Bron–Kerbosch’s algorithm. It is a recursive algorithm where the recursive step takes a clique A as input and returns all maximal k-m-AND-cliques containing A that can be constructed using vertices in B, with \(k \ge \overline{k}\) and \(m \ge \overline{m}\). In this way, given a multiplex network \(M = (V,E,\mathcal {L})\), a call to \(\text {find-cliques}(\{\}, V, \{\}, \overline{k}, \overline{m})\) with \(\overline{k}>1\) and \(\overline{m}>0\) returns all maximal cliques in M with \(k \ge \overline{k}\) and \(m \ge \overline{m}\).

The algorithm works by updating two sets: one containing vertices that can be used to extend the currently processed clique, and one to keep track of already examined cliques. More precisely, the parameter B is a set of vertices such that, for every vertex \(n \in B\), \(A \cup \{n\}\) is a previously unseen clique on at least \(\overline{m}\) edge types. Whenever a vertex from B is used to extend the clique A, then B is updated by removing those vertices that are no longer connected to all vertices in the new clique \(A'\). C is the same as B, but containing those vertices that have already been examined by the algorithm during some previous iteration, so that no duplicates are produced. Given a set of vertices A we notate the set of edge types where the vertices in A form a clique as L(A), and the number of vertices in A as S(A). Therefore, if \(|L(A)|=0\) then A is not a clique on any edge type. We also define \(\max (\emptyset ) = 0\).

As an example, assume we call \(\text {find-cliques}(\{\}, \{0, 1, \dots , 9\}, \{\}, 3, 1)\) on the network in Fig. 3. The algorithm would then start exploring one of the vertices in B, let us say 5. The new call will thus include in B only those vertices that can still form a clique on at least one edge type when joined with 5: \(\text {find-cliques}(\{5\}, \{2, 3, 4, 6, 8\}, \{\}, 3, 1)\). Let us now assume that at the next iteration 3 is added to the current clique: \(\text {find-cliques}(\{3, 5\}, \{2, 6\}, \{\}, 3, 1)\), and then 6: \(\text {find-cliques}(\{3, 5, 6\}, \{2\}, \{\}, 3, 1)\). At this point S(A) is 3, satisfying the minimum clique size, and \(|L(\{3, 5, 6\})| = 2 > \max (\{|L(A \cup \{b\})| : b \in B\}) = \max (\{|L(\{2, 3, 5, 6\})|\}) = 1\). In fact, \(\{3, 5, 6\}\) is a clique on two edge types, while \(\{2, 3, 5, 6\}\) is a clique on only one edge type. Therefore, the current clique is returned as a maximal one (c4 in Fig. 3). At the next iteration, \(\text {find-cliques}(\{2, 3, 5, 6\}, \{\}, \{\}, 3, 1)\) is called and clique c3 is returned. At some later point, the algorithm would call \(\text {find-cliques}(\{2, 3, 5\}, \{\}, \{6\}, 3, 1)\), not returning any clique because \(C = \{6\}\) indicates that this path has already been explored, and \(\text {find-cliques}(\{4, 5\}, \{2\}, \{3\}, 3, 1)\), ultimately leading to the discovery of clique c1, and so on.

1.2 A.2 Clique-Adjacency Graph

In a simple graph, each clique is included in exactly one community, therefore, communities can be identified from a clique-clique overlap matrix (see [3] for the details). However, this statement is not necessarily true for k-m-AND-cliques and the corresponding communities. Because of the more complicated relations between cliques, instead of the overlap matrix used in the original method we generate an adjacency graph as the one represented in Fig. 3, where each vertex of the graph corresponds to a maximal clique and an edge between two vertices indicates that the corresponding cliques share at least k vertices and at least m edge types on all of their pairs of vertices. In the graph we have indicated for each vertex the edge types where the corresponding clique is defined. In the following we refer to this graph as clique-adjacency graph.

1.3 A.3 From the Clique-Adjacency Graph to Communities

As previously mentioned, each clique can be included in different communities with different combinations of its adjacent cliques. Here our objective is using the adjacency graph and the information regarding the edge labels simultaneously to find communities in the Multiplex network. Two cliques can be included in at least one community if: (1) there exists a path between the corresponding vertices in the clique-adjacency graph, and (2) for all vertices in the path the corresponding cliques share at least m edge types on all of their pairs. We call the latter rule the cliques’ constraint, and we call a tree maximal if no other adjacent clique can be added without reducing the maximal m for which the cliques’ constraint is satisfied. Therefore, each community in the original network corresponds to a maximal tree in the clique-adjacency graph where the cliques’ constraint holds for all vertices in the tree. So the problem is equivalent to recognizing all such maximal trees in the graph.

Algorithm 3 takes a community A as input and returns all maximal communities containing A. In this algorithm, given a set of cliques (that is, a community) A we notate the set of edge types we are currently considering to find a maximal community as \(\varLambda (A)\). Notice that \(\varLambda (A) \subseteq L(A)\): we are going to run this algorithm multiple times starting from the same clique with different \(\varLambda (A)\)’s, that is, looking for communities in different sets of edge types. We also define N(c) as the neighbors of c in the adjacency graph, as before.

Algorithm 3 consists of two phases. First, it finds a maximal community on \(\varLambda (A)\) (lines 1–15). Then, it recursively does the same for all subsets of \(\varLambda (A)\) for which larger communities can be found (lines 17–20). B contains the set of cliques that may be used to extend the community A.

To understand the algorithm we invite the reader to start inspecting lines 1–3, 7–8, and 13–15: this is just a simple depth-first visit of the adjacency matrix, finding the largest connected component containing A only traversing cliques that exist on all the edge types in \(\varLambda (A)\) (line 3). This visit, that finds the maximal community containing A on \(\varLambda (A)\), is extended in two ways. First, lines 4–6 make sure that we do not produce communities that have already been found starting from another clique: if we encounter an already processed clique containing all the edge types in L(A), then this community must have been found while processing it. Second, lines 9–12 cover the case when during the visit we encounter a neighboring clique that only contains a subset \(L'\) of \(\varLambda (A)\). This means that there is a community containing A that is maximal on \(L'\). Consequently, for each \(L'\) we encounter during the process we will later call Algorithm 3 again to extract a maximal community containing A on those edge types (line 17–19). Line 16 makes sure that we do not process the same set \(L'\) more then once (notice that the elements of D are sets of edge types). The fact that the different \(L'\)’s correspond to cliques that we have encountered while visiting the adjacency matrix guarantees that we are not running the algorithm for all the subsets of L(A) but only for those for which a community exists.

Algorithm 2 uses Algorithm 3 to iterate over all cliques and finds all maximal communities containing the clique under examination for any \(m \ge \overline{m}\). Therefore, at the end it outputs all maximal communities.

1.4 A.4 Time Complexity

Finding maximal cliques is an NP-hard problem, that is, one for which even small datasets can take too long to be processed. In practice, as for its simple-graph version, the execution time of the multiplex clique percolation algorithm depends on the input data and in particular on the number and size of cliques.

Table 3 shows the algorithm’s execution time for real multiplex social networks of increasing size from an online repository (http://multilayer.it.uu.se), on a 2,4 GHz Intel Core i7 desktop computer with 16 GB RAM. We have repeated each execution 5 times, observing similar results (the mode is indicated in the table) and we have stopped the computation after 10 min. The drastic drop in execution time for the dkpol dataset (including follow, mention and retweet edge types from Twitter) when stricter constraints are used, and the execution times on the ff-tw-yt data, representing common users from three online social networks, highlight how the computation time strongly depends on the data structure and is not linearly dependent on the network size. At the same time, these experiments show the effect of the constraints on execution time.

Table 3. Execution time (s)

Full size table

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Afsarmanesh Tehrani, N., Magnani, M. (2018). Partial and Overlapping Community Detection in Multiplex Social Networks. In: Staab, S., Koltsova, O., Ignatov, D. (eds) Social Informatics. SocInfo 2018. Lecture Notes in Computer Science(), vol 11186. Springer, Cham. https://doi.org/10.1007/978-3-030-01159-8_2

Download citation

DOI: https://doi.org/10.1007/978-3-030-01159-8_2
Published: 20 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01158-1
Online ISBN: 978-3-030-01159-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics