Abstract
We extend the popular clique percolation method to multiplex networks. Our extension requires to rethink the basic concepts on which the original clique percolation algorithm is based, including cliques and clique adjacency, to handle the presence of multiple types of ties. We also provide an experimental characterization of the communities that our method can identify.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Using the generalized Louvain method, every pair (v, l) is included in exactly one community.
- 2.
References
Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)
Coscia, M., Giannotti, F., Pedreschi, D.: A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 4(5), 512–546 (2011)
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)
Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J.P., Moreno, Y., Porter, M.A.: Multiplex Networks. J. Complex Netw. 2(3), 203–271 (2014)
Dickison, M.E., Magnani, M., Rossi, L.: Multiplex Social Networks. Cambridge University Press, Cambridge (2016)
Sun, Y., Han, J.: Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lectures on Data Mining and Knowledge Discovery. Morgan & Claypool Publishers, San Rafael (2012)
Bott, H.: Observation of play activities in a nursery school. Genet. Psychol. Monogr. 4, 44–88 (1928)
Moreno, J.L.: Who Shall Survive? A New Approach to the Problem of Human Interrelations. Nervous and Mental Disease Publishing Co., Washington, D. C. (1934)
Xie, J., Kelley, S., Szymanski, B.K.: overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. (CSUR) 45(4) (2013)
Kumpula, J.M., Kivelä, M., Kaski, K., Saramäki, J.: Sequential algorithm for fast clique percolation. Phys. Rev. 78(2), 026109 (2008)
Yan, B., Gregory, S.: Detecting communities in networks by merging cliques. In: IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2009, vol. 1, pp. 832–836. IEEE (2009)
Nepusz, T., Petróczi, A., Négyessy, L., Bazsó, F.: Fuzzy communities and the concept of bridgeness in complex networks. Phys. Rev. E 77(1), 016107 (2008)
Zhang, S., Wang, R.-S., Zhang, X.-S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Phys. A Stat. Mech. Its Appl. 374(1), 483–490 (2007)
Ahn, Y.-Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466(7307), 761–764 (2010)
Evans, T.S., Lambiotte, R.: Line graphs, link partitions, and overlapping communities. Phys. Rev. E 80(1), 016105 (2009)
Bothorel, C., Cruz, J.D., Magnani, M., Micenkova, B.: Clustering attributed graphs: models, measures and methods. Netw. Sci. 3(3), 408–444 (2015)
Berlingerio, M., Coscia, M., Giannotti, F.: Finding and characterizing communities in multidimensional networks. In: International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 490–494 (2011)
Cai, D., Shao, Z., He, X., Yan, X., Han, J.: Community mining from multi-relational networks. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 445–452. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_44
Rodriguez, M.A., Shinavier, J.: Exposing multi-relational networks to single-relational network analysis algorithms. J. Inf. 4(1), 29–41 (2010)
Tang, L., Wang, X., Liu, H.: Community detection via heterogeneous interaction analysis. Data Min. Knowl. Discov. 25(1), 1–33 (2012)
Mucha, P.J., Richardson, T., Macon, K., Porter, M.A., Onnela, J.-P.: Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–8 (2010)
Berlingerio, M., Pinelli, F., Calabrese, F.: ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS. Data Min. Knowl. Discov. 27(3), 294320 (2013)
De Domenico, M., Lancichinetti, A., Arenas, A., Rosvall, M.: Identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems. Phys. Rev. X 5(1), 11027 (2015)
Afsarmanesh, N., Magnani, M.: Finding overlapping communities in multiplex networks. https://arxiv.org/abs/1602.03746
Magnani, M., Rossi, L.: The ML-model for multi-edge type social networks. In: International Conference on Social Network Analysis and Mining (ASONAM), pp. 5–12. IEEE Computer Society (2011)
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)
Rossi, L., Magnani, M.: Towards effective visual analytics on multiplex and multilayer networks. Chaos Solitons Fractals 72, 68–76 (2015)
Leskovec, J., Lang, K.J., Mahoney, M.W., Dasgupta, A.: Statistical properties of community structure in large social and information networks. In: Proceeding of the 17th International Conference on World Wide Web - WWW 2008, p. 695 (2008)
Acknowledgments
This work was partially funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 732027.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
A Algorithm
A Algorithm
The algorithm is divided into three parts, as in the original method: finding cliques, building the adjacency graph, and extracting communities.
1.1 A.1 Finding Cliques
Our algorithms starts by locating all maximal k-m-AND-cliques. Algorithm 1 is an extension of Bron–Kerbosch’s algorithm. It is a recursive algorithm where the recursive step takes a clique A as input and returns all maximal k-m-AND-cliques containing A that can be constructed using vertices in B, with \(k \ge \overline{k}\) and \(m \ge \overline{m}\). In this way, given a multiplex network \(M = (V,E,\mathcal {L})\), a call to \(\text {find-cliques}(\{\}, V, \{\}, \overline{k}, \overline{m})\) with \(\overline{k}>1\) and \(\overline{m}>0\) returns all maximal cliques in M with \(k \ge \overline{k}\) and \(m \ge \overline{m}\).
The algorithm works by updating two sets: one containing vertices that can be used to extend the currently processed clique, and one to keep track of already examined cliques. More precisely, the parameter B is a set of vertices such that, for every vertex \(n \in B\), \(A \cup \{n\}\) is a previously unseen clique on at least \(\overline{m}\) edge types. Whenever a vertex from B is used to extend the clique A, then B is updated by removing those vertices that are no longer connected to all vertices in the new clique \(A'\). C is the same as B, but containing those vertices that have already been examined by the algorithm during some previous iteration, so that no duplicates are produced. Given a set of vertices A we notate the set of edge types where the vertices in A form a clique as L(A), and the number of vertices in A as S(A). Therefore, if \(|L(A)|=0\) then A is not a clique on any edge type. We also define \(\max (\emptyset ) = 0\).
As an example, assume we call \(\text {find-cliques}(\{\}, \{0, 1, \dots , 9\}, \{\}, 3, 1)\) on the network in Fig. 3. The algorithm would then start exploring one of the vertices in B, let us say 5. The new call will thus include in B only those vertices that can still form a clique on at least one edge type when joined with 5: \(\text {find-cliques}(\{5\}, \{2, 3, 4, 6, 8\}, \{\}, 3, 1)\). Let us now assume that at the next iteration 3 is added to the current clique: \(\text {find-cliques}(\{3, 5\}, \{2, 6\}, \{\}, 3, 1)\), and then 6: \(\text {find-cliques}(\{3, 5, 6\}, \{2\}, \{\}, 3, 1)\). At this point S(A) is 3, satisfying the minimum clique size, and \(|L(\{3, 5, 6\})| = 2 > \max (\{|L(A \cup \{b\})| : b \in B\}) = \max (\{|L(\{2, 3, 5, 6\})|\}) = 1\). In fact, \(\{3, 5, 6\}\) is a clique on two edge types, while \(\{2, 3, 5, 6\}\) is a clique on only one edge type. Therefore, the current clique is returned as a maximal one (c4 in Fig. 3). At the next iteration, \(\text {find-cliques}(\{2, 3, 5, 6\}, \{\}, \{\}, 3, 1)\) is called and clique c3 is returned. At some later point, the algorithm would call \(\text {find-cliques}(\{2, 3, 5\}, \{\}, \{6\}, 3, 1)\), not returning any clique because \(C = \{6\}\) indicates that this path has already been explored, and \(\text {find-cliques}(\{4, 5\}, \{2\}, \{3\}, 3, 1)\), ultimately leading to the discovery of clique c1, and so on.
1.2 A.2 Clique-Adjacency Graph
In a simple graph, each clique is included in exactly one community, therefore, communities can be identified from a clique-clique overlap matrix (see [3] for the details). However, this statement is not necessarily true for k-m-AND-cliques and the corresponding communities. Because of the more complicated relations between cliques, instead of the overlap matrix used in the original method we generate an adjacency graph as the one represented in Fig. 3, where each vertex of the graph corresponds to a maximal clique and an edge between two vertices indicates that the corresponding cliques share at least k vertices and at least m edge types on all of their pairs of vertices. In the graph we have indicated for each vertex the edge types where the corresponding clique is defined. In the following we refer to this graph as clique-adjacency graph.
1.3 A.3 From the Clique-Adjacency Graph to Communities
As previously mentioned, each clique can be included in different communities with different combinations of its adjacent cliques. Here our objective is using the adjacency graph and the information regarding the edge labels simultaneously to find communities in the Multiplex network. Two cliques can be included in at least one community if: (1) there exists a path between the corresponding vertices in the clique-adjacency graph, and (2) for all vertices in the path the corresponding cliques share at least m edge types on all of their pairs. We call the latter rule the cliques’ constraint, and we call a tree maximal if no other adjacent clique can be added without reducing the maximal m for which the cliques’ constraint is satisfied. Therefore, each community in the original network corresponds to a maximal tree in the clique-adjacency graph where the cliques’ constraint holds for all vertices in the tree. So the problem is equivalent to recognizing all such maximal trees in the graph.
Algorithm 3 takes a community A as input and returns all maximal communities containing A. In this algorithm, given a set of cliques (that is, a community) A we notate the set of edge types we are currently considering to find a maximal community as \(\varLambda (A)\). Notice that \(\varLambda (A) \subseteq L(A)\): we are going to run this algorithm multiple times starting from the same clique with different \(\varLambda (A)\)’s, that is, looking for communities in different sets of edge types. We also define N(c) as the neighbors of c in the adjacency graph, as before.
Algorithm 3 consists of two phases. First, it finds a maximal community on \(\varLambda (A)\) (lines 1–15). Then, it recursively does the same for all subsets of \(\varLambda (A)\) for which larger communities can be found (lines 17–20). B contains the set of cliques that may be used to extend the community A.
To understand the algorithm we invite the reader to start inspecting lines 1–3, 7–8, and 13–15: this is just a simple depth-first visit of the adjacency matrix, finding the largest connected component containing A only traversing cliques that exist on all the edge types in \(\varLambda (A)\) (line 3). This visit, that finds the maximal community containing A on \(\varLambda (A)\), is extended in two ways. First, lines 4–6 make sure that we do not produce communities that have already been found starting from another clique: if we encounter an already processed clique containing all the edge types in L(A), then this community must have been found while processing it. Second, lines 9–12 cover the case when during the visit we encounter a neighboring clique that only contains a subset \(L'\) of \(\varLambda (A)\). This means that there is a community containing A that is maximal on \(L'\). Consequently, for each \(L'\) we encounter during the process we will later call Algorithm 3 again to extract a maximal community containing A on those edge types (line 17–19). Line 16 makes sure that we do not process the same set \(L'\) more then once (notice that the elements of D are sets of edge types). The fact that the different \(L'\)’s correspond to cliques that we have encountered while visiting the adjacency matrix guarantees that we are not running the algorithm for all the subsets of L(A) but only for those for which a community exists.
Algorithm 2 uses Algorithm 3 to iterate over all cliques and finds all maximal communities containing the clique under examination for any \(m \ge \overline{m}\). Therefore, at the end it outputs all maximal communities.
1.4 A.4 Time Complexity
Finding maximal cliques is an NP-hard problem, that is, one for which even small datasets can take too long to be processed. In practice, as for its simple-graph version, the execution time of the multiplex clique percolation algorithm depends on the input data and in particular on the number and size of cliques.
Table 3 shows the algorithm’s execution time for real multiplex social networks of increasing size from an online repository (http://multilayer.it.uu.se), on a 2,4 GHz Intel Core i7 desktop computer with 16 GB RAM. We have repeated each execution 5 times, observing similar results (the mode is indicated in the table) and we have stopped the computation after 10 min. The drastic drop in execution time for the dkpol dataset (including follow, mention and retweet edge types from Twitter) when stricter constraints are used, and the execution times on the ff-tw-yt data, representing common users from three online social networks, highlight how the computation time strongly depends on the data structure and is not linearly dependent on the network size. At the same time, these experiments show the effect of the constraints on execution time.
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Afsarmanesh Tehrani, N., Magnani, M. (2018). Partial and Overlapping Community Detection in Multiplex Social Networks. In: Staab, S., Koltsova, O., Ignatov, D. (eds) Social Informatics. SocInfo 2018. Lecture Notes in Computer Science(), vol 11186. Springer, Cham. https://doi.org/10.1007/978-3-030-01159-8_2
Download citation
DOI: https://doi.org/10.1007/978-3-030-01159-8_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01158-1
Online ISBN: 978-3-030-01159-8
eBook Packages: Computer ScienceComputer Science (R0)