Partial and Overlapping Community Detection in Multiplex Social Networks | SpringerLink
Skip to main content

Partial and Overlapping Community Detection in Multiplex Social Networks

  • Conference paper
  • First Online:
Social Informatics (SocInfo 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11186))

Included in the following conference series:

Abstract

We extend the popular clique percolation method to multiplex networks. Our extension requires to rethink the basic concepts on which the original clique percolation algorithm is based, including cliques and clique adjacency, to handle the presence of multiple types of ties. We also provide an experimental characterization of the communities that our method can identify.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Using the generalized Louvain method, every pair (vl) is included in exactly one community.

  2. 2.

    https://CRAN.R-project.org/package=multinet.

References

  1. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)

    Article  MathSciNet  Google Scholar 

  2. Coscia, M., Giannotti, F., Pedreschi, D.: A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 4(5), 512–546 (2011)

    Article  MathSciNet  Google Scholar 

  3. Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)

    Article  Google Scholar 

  4. Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J.P., Moreno, Y., Porter, M.A.: Multiplex Networks. J. Complex Netw. 2(3), 203–271 (2014)

    Article  Google Scholar 

  5. Dickison, M.E., Magnani, M., Rossi, L.: Multiplex Social Networks. Cambridge University Press, Cambridge (2016)

    Book  Google Scholar 

  6. Sun, Y., Han, J.: Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lectures on Data Mining and Knowledge Discovery. Morgan & Claypool Publishers, San Rafael (2012)

    Google Scholar 

  7. Bott, H.: Observation of play activities in a nursery school. Genet. Psychol. Monogr. 4, 44–88 (1928)

    Google Scholar 

  8. Moreno, J.L.: Who Shall Survive? A New Approach to the Problem of Human Interrelations. Nervous and Mental Disease Publishing Co., Washington, D. C. (1934)

    Book  Google Scholar 

  9. Xie, J., Kelley, S., Szymanski, B.K.: overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. (CSUR) 45(4) (2013)

    Article  Google Scholar 

  10. Kumpula, J.M., Kivelä, M., Kaski, K., Saramäki, J.: Sequential algorithm for fast clique percolation. Phys. Rev. 78(2), 026109 (2008)

    Google Scholar 

  11. Yan, B., Gregory, S.: Detecting communities in networks by merging cliques. In: IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2009, vol. 1, pp. 832–836. IEEE (2009)

    Google Scholar 

  12. Nepusz, T., Petróczi, A., Négyessy, L., Bazsó, F.: Fuzzy communities and the concept of bridgeness in complex networks. Phys. Rev. E 77(1), 016107 (2008)

    Article  MathSciNet  Google Scholar 

  13. Zhang, S., Wang, R.-S., Zhang, X.-S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Phys. A Stat. Mech. Its Appl. 374(1), 483–490 (2007)

    Article  Google Scholar 

  14. Ahn, Y.-Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466(7307), 761–764 (2010)

    Article  Google Scholar 

  15. Evans, T.S., Lambiotte, R.: Line graphs, link partitions, and overlapping communities. Phys. Rev. E 80(1), 016105 (2009)

    Article  Google Scholar 

  16. Bothorel, C., Cruz, J.D., Magnani, M., Micenkova, B.: Clustering attributed graphs: models, measures and methods. Netw. Sci. 3(3), 408–444 (2015)

    Article  Google Scholar 

  17. Berlingerio, M., Coscia, M., Giannotti, F.: Finding and characterizing communities in multidimensional networks. In: International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 490–494 (2011)

    Google Scholar 

  18. Cai, D., Shao, Z., He, X., Yan, X., Han, J.: Community mining from multi-relational networks. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 445–452. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_44

    Chapter  Google Scholar 

  19. Rodriguez, M.A., Shinavier, J.: Exposing multi-relational networks to single-relational network analysis algorithms. J. Inf. 4(1), 29–41 (2010)

    Google Scholar 

  20. Tang, L., Wang, X., Liu, H.: Community detection via heterogeneous interaction analysis. Data Min. Knowl. Discov. 25(1), 1–33 (2012)

    Article  MathSciNet  Google Scholar 

  21. Mucha, P.J., Richardson, T., Macon, K., Porter, M.A., Onnela, J.-P.: Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–8 (2010)

    Article  MathSciNet  Google Scholar 

  22. Berlingerio, M., Pinelli, F., Calabrese, F.: ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS. Data Min. Knowl. Discov. 27(3), 294320 (2013)

    Article  MathSciNet  Google Scholar 

  23. De Domenico, M., Lancichinetti, A., Arenas, A., Rosvall, M.: Identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems. Phys. Rev. X 5(1), 11027 (2015)

    Google Scholar 

  24. Afsarmanesh, N., Magnani, M.: Finding overlapping communities in multiplex networks. https://arxiv.org/abs/1602.03746

  25. Magnani, M., Rossi, L.: The ML-model for multi-edge type social networks. In: International Conference on Social Network Analysis and Mining (ASONAM), pp. 5–12. IEEE Computer Society (2011)

    Google Scholar 

  26. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)

    Article  Google Scholar 

  27. Rossi, L., Magnani, M.: Towards effective visual analytics on multiplex and multilayer networks. Chaos Solitons Fractals 72, 68–76 (2015)

    Article  MathSciNet  Google Scholar 

  28. Leskovec, J., Lang, K.J., Mahoney, M.W., Dasgupta, A.: Statistical properties of community structure in large social and information networks. In: Proceeding of the 17th International Conference on World Wide Web - WWW 2008, p. 695 (2008)

    Google Scholar 

Download references

Acknowledgments

This work was partially funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 732027.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Matteo Magnani .

Editor information

Editors and Affiliations

A Algorithm

A Algorithm

The algorithm is divided into three parts, as in the original method: finding cliques, building the adjacency graph, and extracting communities.

1.1 A.1 Finding Cliques

Our algorithms starts by locating all maximal k-m-AND-cliques. Algorithm 1 is an extension of Bron–Kerbosch’s algorithm. It is a recursive algorithm where the recursive step takes a clique A as input and returns all maximal k-m-AND-cliques containing A that can be constructed using vertices in B, with \(k \ge \overline{k}\) and \(m \ge \overline{m}\). In this way, given a multiplex network \(M = (V,E,\mathcal {L})\), a call to \(\text {find-cliques}(\{\}, V, \{\}, \overline{k}, \overline{m})\) with \(\overline{k}>1\) and \(\overline{m}>0\) returns all maximal cliques in M with \(k \ge \overline{k}\) and \(m \ge \overline{m}\).

The algorithm works by updating two sets: one containing vertices that can be used to extend the currently processed clique, and one to keep track of already examined cliques. More precisely, the parameter B is a set of vertices such that, for every vertex \(n \in B\), \(A \cup \{n\}\) is a previously unseen clique on at least \(\overline{m}\) edge types. Whenever a vertex from B is used to extend the clique A, then B is updated by removing those vertices that are no longer connected to all vertices in the new clique \(A'\). C is the same as B, but containing those vertices that have already been examined by the algorithm during some previous iteration, so that no duplicates are produced. Given a set of vertices A we notate the set of edge types where the vertices in A form a clique as L(A), and the number of vertices in A as S(A). Therefore, if \(|L(A)|=0\) then A is not a clique on any edge type. We also define \(\max (\emptyset ) = 0\).

As an example, assume we call \(\text {find-cliques}(\{\}, \{0, 1, \dots , 9\}, \{\}, 3, 1)\) on the network in Fig. 3. The algorithm would then start exploring one of the vertices in B, let us say 5. The new call will thus include in B only those vertices that can still form a clique on at least one edge type when joined with 5: \(\text {find-cliques}(\{5\}, \{2, 3, 4, 6, 8\}, \{\}, 3, 1)\). Let us now assume that at the next iteration 3 is added to the current clique: \(\text {find-cliques}(\{3, 5\}, \{2, 6\}, \{\}, 3, 1)\), and then 6: \(\text {find-cliques}(\{3, 5, 6\}, \{2\}, \{\}, 3, 1)\). At this point S(A) is 3, satisfying the minimum clique size, and \(|L(\{3, 5, 6\})| = 2 > \max (\{|L(A \cup \{b\})| : b \in B\}) = \max (\{|L(\{2, 3, 5, 6\})|\}) = 1\). In fact, \(\{3, 5, 6\}\) is a clique on two edge types, while \(\{2, 3, 5, 6\}\) is a clique on only one edge type. Therefore, the current clique is returned as a maximal one (c4 in Fig. 3). At the next iteration, \(\text {find-cliques}(\{2, 3, 5, 6\}, \{\}, \{\}, 3, 1)\) is called and clique c3 is returned. At some later point, the algorithm would call \(\text {find-cliques}(\{2, 3, 5\}, \{\}, \{6\}, 3, 1)\), not returning any clique because \(C = \{6\}\) indicates that this path has already been explored, and \(\text {find-cliques}(\{4, 5\}, \{2\}, \{3\}, 3, 1)\), ultimately leading to the discovery of clique c1, and so on.

figure a

1.2 A.2 Clique-Adjacency Graph

In a simple graph, each clique is included in exactly one community, therefore, communities can be identified from a clique-clique overlap matrix (see [3] for the details). However, this statement is not necessarily true for k-m-AND-cliques and the corresponding communities. Because of the more complicated relations between cliques, instead of the overlap matrix used in the original method we generate an adjacency graph as the one represented in Fig. 3, where each vertex of the graph corresponds to a maximal clique and an edge between two vertices indicates that the corresponding cliques share at least k vertices and at least m edge types on all of their pairs of vertices. In the graph we have indicated for each vertex the edge types where the corresponding clique is defined. In the following we refer to this graph as clique-adjacency graph.

1.3 A.3 From the Clique-Adjacency Graph to Communities

As previously mentioned, each clique can be included in different communities with different combinations of its adjacent cliques. Here our objective is using the adjacency graph and the information regarding the edge labels simultaneously to find communities in the Multiplex network. Two cliques can be included in at least one community if: (1) there exists a path between the corresponding vertices in the clique-adjacency graph, and (2) for all vertices in the path the corresponding cliques share at least m edge types on all of their pairs. We call the latter rule the cliques’ constraint, and we call a tree maximal if no other adjacent clique can be added without reducing the maximal m for which the cliques’ constraint is satisfied. Therefore, each community in the original network corresponds to a maximal tree in the clique-adjacency graph where the cliques’ constraint holds for all vertices in the tree. So the problem is equivalent to recognizing all such maximal trees in the graph.

Algorithm 3 takes a community A as input and returns all maximal communities containing A. In this algorithm, given a set of cliques (that is, a community) A we notate the set of edge types we are currently considering to find a maximal community as \(\varLambda (A)\). Notice that \(\varLambda (A) \subseteq L(A)\): we are going to run this algorithm multiple times starting from the same clique with different \(\varLambda (A)\)’s, that is, looking for communities in different sets of edge types. We also define N(c) as the neighbors of c in the adjacency graph, as before.

Algorithm 3 consists of two phases. First, it finds a maximal community on \(\varLambda (A)\) (lines 1–15). Then, it recursively does the same for all subsets of \(\varLambda (A)\) for which larger communities can be found (lines 17–20). B contains the set of cliques that may be used to extend the community A.

figure b

To understand the algorithm we invite the reader to start inspecting lines 1–3, 7–8, and 13–15: this is just a simple depth-first visit of the adjacency matrix, finding the largest connected component containing A only traversing cliques that exist on all the edge types in \(\varLambda (A)\) (line 3). This visit, that finds the maximal community containing A on \(\varLambda (A)\), is extended in two ways. First, lines 4–6 make sure that we do not produce communities that have already been found starting from another clique: if we encounter an already processed clique containing all the edge types in L(A), then this community must have been found while processing it. Second, lines 9–12 cover the case when during the visit we encounter a neighboring clique that only contains a subset \(L'\) of \(\varLambda (A)\). This means that there is a community containing A that is maximal on \(L'\). Consequently, for each \(L'\) we encounter during the process we will later call Algorithm 3 again to extract a maximal community containing A on those edge types (line 17–19). Line 16 makes sure that we do not process the same set \(L'\) more then once (notice that the elements of D are sets of edge types). The fact that the different \(L'\)’s correspond to cliques that we have encountered while visiting the adjacency matrix guarantees that we are not running the algorithm for all the subsets of L(A) but only for those for which a community exists.

Algorithm 2 uses Algorithm 3 to iterate over all cliques and finds all maximal communities containing the clique under examination for any \(m \ge \overline{m}\). Therefore, at the end it outputs all maximal communities.

figure c

1.4 A.4 Time Complexity

Finding maximal cliques is an NP-hard problem, that is, one for which even small datasets can take too long to be processed. In practice, as for its simple-graph version, the execution time of the multiplex clique percolation algorithm depends on the input data and in particular on the number and size of cliques.

Table 3 shows the algorithm’s execution time for real multiplex social networks of increasing size from an online repository (http://multilayer.it.uu.se), on a 2,4 GHz Intel Core i7 desktop computer with 16 GB RAM. We have repeated each execution 5 times, observing similar results (the mode is indicated in the table) and we have stopped the computation after 10 min. The drastic drop in execution time for the dkpol dataset (including follow, mention and retweet edge types from Twitter) when stricter constraints are used, and the execution times on the ff-tw-yt data, representing common users from three online social networks, highlight how the computation time strongly depends on the data structure and is not linearly dependent on the network size. At the same time, these experiments show the effect of the constraints on execution time.

Table 3. Execution time (s)

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Afsarmanesh Tehrani, N., Magnani, M. (2018). Partial and Overlapping Community Detection in Multiplex Social Networks. In: Staab, S., Koltsova, O., Ignatov, D. (eds) Social Informatics. SocInfo 2018. Lecture Notes in Computer Science(), vol 11186. Springer, Cham. https://doi.org/10.1007/978-3-030-01159-8_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-01159-8_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01158-1

  • Online ISBN: 978-3-030-01159-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics