Abstract
Hypergraphs are a powerful abstraction for modeling high-order relations, which are ubiquitous in many fields. A hypergraph consists of nodes and hyperedges (i.e., subsets of nodes); and there have been a number of attempts to extend the notion of \({\varvec{k}}\)-cores, which proved useful with numerous applications for pairwise graphs, to hypergraphs. However, the previous extensions are based on an unrealistic assumption that hyperedges are fragile, i.e., a high-order relation becomes obsolete as soon as a single member leaves it.In this work, we propose a new substructure model, called \({\varvec{(k,t)}}\)-hypercore, based on the assumption that high-order relations remain as long as at least t fraction of the members remains. Specifically, it is defined as the maximal subhypergraph where (1) every node is contained in at least \({\varvec{k}}\) hyperedges in it and (2) at least \({\varvec{t}}\) fraction of the nodes remain in every hyperedge. We first prove that, given \({\varvec{t}}\) (or \({\varvec{k}}\)), finding the \({\varvec{(k,t)}}\)-hypercore for every possible \({\varvec{k}}\) (or \({\varvec{t}}\)) can be computed in time linear w.r.t the sum of the sizes of hyperedges. Then, we demonstrate that real-world hypergraphs from the same domain share similar \({\varvec{(k,t)}}\)-hypercore structures, which capture different perspectives depending on \({\varvec{t}}\). Lastly, we show the successful applications of our model in identifying influential nodes, dense substructures, and vulnerability in hypergraphs.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
A multiset is a set al.lowing duplicate elements.
In this work, the maximal subgraph (subhypergraph) satisfying some conditions means that every other graph (hypergraph) satisfying such conditions is a subgraph (subhypergraph) of the maximal one.
Similar to clique expansion, we can also have weighted star expansion, which is, however, not used in this work.
We assume that the input hypergraph is in the memory and thus do not count the complexity of loading the hypergraph, which is \(O(\sum _{e \in E} |e|)\).
Recall that \(\ell\)-hypercoreness with \(\ell = 2\) is include in t-hypercoreness with \(t = 0\). For each dataset, we apply min-max normalization to all the possible \(\ell\) values with \(\ell \ge 3\) so that t-hypercoreness and \(\ell\)-hypercoreness can fit in the same x-axis with the range [0, 1].
The average performance over five independent trials is reported.
We count \(\ell\)-hypercoreness with each \(\ell\) value as a separate method (\(\ell = 2\) is not counted since it is already included in the concept of t-hypercoreness with \(t = 0\)).
Simplicial complexes can be seen as a special class of hypergraphs.
The case \(\ell = 2\) is included in the proposed concept of t-hypercoreness with \(t = 0\).
References
Adamic LA, Lukose RM, Puniyani AR et al. (2001) Search in power-law networks. Phys Rev E 64(4):046–135
Albert R, Barabási AL (2002) Statistical mechanics of complex networks. Rev Mod Phys 74(1):47
Alvarez-Hamelin JI, Dall’Asta L, Barrat A, et al. (2006) Large scale networks fingerprinting and visualization using the k-core decomposition. In: NeurIPS
Alvarez-Hamelin JI, Dall’Asta L, Barrat A et al. (2008) K-core decomposition of internet graphs: hierarchies, self-similarity and measurement biases. Netw Heterog Media 3(2):371
Antelmi A, Cordasco G, Spagnuolo C et al. (2021) Social influence maximization in hypergraphs. Entropy 23(7):796
Arafat NA, Khan A, Rai AK, et al. (2023) Neighborhood-based hypergraph core decomposition. PVLDB 16
Arya D, Gupta DK, Rudinac S, et al. (2020) Hypersage: Generalizing inductive representation learning on hypergraphs. arXiv preprint arXiv:2010.04558
Bai S, Zhang F, Torr PH (2021) Hypergraph convolution and hypergraph attention. Pattern Recogn 110(107):637
Batagelj V, Zaversnik M (2003) An \(o(m)\) algorithm for cores decomposition of networks. In: arXiv
Benson AR (2019) Three hypergraph eigenvector centralities. SIAM J Math Data Sci 1(2):293–312
Benson AR, Abebe R, Schaub MT et al. (2018) Simplicial closure and higher-order link prediction. PNAS 115(48):E11221–E11230
Benson AR, Kumar R, Tomkins A (2018b) Sequences of sets. In: KDD
Blanco R, Lioma C (2012) Graph-based term weighting for information retrieval. Inf Retr 15(1):54–92
Bodó Á, Katona GY, Simon PL (2016) Sis epidemic propagation on hypergraphs. Bull Math Biol 78(4):713–735
Bonacich P, Lloyd P (2001) Eigenvector-like measures of centrality for asymmetric relations. Social Netw 23(3):191–201
Bonchi F, Khan A, Severini L (2019) Distance-generalized core decomposition. In: SIGMOD
Bu F, Lee G, Shin K (2023) Code, datasets, and supplementary materials. https://github.com/bokveizen/non-fragile-hypercore
Chein M, Mugnier ML (2008) Graph-based knowledge representation: computational foundations of conceptual graphs. Springer
Chen Z, Yuan L, Han L, et al. (2021) Higher-order truss decomposition in graphs. In: TKDE
Chien E, Pan C, Peng J, et al. (2021) You are allset: a multiset function framework for hypergraph neural networks. arXiv preprint arXiv:2106.13264
Corominas-Murtra B, Fuchs B, Thurner S (2014) Detection of the elite structure in a virtual multiplex social system by means of a generalised k-core. PLoS ONE 9(12):e11,2606
Cui H, Lu Z, Li P, et al. (2022) On positional and structural node features for graph neural networks on non-attributed graphs. In: CIKM
Dai Q, Li RH, Qin L, et al. (2021) Scaling up distance-generalized core decomposition. In: CIKM
Debnath S, Ganguly N, Mitra P (2008) Feature weighting in content based recommendation system using social network analysis. In: WWW
Do MT, Yoon Se, Hooi B, et al. (2020) Structural patterns and generative models of real-world hypergraphs. In: KDD
Feng Y, You H, Zhang Z, et al. (2019) Hypergraph neural networks. In: AAAI
Gabert K, Pinar A, Çatalyürek ÜV (2021a) Shared-memory scalable k-core maintenance on dynamic graphs and hypergraphs. In: IPDPSW
Gabert K, Pinar A, Çatalyürek ÜV (2021b) A unifying framework to identify dense subgraphs on streams: Graph nuclei to hypergraph cores. In: WSDM
Gao Y, Feng Y, Ji S, et al. (2022) Hgnn\(^{+}\): General hypergraph neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence
Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp 855–864
Han Z, Zheng X, Chen C, et al. (2023) Intra and inter domain hypergraph convolutional network for cross-domain recommendation. In: WWW
He T, Ong YS, Bai L (2021) Learning conjoint attentions for graph neural nets. In: NeurIPS
Hua QS, Zhang X, Jin H et al. (2023) Revisiting core maintenance for dynamic hypergraphs. IEEE Trans Parallel Distrib Syst 34:981–994
Huang J, Yang J (2021) Unignn: a unified framework for graph and hypergraph neural networks. arXiv preprint arXiv:2105.00956
Jiang J, Wei Y, Feng Y, et al. (2019) Dynamic hypergraph neural networks. In: IJCAI
Kim H, Ko J, Bu F, et al. (2023) Characterization of simplicial complexes by counting simplets beyond four nodes. In: WWW
Kim J, Oh S, Cho S, et al. (2022) Equivariant hypergraph neural networks. In: ECCV
Kitsak M, Gallos LK, Havlin S et al. (2010) Identification of influential spreaders in complex networks. Nat Phys 6(11):888–893
Ko J, Kook Y, Shin K (2022) Growth patterns and models of real-world hypergraphs. KAIS 64(11):2883–2920
Konstantinova EV, Skorobogatov VA (2001) Application of hypergraph theory in chemistry. Discret Math 235(1–3):365–383
Lee D, Shin K (2023) I’m me, we’re us, and i’m us: Tri-directional contrastive learning on hypergraphs. In: AAAI
Lee G, Shin K (2021) Thyme+: Temporal hypergraph motifs and fast algorithms for exact counting. In: ICDM
Lee G, Ko J, Shin K (2020) Hypergraph motifs: concepts, algorithms, and discoveries. PVLDB 13(11):2256–2269
Lee G, Choe M, Shin K (2021) How do hyperedges overlap in real-world hypergraphs? - patterns, measures, and generators. In: WWW
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: Densification and shrinking diameters. TKDD 1(1):2–es
Li P, Wang H, Li K, et al. (2023) Influence without authority: Maximizing information coverage in hypergraphs. In: SDM
Liao X, Xu Y, Ling H (2021) Hypergraph neural networks for hypergraph matching. In: ICCV
Limnios S, Dasoulas G, Thilikos DM, et al. (2021) Hcore-init: Neural network initialization based on graph degeneracy. In: ICPR
Liu B, Yuan L, Lin X et al. (2020) Efficient (\(\alpha\), \(\beta\))-core computation in bipartite graphs. VLDB J 29(5):1075–1099
Lotito QF, Musciotto F, Montresor A et al. (2022) Higher-order motif analysis in hypergraphs. Commun Phys 5(1):79
Lu Z, Zhu Y, Zhong M, et al. (2022) On time-optimal (k, p)-core community search in dynamic graphs. In: ICDE
Luo F, Li B, Wan XF, et al. (2009) Core and periphery structures in protein interaction networks. In: BMC bioinformatics
Luo Q, Yu D, Cai Z, et al. (2021) Hypercore maintenance in dynamic hypergraphs. In: ICDE
Luo Q, Yu D, Cai Z et al. (2022) Toward maintenance of hypercores in large-scale dynamic hypergraphs. VLDB J 32:1–18
Malliaros FD, Giatsidis C, Papadopoulos AN et al. (2020) The core decomposition of networks: theory, algorithms and applications. VLDB J 29:61–92
Mastrandrea R, Fournet J, Barrat A (2015) Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PloS one 10(9):e0136497
McGlohon M, Akoglu L, Faloutsos C (2008) Weighted graphs and disconnected components: patterns and a generator. In: KDD
Mihalcea R, Radev D (2011) Graph-based Natural Language Processing and Information Retrieval. Cambridge University Press
Peng C, Kolda TG, Pinar A (2014) Accelerating community detection by using k-core subgraphs. In: arXiv
Peng Y, Zhang Y, Zhang W, et al. (2018) Efficient probabilistic k-core computation on uncertain graphs. In: ICDE
Preti G, De Francisci Morales G, Bonchi F (2021) Strud: Truss decomposition of simplicial complexes. In: WWW
Qu C, Tao M, Yuan R (2018) A hypergraph-based blockchain model and application in internet of things-enabled smart homes. Sensors 18(9):2784
Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
Rossi MEG, Malliaros FD, Vazirgiannis M (2015) Spread it good, spread it fast: Identification of influential nodes in social networks. In: WWW
Sarıyüce AE, Pinar A (2018) Peeling bipartite networks for dense subgraph discovery. In: WSDM
Seidman SB (1983) Network structure and minimum degree. Social Netw 5(3):269–287
Shin K, Eliassi-Rad T, Faloutsos C (2018) Patterns and anomalies in k-cores of real-world graphs with applications. KAIS 54(3):677–710
Shin K, Hooi B, Faloutsos C (2018) Fast, accurate, and flexible algorithms for dense subtensor mining. TKDD 12(3):1–30
Silva NB, Tsang R, Cavalcanti GD, et al. (2010) A graph-based friend recommendation system using genetic algorithm. In: CEC
Sinha A, Shen Z, Song Y, et al. (2015) An overview of microsoft academic service (mas) and applications. In: WWW
Sun B, Chan THH, Sozio M (2020) Fully dynamic approximate k-core decomposition in hypergraphs. TKDD 14(4):1–21
Torres L, Blevins AS, Bassett DS et al. (2021) The why, how, and when of representations for complex systems. SIAM Rev 63:435–485
Tudisco F, Higham DJ (2021) Node and edge nonlinear eigenvector centrality for hypergraphs. Commun Phys 4(1):1–10
Victor F, Akcora CG, Gel YR, et al. (2021) Alphacore: data depth based core decomposition. In: KDD
Vogiatzis D (2013) Influence study on hyper-graphs. In: AAAI Symposia
Wang K, Cao X, Lin X, et al. (2018) Efficient computing of radius-bounded k-cores. In: ICDE
Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393(6684):440–442
Wood CI, Hicks IV (2015) The minimal k-core problem for modeling k-assemblies. J Math Neurosci 5(1):1–19
Wu T, Ling Q (2023) Self-supervised heterogeneous hypergraph network for knowledge tracing. Inf Sci 624:200–216
Xia L, Huang C, Xu Y, et al. (2022) Hypergraph contrastive collaborative filtering. In: SIGIR
Xie M, Zhan X, Liu C, et al. (2023) An efficient adaptive degree-based heuristic algorithm for influence maximization in hypergraphs. Inf Process Manage 60(2):103161
Yang C, Wang R, Yao S, et al. (2022) Semi-supervised hypergraph node classification on hypergraph line expansion. In: CIKM
Yin H, Benson AR, Leskovec J, et al. (2017) Local higher-order graph clustering. In: KDD
Zhang C, Zhang F, Zhang W, et al. (2020) Exploring finer granularity within the cores: efficient (k, p)-core computation. In: ICDE
Zhang F, Zhang Y, Qin L, et al. (2017a) Finding critical users for social network engagement: the collapsed k-core problem. In: AAAI
Zhang F, Zhang Y, Qin L, et al. (2017b) When engagement meets similarity: efficient (k, r)-core computation on social networks. In: PVLDB
Zhang Y, Parthasarathy S (2012) Extracting analyzing and visualizing triangle k-core motifs within networks. In: ICDE
Zhu J, Zhu J, Ghosh S et al. (2018) Social influence maximization in hypergraph in social networks. TNSE 6(4):801–811
Zhu W, Chen C, Wang X, et al. (2018b) K-core minimization: an edge manipulation approach. In: CIKM
Zhu W, Zhang M, Chen C, et al. (2019) Pivotal relationship identification: the k-truss minimization problem. In: IJCAI
Zien JY, Schlag MD, Chan PK (1999) Multilevel spectral hypergraph partitioning with arbitrary vertex sizes. IEEE Trans Comput Aided Des Integr Circuits Syst 18(9):1389–1399
Funding
This work was supported by National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1C1C1008296) and Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00871, Development of AI Autonomy and Knowledge Enhancement for AI Agent Collaboration) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Responsible editor: Charalampos Tsourakakis.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Appendices
A. Proofs
1.1 A.1 Proof of proposition 1
Proof
Since H is finite, the number of subhypergraphs of H is also finite. Therefore, there exists one subhypergraph with maximal total size (which is possibly an empty hypergraph) where each node has degree at least k and at least t proportion of the constituent nodes remain in each hyperedge, completing the proof of existence. To show the uniqueness, suppose the opposite, and let \(C^1 = (V^1, E^1)\) and \(C^2 = (V^2, E^2)\) be two distinct (k, t)-hypercores of H. Then we consider the hypergraph \(C' = (V', E')\) with \(E' = \{e^1_i \cup e^2_i: i \in I_{E^1} \cup I_{E^2}\}\). Clearly, \(C'\) is a subhypergraph of H with a larger total size that satisfies the node-degree and hyperedge-fraction conditions, which contradicts the maximality and completes the proof. \(\square\)
1.2 A.2 Proof of proposition 2
Proof
Suppose that \(C_{k, t_2}(H)\) is not a subhypergraph of \(C_{k, t_1}(H)\). Then we take the union \(C_{k, t_2}(H) \cup C_{k, t_1}(H)\) and we obtain a hypergraph that is strictly larger than \(C_{k, t_1}(H)\) and satisfies the conditions of \((k, t_1)\)-hypercore, which contradicts with the maximality, completing the proof. The second statement can be proved similarly. \(\square\)
1.3 A.3 Proof of lemma 1
Proof
This equivalence is immediate by two facts. First, for each node \(v \in V\), the degree of v in H is equal to the degree of v in \(G_{bp}(H)\). Second, for each hyperedge \(e \in E\), the number of nodes in e is equal to the degree of e in \(G_{bp}(H)\). With the above two facts, this equivalence immediately follows. \(\square\)
1.4 A.4 Proof of lemma 2
Proof
By Definition 5, when \(t = 0\), the definition of \(C_{k; t = 0}(H)\) is the maximal subhypergraph of H where (1) every node in \(C_{k, t=0}(H)\) has degree at least k and (2) at least two nodes remain in every hyperedge of \(C_{k, t=0}(H)\). Such a definition exactly coincides with \({\tilde{C}}_{k; \ell = 2}(H)\), completing the proof. \(\square\)
1.5 A.5 Proof of lemma 3
We can understand the differences between \((k; \ell )\)-hypercores and (k, t)-hypercores by two intuitions. When we obtain the \((k; \ell )\)-hypercore of a given H with \(\ell > 2\), all hyperedges of cardinality 2 are removed in the first place. Therefore, if we want to find an \(\ell\) such that \({\tilde{C}}_{k; \ell } = C_{k, t}\) where \(C_{k, t}\) contains any hyperedge of cardinality 2, the only possible \(\ell\) value is \(\ell = 2\). Since the threshold in the (k, t)-hypercore is proportional, it imposes different absolute cardinality thresholds for hyperedges of different sizes. On the contrary, the \((k; \ell )\)-hypercore imposes the same absolute cardinality threshold for each hyperedge. We shall show two counterexamples from the two intuitions above.
Proof
Consider \(H = (V, E)\) with \(E = \{\{1, 2\}, \{1, 3\} \{1, 2, 3, 4\}, \{1, 3, 4, 5, 6\}\}.\) The \((k=2, t=3/4)\)-hypercore of H consists of the hyperedges \(\{\{1,2\}, \{1,3\}, \{1, 2, 3\}\},\) where the hyperedge \(\{1,3,4,5,6\}\) is totally removed since only \(3/5 < t = 3/4\) of the constituent nodes remain by the node-degree threshold \(k = 2\). For the \((k; \ell )\)-hypercore, the \((k=2; \ell =2)\)-hypercore of H consists of the hyperedges \(\{\{1, 2\}, \{1, 3\}, \{1, 2, 3, 4\}, \{1, 3, 4\}\}\); the \((k=2; \ell =3)\)-hypercore of H consists of the hyperedges \(\{\{1, 3, 4\}, \{1, 3, 4\}\}\); when \(\ell \ge 4\), the \((k=2;\ell )\)-hypercore of H is empty, completing the proof.
We show another counterexample. Consider \(H = (V, E)\) with \(E = \{\{1,2,3,4\}, \{1,2,5,6\}, \{5,6,7,8\},\{3,4,9,10,11\},\{1,2,3,4,5,6,7,8\}\}.\) The \((k=3, t=1/2)\)-hypercore of H consists of the hyperedges \(\{\{1,2\}, \{1,2,5,6\}, \{5,6\}, \{1,2,5,6\}\}.\) For the \((k; \ell )\)-hypercore, when \(\ell = 2\), the \((k=3; \ell =2)\)-hypercore of H consists of the hyperedges \(\{\{1,2,3,4\},\{1,2,5,6\},\{5,6\},\{3,4\},\{1,2,3,4,5,6\}\};\) when \(\ell \ge 3\), the \((k=3; \ell )\)-hypercore of H is empty, completing the proof. \(\square\)
Remark 1
Our proposed (k, t)-hypercore allows arbitrarily fine-grained adjustment since the value of t is continuous in [0, 1], while \(\ell\) must be an integer. In real-world hypergraphs, many hyperedges are of cardinality 2. Therefore, the \((k; \ell )\)-hypercore with \(\ell > 2\) is significantly less meaningful than the (k, t)-hypercore since many hyperedges are not taken into consideration at all. See Tbl. 5 for the detailed number of hyperedges of different cardinality in each dataset we have used. See Figs. 8 and 9 for the performance of hypercoreness w.r.t \((k; \ell )\)-hypercore to indicate the influence of nodes. Note again that the \((k; \ell =2)\)-hypercore is included in our proposed concept as the \((k, t=0)\)-hypercore. We observe that in most datasets, the \((k; \ell )\)-hypercores become less meaningful and fail to indicate the influence of nodes when \(\ell\) becomes large, as expected.
1.6 A.6 Proof of theorem 1
Proof
Correctness. The size of a hyperedge changes only when some node in \({\mathcal {R}}\) is removed from it, and the degree of a node changes only when some incident hyperedge is removed. Therefore, when Algorithm 1 ends, each node has degree at least k, otherwise it must have been included in \({\mathcal {R}}\) and removed, and each hyperedge satisfies the hyperedge-fraction condition, otherwise it must have been removed. This implies that the output of Algorithm 1 satisfies both the node-degree and hyperedge-fraction conditions w.r.t H, k, and t. We now show the maximality. Suppose not, and let \((v, e_i)\) be the first node-hyperedge pair that appears during the process of Algorithm 1 with \(v \in e_i \in E(C_{k, t})\) but \(v \notin e_i' \in E'\), where \(C' = (V', E')\) is the returned hypergraph. This implies that v is removed from e, and thus v is included in \({\mathcal {R}}\) because its degree has been below k. However, by the definition of the (k, t)-hypercore and the assumption that \((v, e_i)\) is the first pair, before the deletion, the degree of v is at least k, which completes the proof by contradiction.
Time complexity. We assume the input hypergraph has been loaded in the memory and thus do not count the complexity of loading the hypergraph. Checking the initial degrees (Line 1) takes \(O(|V|)\). In the while loop, each node is added to the set of nodes to be removed at most once since each node is added exactly when its degree decreases from k to \(k - 1\). Therefore, this process takes \(O(|V|)\). By checking the incident edges of each node in \({\mathcal {R}}\), we find all \(e'_i\)s intersecting with \({\mathcal {R}}\), which takes \(O(|{\mathcal {R}}|) = O(|V|)\). Hash tables are used to implement the sets. Before a hyperedge \(e \in E\) is totally removed, at least \(\max \left( \lceil t|e| \rceil , 2\right)\) nodes remain in it (otherwise it has been removed earlier), and thus it can be visited at most \(|e| - \max \left( \lceil t|e| \rceil , 2\right) + 1\) times (because one node is removed at each time). This process takes \(O(\sum _{e \in E} (|e| - \max \left( \lceil t|e| \rceil , 2\right) + 1)) = O(|E| + (1 - t) \sum _{e \in E} |e|)\). Therefore, the total time complexity is \(O(|V|) + O(|E| + (1 - t) \sum _{e \in E} |e|) = O(|E| + (1 - t) \sum _{e \in E} |e|)\). \(\square\)
1.7 A.7 Proof of theorem 2
Proof
Correctness. For each node v, the assignment of \(c_t(v)\) happens only once when \(v \in {\mathcal {R}}\), i.e., before its deletion. By Theorem 1, \(c_t(v) = k - 1\) implies that v is not in the (k, t)-hypercore but in the previous \((k', t)\)-hypercore where each node has degree at least \(k - 1\), i.e., v is in the \((k-1, t)\)-hypercore.
Time complexity. The values of k increases \(O(c_t^*)\) times, thus the process in Lines 4 and 5 is repeated for \(O(c_t^*)\) times and takes \(O(c_t^* |V|)\). The assignment of t-hypercoreness of each node (Line 7) takes \(O(|V|)\). As shown in the proof of Theorem 1, each hyperedge is visited at most \(|e| - \max (\lceil t|e| \rceil , 2) + 1\) times before being deleted and each node is added to the set of nodes to be removed only once. Therefore, the remaining process takes \(O(|V| + |E| + (1 - t) \sum _{e \in E} |e|)\). \(\square\)
1.8 A.8 Proof of theorem 3
Proof
Correctness. For each node v, the assignment of \(f_k(v)\) happens only once when \(v \in {\mathcal {R}}\), i.e., before its deletion. By Theorem 1, \(f_k(v) = t\) implies that v is in the (k, t)-hypercore with degree k and is in at least one hyperedge that is in the (k, t)-hypercore but not in any \((k, t')\)-hypercore with \(t' > t\). Therefore, v is not in any \((k, t')\)-hypercore with \(t' > t\).
Time complexity. Recording the hyperedge sizes (Line 1) takes \(O(|E|)\). By Theorem 1, computing \(C_{k, 0}\) (Line 2) takes \(O(\sum _{e \in E} |e|)\). As shown in the previous proofs, the while loop (Lines 4 to 12) takes \(O(|V| + |E| + \sum _{e \in E} |e|) = O(\sum _{e \in E} |e|)\). \(\square\)
Details of datasets
In this section, we provide more details of the datasets used in our experiments.
-
Coauth-DBLP/Geology. In these two coauthorship hypergraphs, each hyperedge represents a publication, and the constituent nodes of a hyperedge represent the authors of the corresponding publication.
-
NDC-classes/substances. In these two hypergraphs from the National Drug Code (NDC) Directory, each hyperedge represents a drug (with its unique NDC code), and the constituent nodes of a hyperedge represent the class labels (for NDC-classes) or the ingredients (for NDC-substances) of the drug.
-
Contact-high/primary. In these two contact hypergraphs, each hyperedge represents a group of interacting individuals (the constituent nodes) within a predetermined time period.
-
Email-Enron/Eu. In these two email hypergraphs, each hyperedge represents an email (possibly sent to multiple people individually at the same time), which contains the sender and all the receivers as its constituent nodes.
-
Tags-ubuntu/math/SO. In these three tags hypergraphs from https://stackoverflow.com/, each node represents a tag, and each hyperedge represents a question, where each constituent node represents a tag applied to the question.
-
Threads-ubuntu/math/SO. In these three threads hypergraphs also from https://stackoverflow.com/, each hyperedge represents a thread, where each constituent node represents a person that participates in it.
We have used the preprocessed version of the datasets where each hyperedge consists of at most 25 nodes. In Table 5, we report the number of hyperedges of different cardinality in each dataset. Notably, on https://www.cs.cornell.edu/~arb/data/, the full version of the datasets, in which the cardinality of the hyperedges is not limited, is also available.
Additional experimental results
In this section, we provide additional experimental results supplementing the main text. In Fig. 15, we report the results regarding the statistical difference between t-hypercoreness and other centrality measures, as well as among t-hypercoreness with different t, on the datasets not covered in the main text. In Fig. 16, we report the results regarding the information gain over degree, on the datasets not covered in the main text. In Table 6, for each dataset, we report the information gain over degree for the following quantities: \(\ell\)-hypercoreness with \(\ell \in \{3, 4, 5\}\),Footnote 9 neighbor-hypercoreness, and neighbor-degree-hypercoreness (Definitions. 9, 12, and 14). Notably, we do not claim that higher information gain is always better, since degree is still a reason measure by cohesiveness, and being too different from degrees can be negative as a cohesiveness measure. In Fig. 17, we report the results on influential-node identification, on the datasets not covered in the main text. In Table 7, we report the full results of the collapsed (k, t)-hypercore problem.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bu, F., Lee, G. & Shin, K. Hypercore decomposition for non-fragile hyperedges: concepts, algorithms, observations, and applications. Data Min Knowl Disc 37, 2389–2437 (2023). https://doi.org/10.1007/s10618-023-00956-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-023-00956-2