Abstract
Graph is a ubiquitous representation of data in various research fields, and graph embedding is a prevalent machine learning technique for capturing key features and generating fixed-sized attributes. However, most state-of-the-art graph embedding methods are computationally and spatially expensive. Recently, the Graph Encoder Embedding (GEE) is shown as the fastest graph embedding technique and is suitable for a variety of network data applications. As real-world data often involves large and sparse graphs, the huge sparsity usually results in redundant computations and storage. To address this issue, we propose an improved version of GEE, sparse GEE, which optimizes the calculation and storage of zero entries in sparse matrices to enhance the running time further. Our experiments demonstrate that the sparse version achieves significant speedup compared to the original GEE with Python implementation for large sparse graphs, and sparse GEE is capable of processing millions of edges within minutes on a standard laptop.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Yue, X., et al.: Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics 36(4), 1241–1251 (2020)
Kumar, S., Mallik, A., Khetarpal, A., Panda, B.S.: Influence maximization in social networks using graph embedding and graph neural network. Inf. Sci. 607, 1617–1636 (2022)
Pornprasit, C., et al.: Enhancing citation recommendation using citation network embedding. Scientometrics, 1–32 (2022)
Wu, S., Sun, F., Zhang, W., Xie, X., Cui, B.: Graph neural networks in recommender systems: a survey. ACM Comput. Surv. 55(5), 1–37 (2022)
Xu, M.: Understanding graph embedding methods and their applications. SIAM Rev. 63(4), 825–853 (2021)
Cai, H., Zheng, V.W., Chang, K.C.C.: A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans. Knowl. Data Eng. 30(9), 1616–1637 (2018)
Rohe, K., Chatterjee, S., Yu, B.: Spectral clustering and the high-dimensional stochastic blockmodel, pp. 1878–1915 (2011)
Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on KNOWLEDGE DIscovery and Data Mining, pp. 855–864 (2016)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Shen, C., Wang, Q., Priebe, C.E.: One-hot graph encoder embedding. IEEE Trans. Pattern Anal. Mach. Intell. 45(6), 7933–7938 (2023)
Shen, C., Park, Y., and Priebe, C. E.: Graph encoder ensemble for simultaneous vertex embedding and community detection. arXiv preprint arXiv:2301.11290 (2023)
Shen, C., Larson, J., Trinh, H., Qin, X., Park, Y., Priebe, C.E.: Discovering communication pattern shifts in large-scale networks using encoder embedding and vertex dynamics. arXiv preprint arXiv:2109.13098 (2023)
Shen, C., Priebe C. E., Larson J., and Trinh H.: Synergistic Graph Fusion via Encoder Embedding. arXiv preprint arXiv:2303.18051 (2023)
Strang, G.: Introduction to Linear Algebra, 5th edn. Wellesley - Cambridge Press (2016)
Duff, L.S., Erisman, A.M., Reid, J.K.: Direct Methods for Sparse Matrices (Numerical Mathematics and Scientific Computation), 2nd edn. Oxford University Press, Oxford (2017)
Kelly, T.: Programming workbench: compressed sparse row format for representing graphs. Login Usenix Mag. 45(4) (2020)
Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020)
Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Netw. 5(2), 109–137 (1983)
Gao, C., Ma, Z., Zhang, A.Y., Zhou, H.H.: Achieving optimal misclassification proportion in stochastic block models. J. Mach. Learn. Res. 18(1), 1980–2024 (2017)
Rossi, R., Ahmed, N.: The network data repository with interactive graph analytics and visualization. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
Acknowledgment
This work was supported in part by the National Science Foundation DMS-2113099, and by funding from Microsoft Research.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Qin, X., Shen, C. (2024). Efficient Graph Encoder Embedding for Large Sparse Graphs in Python. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1018. Springer, Cham. https://doi.org/10.1007/978-3-031-62269-4_36
Download citation
DOI: https://doi.org/10.1007/978-3-031-62269-4_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62268-7
Online ISBN: 978-3-031-62269-4
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)