Abstract
Many problems can benefit from being phrased as a graph processing or graph analytics problem: infectious disease modeling, insider threat detection, fraud prevention, social network analyis, and more. These problems all share a common property: the relationships between entitites in these systems are crucial to understanding the overall behavior of the systems themselves. However, relations are rarely if ever static. As our ability to collect information on those relations improve (e.g. on financial transactions in fraud prevention), the value added by large-scale, high-performance, dynamic/streaming (rather than static) graph analysis becomes significant.
This paper introduces HOOVER, a distributed software framework for large-scale, dynamic graph modeling and analyis. HOOVER sits on top of OpenSHMEM, a PGAS programming system, and enables users to plug in application-specific logic while handling all runtime coordination of computation and communication. HOOVER has demonstrated scaling out to 24,576 cores, and is flexible enough to support a wide range of graph-based applications, including infectious disease modeling and anomaly detection.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Carbone, P., Katsifodimos, A., Ewen, S., Markl, V., Haridi, S., Tzoumas, K.: Apache flink: stream and batch processing in a single engine. Bull. IEEE Comput. Soc. Tech. Committee Data Eng. 36(4) (2015)
Chapman, B., et al.: Introducing OpenSHMEM: SHMEM for the PGAS community. In: Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, p. 2. ACM (2010)
Eberle, W., Holder, L.: Scalable anomaly detection in graphs. Intell. Data Anal. 19(1), 57–74 (2015)
Eberle, W., Holder, L.B.: Mining for structural anomalies in graph-based data. In: DMIN, pp. 376–389 (2007)
Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J., Stoica, I.: Graphx: graph processing in a distributed dataflow framework. In: OSDI, vol. 14, pp. 599–613 (2014)
Hoque, I., Gupta, I.: LFGraph: simple and fast distributed graph analytics. In: Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems, p. 9. ACM (2013)
Kalavri, V.: Gelly: Large-Scale Graph Processing with Apache Flink (2015). https://www.slideshare.net/vkalavri/gelly-in-apache-flink-bay-area-meetup
Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012)
Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pp. 135–146. ACM (2010)
Acknowledgments
The authors would like to thank Steve Poole (LANL) for his valuable feedback on the HOOVER project and this manuscript.
Work on HOOVER was funded in part by the United States Department of Defense, and was supported by resources at Los Alamos National Laboratory. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Los Alamos National Laboratory publication number LA-UR-18-27825.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Grossman, M., Pritchard, H., Curtis, T., Sarkar, V. (2019). HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM. In: Pophale, S., Imam, N., Aderholdt, F., Gorentla Venkata, M. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity. OpenSHMEM 2018. Lecture Notes in Computer Science(), vol 11283. Springer, Cham. https://doi.org/10.1007/978-3-030-04918-8_7
Download citation
DOI: https://doi.org/10.1007/978-3-030-04918-8_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04917-1
Online ISBN: 978-3-030-04918-8
eBook Packages: Computer ScienceComputer Science (R0)