Abstract
Graphs are widely used to model data in many application domains. Thanks to the wide spread use of GPS-enabled devices, many applications assign spatial attributes to graph vertexes (e.g., geographic knowledge bases, geo-tagged social media). Graph database systems such as Neo4j and Titan are commonly used to manage and query graph data. Even though an off-the-shelf graph database system allows users to define spatial location semantics on vertexes and edges, existing graph query processors are not natively optimized for spatial predicates. The paper proposes GeoExpand—a query operator that adds spatial data awareness to a graph database management system. GeoExpand allows efficient execution of graph queries that involve spatial predicates. The proposed operator makes use of spatial bitmap entries stored as properties in the graph. GeoExpand leverages such bitmap entries to possibly terminate the graph traversal process early and hence reduces the query latency. Since the spatial bitmap entries are represented using a light-weight data structure, they do not add a lot of storage or maintenance overhead. That makes GeoExpand a practical solution. Experiments based on implementation inside the core kernel of Neo4j prove that the GeoExpand operator exhibits up to two orders of magnitude better query response time than the classic Expand operator used in Neo4j.
Similar content being viewed by others
References
Armenatzoglou N, Papadopoulos S, Papadias D (2013) A general framework for geo-social query processing. Proc VLDB Endow 6(10):913–924. https://doi.org/10.14778/2536206.2536218. http://www.vldb.org/pvldb/vol6/p913-papadopoulos.pdf
Bakalov P, Hoel EG, Kim S (2017) A network model for the utility domain. In: Proceedings of the ACM SIGSPATIAL international conference on advances in geographic information systems. https://doi.org/10.1145/3139958.3139980. http://doi.acm.org/10.1145/3139958.3139980, pp 32:1–32:10
Bao J, Mokbel MF, Chow C (2012) Geofeed: a location aware news feed system. In: Proceedings of the IEEE international conference on data engineering, ICDE. https://doi.org/10.1109/ICDE.2012.97, pp 54–65
Beckmann N, Kriegel H, Schneider R, Seeger B (1990) The r*-tree: an efficient and robust access method for points and rectangles. In: Proceedings of the ACM international conference on management of data, SIGMOD. https://doi.org/10.1145/93597.98741. http://doi.acm.org/10.1145/93597.98741, pp 322–331
Doytsher Y, Galon B, Kanza Y (2010) Querying geo-social data by bridging spatial networks and social networks. In: Proceedings of international workshop on location based social networks, LBSN. https://doi.org/10.1145/1867699.1867707. http://doi.acm.org/10.1145/1867699.1867707, pp 39–46
Doytsher Y, Galon B, Kanza Y (2012) Querying socio-spatial networks on the world-wide web. In: International World Wide Web conference. https://doi.org/10.1145/2187980.2188041. http://doi.acm.org/10.1145/2187980.2188041, pp 329–332
Finkel RA, Bentley JL (1974) Quad trees: a data structure for retrieval on composite keys. Acta Inform 4:1–9. https://doi.org/10.1007/BF00288933
GeoSPARQL (2017) http://www.opengeospatial.org/standards/geosparql
Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of the ACM international conference on management of data, SIGMOD. https://doi.org/10.1145/602259.602266. http://doi.acm.org/10.1145/602259.602266, pp 47–57
He H, Singh AK (2008) Graphs-at-a-time: query language and access methods for graph databases. In: Proceedings of the ACM international conference on management of data, SIGMOD. https://doi.org/10.1145/1376616.1376660. http://doi.acm.org/10.1145/1376616.1376660, pp 405–418
Liagouris J, Mamoulis N, Bouros P, Terrovitis M (2014) An effective encoding scheme for spatial RDF data. Proc VLDB Endow 7(12):1271–1282. https://doi.org/10.14778/2732977.2733000. http://www.vldb.org/pvldb/vol7/p1271-liagouris.pdf
Lomet DB (1991) Grow and post index trees: roles, techniques and future potential. In: Advances in spatial databases, second international symposium, SSD’91, Zürich, Switzerland, August 28–30, 1991, Proceedings. https://doi.org/10.1007/3-540-54414-3_38, pp 183–206
Mouratidis K, Li J, Tang Y, Mamoulis N (2015) Joint search by social and spatial proximity. IEEE Trans Knowl Data Eng TKDE 27(3):781–793. https://doi.org/10.1109/TKDE.2014.2339838
Samet H (2006) Foundations of multidimensional and metric data structures. Morgan Kaufmann, San Francisco
Sarwat M (2015) Interactive and scalable exploration of big spatial data—a data management perspective. In: Proceedings of the international conference on mobile data management, MDM. https://doi.org/10.1109/MDM.2015.67, pp 263–270
Sarwat M, Elnikety S, He Y, Mokbel MF (2013) Horton+: a distributed system for processing declarative reachability queries over partitioned graphs. Proc VLDB Endow 6 (14):1918–1929. https://doi.org/10.14778/2556549.2556573. http://www.vldb.org/pvldb/vol6/p1918-sarwat.pdf
Sarwat M, Levandoski JJ, Eldawy A, Mokbel MF (2014) Lars*: an efficient and scalable location-aware recommender system. IEEE Trans Knowl Data Eng TKDE 26(6):1384–1399. https://doi.org/10.1109/TKDE.2013.29
Sarwat M, Bao J, Chow C, Levandoski JJ, Magdy A, Mokbel MF (2015) Context awareness in mobile systems. In: Data management in pervasive systems. https://doi.org/10.1007/978-3-319-20062-0, pp 257–287
Shekhar S, Chawla S (2003) Spatial databases: a tour. Prentice Hall, Upper Saddle River
Shi J, Mamoulis N, Wu D, Cheung DW (2014) Density-based place clustering in geo-social networks. In: Proceedings of the ACM international conference on management of data, SIGMOD. https://doi.org/10.1145/2588555.2610497. http://doi.acm.org/10.1145/2588555.2610497, pp 99–110
Sun Y, Pasumarthy N, Sarwat M (2017) On evaluating social proximity-aware spatial range queries. In: Proceedings of the international conference on mobile data management, MDM. https://doi.org/10.1109/MDM.2017.20, pp 72–81
Wikipedia contributors (2018) Cypher query language—Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Cypher_Query_Language&oldid=849927532
Wood PT (2012) Query languages for graph databases. SIGMOD Record 41(1):50–60. https://doi.org/10.1145/2206869.2206879
Zhao P, Han J (2010) On graph query optimization in large networks. Proc VLDB Endow 3(1):340–351. https://doi.org/10.14778/1920841.1920887. http://www.comp.nus.edu.sg/%7Evldb2010/proceedings/files/papers/R30.pdf
Zhao P, Aggarwal CC, Wang M (2011) gsketch: on query estimation in graph streams. Proc VLDB Endow 5(3):193–204. https://doi.org/10.14778/2078331.2078335. http://www.vldb.org/pvldb/vol5/p193_peixiangzhao_vldb2012.pdf
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work is supported in part by the National Science Foundation (NSF) under Grant 1654861, the Salt River Project Agricultural Improvement and Power District (SRP), and the DOD-ARMY Training and Doctrine Command (TRADOC).
Rights and permissions
About this article
Cite this article
Sun, Y., Sarwat, M. A spatially-pruned vertex expansion operator in the Neo4j graph database system. Geoinformatica 23, 397–423 (2019). https://doi.org/10.1007/s10707-019-00361-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10707-019-00361-2