Definition
While being eminently useful in a wide variety of application domains, the high expressiveness of graph queries makes them hard to optimize and, hence, challenging to process efficiently. We discuss a number of state-of-the-art approaches which aim to overcome these challenges, focusing specifically on planning, optimization, and execution of two commonly used types of declarative graph queries: subgraph queries and regular path queries.
Overview
In this chapter, we give a broad overview of the general techniques for processing two classes of graph queries: subgraph queries, which find instances of a query subgraph in a larger input graph, and regular path queries (RPQs), which find pairs of nodes in the graph connected by a path matching a given regular expression. Certainly, software that manage and process graph-structured data, such as RDF triple stores (Neumann and Weikum (2010), Virtuoso), graph database management systems (GDBMSes) (Kankanamge et al. (2017), neo4j),...
Notes
- 1.
The only exception we are aware of is the LogicBlox system (Aref et al. 2015)
- 2.
This can be achieved, for example, by first checking the sizes of the adjacency lists that need to be intersected and starting intersecting the elements in the smallest list in others.
References
Aberger CR, Tu S, Olukotun K, Ré C (2016) EmptyHeaded: a relational engine for graph processing. In: SIGMOD
Agrawal R (1988) Alpha: an extension of relational algebra to express a class of recursive queries. IEEE TSE 14(7):879–885
Aref M, ten Cate B, Green TJ, Kimelfeld B, Olteanu D, Pasalic E, Veldhuizen TL, Washburn G (2015) Design and implementation of the logicblox system. In: SIGMOD
Atserias A, Grohe M, Marx D (2013) Size bounds and query plans for relational joins. SIAM J Comput 42(4):1737–1767
Consens MP, Mendelzon AO, Vista D, Wood PT (1995) Constant propagation versus join reordering in datalog. In: Rules in database systems. Springer, Berlin, Heidelberg, pp 245–259
Cordella L, Foggia P, Sansone C, Vento M (2004) A (sub)graph isomorphism algorithm for matching large graphs. TPAMI 26(10):1367–1372
Dey S, Cuevas-Vicenttín V, Köhler S, Gribkoff E, Wang M, Ludäscher B (2013) On implementing provenance-aware regular path queries with relational query engines. In: Proceedings of the joint EDBT/ICDT 2013 workshops. ACM, pp 214–223
Färber F, Cha SK, Primsch J, Bornhövd C, Sigg S, Lehner W (2012) SAP HANA database: data management for modern business applications. SIGMOD Rec 40(4): 45–51
Fletcher GH, Peters J, Poulovassilis A (2016) Efficient regular path query evaluation using path indexes. In: Proceedings of the 19th international conference on extending database technology
Gou G, Chirkova R (2007) Efficiently querying large XML data repositories: a survey. TKDE 19(10): 1381–1403
Gubichev A, Bedathur SJ, Seufert S (2013) Sparqling Kleene: fast property paths in RDF-3X. In: Workshop on graph data management experiences and systems. ACM, pp 14–20
Han WS, Lee J, Lee JH (2013) Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: SIGMOD
Kankanamge C, Sahu S, Mhedbhi A, Chen J, Salihoglu S (2017) Graphflow: an active graph database. In: SIGMOD
Kochut KJ, Janik M (2007) SPARQLeR: extended SPARQL for semantic association discovery. In: The semantic web: research and applications. Springer, Berlin, pp 145–159
Koschmieder A, Leser U (2012) Regular path queries on large graphs. In: Scientific and statistical database management. Springer, Berlin/Heidelberg, pp 177–194
Lai L, Qin L, Lin X, Zhang Y, Chang L (2016) Scalable distributed subgraph enumeration. In: VLDB
Lai L, Qin L, Lin X, Chang L (2017) Scalable subgraph enumeration in MapReduce: a cost-oriented approach. VLDB J 26(3):421–446
Losemann K, Martens W (2012) The complexity of evaluating path expressions in SPARQL. In: Proceedings of the 31st symposium on principles of database systems. ACM, pp 101–112
Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: SIGMOD
Mendelzon A, Wood P (1995) Finding regular simple paths in graph databases. SIAM J Comput 24(6): 1235–1258
Neumann T, Weikum G (2010) The RDF-3X engine for scalable management of RDF data. VLDB 19:91–113
Ngo HQ, Porat E, Ré C, Rudra A (2012) Worst-case optimal join algorithms. In: PODS
Nguyen D, Aref M, Bravenboer M, Kollias G, Ngo HQ, Ré C, Rudra A (2015) Join processing for graph patterns: an old dog with new tricks. In: GRADES workshop
Olteanu D, Schleich M (2016) Factorized databases. SIGMOD Rec 45(2):5–16
Pérez J, Arenas M, Gutierrez C (2010) nSPARQL: a navigational language for RDF. Web Semant Sci Serv Agents World Wide Web 8(4):255–270
Qiao M, Zhang H, Cheng H (2017) Subgraph matching: on compression and computation. VLDB 11(2):17–188
Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422
Veldhuizen TL (2012) Leapfrog Triejoin: a worst-case optimal join algorithm. CoRR abs/1210.0481
Yakovets N, Godfrey P, Gryz J (2013) Evaluation of SPARQL property paths via recursive SQL. In: AMW
Yakovets N, Godfrey P, Gryz J (2016) Query planning for evaluating SPARQL property paths. In: SIGMOD, San Francisco, pp 1875–1889
Zauner H, Linse B, Furche T, Bry F (2010) A RPL through RDF: expressive navigation in RDF graphs. In: Web reasoning and rule systems. Springer, Heidelberg, pp 251–257
Zhao P, Han J (2010) On graph query optimization in large networks. VLDB 3(1–2):340–351
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Section Editor information
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG
About this entry
Cite this entry
Salihoglu, S., Yakovets, N. (2018). Graph Query Processing. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_215-1
Download citation
DOI: https://doi.org/10.1007/978-3-319-63962-8_215-1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Living Reference MathematicsReference Module Computer Science and Engineering