Abstract
Twig pattern matching (TPM) is the core operation of XML query processing. Existing approaches rely on either efficient data structures or novel labeling/indexing schemes to reduce the intermediate result size, but none of them takes into account the rich semantic information resided in XML document and the query issued. Moreover, in order to fulfill the semantics of the XPath/XQuery query, most of them require costly post processing to eliminate redundant matches and group matching results. In this paper, we propose an innovative semantics-aware query optimization approach to overcome these limitations. In particular, we exploit the functional dependency derived from the given semantic information to stop query processing early; we distinguish the output and predicate nodes of a query, then propose a query breakup technique and build a query plan, such that for each distinct query output, we avoid finding the redundant matches having the same results as the first match in most cases. Both I/O and structural join cost are saved, and much less intermediate results are produced. Experiments show the effectiveness of our optimization.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Al-Khalifa, S., Jagadish, H.V., Patel, J., Wu, Y., Koudas, N., Srivastava, D.: Structural joins: A primitive for efficient Xml query pattern matching. In: ICDE, pp. 141–152 (2002)
Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: optimal Xml pattern matching. In: SIGMOD, pp. 310–321 (2002)
Chen, S., Li, H., Tatemura, J., Hsiung, W., Agrawal, D., Candan, K.S.: Twig\(^{\mbox{2}}\)stack: Bottom-up processing of generalized-tree-pattern queries over Xml documents. In: VLDB, pp. 283–294 (2006)
Chen, T., Lu, J., Ling, T.W.: On boosting holism in Xml twig pattern matching using structural indexing techniques. In: SIGMOD, pp. 455–466 (2005)
Chippimolchai, P., Wuwongse, V., Anutariya, C.: Semantic query formulation and evaluation for Xml databases. In: WISE, pp. 205–214 (2002)
Chippimolchai, P., Wuwongse, V., Anutariya, C.: Towards semantic query optimization for Xml databases. In: ICDE Workshops (2005)
Clark, J., De Rose, S.: Xml path language xpath version 1.0 (1999)
Georgiadis, H., Vassalos, V.: Xpath on steroids: exploiting relational engines for xpath performance. In: SIGMOD, pp. 317–328 (2007)
Grust, T., Van Keulen, M., Teubner, J.: Accelerating xpath evaluation in any rdbms. ACM Trans. Database Syst. 29(1), 91–131 (2004)
Jiang, H., Wang, W., Lu, H., Yu, J.: Holistic twig joins on indexed Xml documents. In: Aberer, K., Koubarakis, M., Kalogeraki, V. (eds.) VLDB 2003. LNCS, vol. 2944, pp. 273–284. Springer, Heidelberg (2004)
Lu, J., Chen, T., Ling, T.: Efficient processing of xml twig patterns with parent child edges: A look-ahead approach (2004)
Lu, J., Ling, T., Chan, C., Chen, T.: From region encoding to extended dewey: On efficient processing of twig pattern matching. In: VLDB, pp. 193–204 (2005)
Florescu, D., Boag, S., Chamberlin, D., Robie, J.: Xquery 1.0: An Xml query language (2007)
Wu, X., Ling, T.W., Lee, M.-L., Dobbie, G.: Designing semistructured databases using ora-ss model. In: WISE, pp. 171–180 (2001)
Zhang, C., Naughton, J., DeWitt, J., Lohman, Q.L.a.: On supporting containment queries in relational database management systems. In: SIGMOD, pp. 425–436 (2001)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bao, Z., Ling, T.W., Lu, J., Chen, B. (2008). SemanticTwig: A Semantic Approach to Optimize XML Query Processing. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_22
Download citation
DOI: https://doi.org/10.1007/978-3-540-78568-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78567-5
Online ISBN: 978-3-540-78568-2
eBook Packages: Computer ScienceComputer Science (R0)