Abstract
XQuery path queries form the basis of complex matching and processing of XML data. Most current XML query processing techniques can be divided in two groups. Navigation-based algorithms compute results by analyzing an input stream of documents one tag at a time. In contrast, index-based algorithms take advantage of (precomputed or computed-on-demand) numbering schemes over each input XML document in the stream. In this chapter, we present an index-based technique, Index-Filter, to answer multiple path queries. Index-Filter uses indexes built over the document tags to avoid processing large portions of an input document that are guaranteed not to be part of any match. We analyze Index-Filter, compare it against Y-Filter, a state-of-the-art navigation-based technique, and present the advantages of each technique.
Reprinted, with permission, from &qout;Navigation-vs. Index-based XML Multi-query Processing." Nicolas Bruno, Luis Gravano, Nick Koudas and Divesh Srivastava. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 139–150, 2003.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Mehmet Altinel and Michael J. Franklin. Efficient filtering of XML documents for selective dissemination of information. In Proceedings of VLDB, 2000.
Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, and Yuqing Wu. Structural joins: A primitive for efficient XML query pattern matching. In Proceedings of ICDE, 2002.
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, and Jennifer Widom. Models and issues in data stream systems. In Proceedings of ACM PODS, 2002.
Scott Boag, Don Chamberlin, Mary F. Fernandez, Michael Kay, Jonathan Robie, and Jerome Simeon. XML path language (XPath) 2.0. W3C Working Draft. Available from http://www.w3.org/TR/xpath20, July 2004.
Scott Boag, Don Chamberlin, Mary F. Fernandez, Daniela Florescu, Jonathan Robie, and Jerome Simeon. XQuery 1.0: An XML query language. W3C Working Draft. Available from http://www.w3.org/TR/xquery, July 2004.
Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari, Marcus Fontoura, and Vanja Josifovski. Streaming XPath processing with forward and backward axes. In Proceedings of ICDE, 2003.
Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic twig joins: Optimal XML pattern matching. In Proceedings of ACM SIGMOD, 2002.
Nicolas Bruno, Luis Gravano, Nick Koudas, and Divesh Srivastava. Navigation-vs. index-based XML multi-query processing. In Proceedings of ICDE, 2003.
Chee Yong Chan, Pascal Felber, Minos Garofalakis, and Rajeev Rastogi. Efficient filtering of XML documents with XPath expressions. In Proceedings of ICDE, 2002.
Mariano P. Consens and Tova Milo. Optimizing queries on files. In Proceedings of ACM SIGMOD, 1994.
Chin-Wan Chung, Jun-Ki Min, and Kyuseok Shim. APEX: An adaptive path index for XML data. In Proceedings of ACM SIGMOD, 2002.
Brian Cooper, Neal Sample, Michael J. Franklin, Gisli R. Hjaltason, and Moshe Shadmon. A fast index for semistructured data. In Proceedings of VLDB, 2001.
Shu-Yao Chien, Zografoula Vagena, Donghui Zhang, Vassilis J. Tsotras, and Carlo Zaniolo. Efficient structural joins on indexed XML documents. In Proceedings of VLDB, 2002.
David DeHaan, David Toman, Mariano P. Consens, and M. Tamer Ozsu. A comprehensive XQuery to SQL translation using dynamic interval encoding. In Proceedings of ACM SIGMOD, 2003.
Yanlei Diao, Mehmet Altinel, Michael J. Franklin, Hao Zhang, and Peter M. Fischer. Path sharing and predicate evaluation for high-performance XML filtering. ACM TODS, 28(4), 2003.
Daniela Florescu and Donald Kossmann. Storing and querying XML data using an RDBMS. IEEE Data Engineering Bulletin, 22(3), 1999.
Thorsten Fiebig and Guido Moerkotte. Evaluating queries on structure with extended access support relations. In Proceedings of WebDB, 2000.
Mary F. Fernandez, Wang Chiew Tan, and Dan Suciu. SilkRoute: Trading between relations and XML. Computer Networks, 33(1–6), 2000.
Torsten Grust. Accelerating XPath location steps. In Proceedings of ACM SIGMOD, 2002.
Todd J. Green, Gerome Miklau, Makoto Onizuka, and Dan Suciu. Processing XML streams with deterministic automata. In Proceedings of ICDT, 2003.
Lukasz Golab and M. Tamer Ozsu. Processing sliding window multi-joins in continuous queries over data streams. In Proceedings of VLDB, 2003.
Ashish Kumar Gupta and Dan Suciu. Stream processing of XPath queries with predicates. In Proceedings of ACM SIGMOD, 2003.
Zachary G. Ives, Alon Y. Halevy, and Daniel S. Weld. An XML query engine for network-bound data. VLDB Journal, 11(4), 2002.
Haifeng Jiang, Hongjun Lu, Wei Wang, Beng Chin Ooi. XR-Tree: Indexing XML data for efficient structural joins. In Proceedings of ICDE, 2003.
Haifeng Jiang, Wei Wang, Hongjun Lu, and Jeffrey Xu Yu. Holistic twig joins on indexed XML documents. In Proceedings of VLDB, 2003.
Jaewoo Kang, Jeffrey F. Naughton, and Stratis Viglas. Evaluating window joins over unbounded streams. In Proceedings of lCDE, 2003.
Nick Koudas and Divesh Srivastava. Data stream query processing: a tutorial. In Proceedings of VLDB, 2003.
Laks V. S. Lakshmanan and Sailaja Parthasarathy. On efficient matching of streaming XML documents and queries. In Proceedings of EDBT, 2002.
Quanzhong Li and Bongki Moon. Indexing and querying XML data for regular path expressions. In Proceedings of VLDB, 2001.
Jason McHugh, Serge Abiteboul, Roy Goldman, Dalian Quass, and Jennifer Widom. Lore: A database management system for semistructured data. SIGMOD Record, 26(3), 1997.
Jeffrey F. Naughton, David J. DeWitt, David Maier, Ashraf Aboulnaga, Jianjun Chen, Leonidas Galanis, Jaewoo Kang, Rajasekar Krishnamurthy, Qiong Luo, Naveen Prakash, Ravishankar Ramamurthy, Jayavel Shanmugasundaram, Feng Tian, Kristin Tufte, Stratis Viglas, Yuan Wang, Chun Zhang, Bruce Jackson, Anurag Gupta, and Rushan Chen. The Niagara Internet query system. IEEE Data Engineering Bulletin, 24(2), 2001.
Feng Peng and Sudarshan S. Chawathe. XPath queries on streaming data. In Proceedings of ACM SIGMOD, 2003.
Joao Pereira, Francoise Fabret, Hans-Arno Jacobsen, Francois Llirbat, and Dennis Shasha. WebFilter: A high throughput XML-based publish and subscribe system. Proceedings of VLDB, 2001.
Praveen Rao and Bongki Moon. PRIX: Indexing and querying XML using Prufer sequences. In Proceedings of ICDE, 2004.
Gerard Salton and Michael J. McGill. Introduction to modern information retrieval. McGraw-Hill, 1983.
Jayavel Shanmugasundaram, Eugene J. Shekita, Rimon Barr, Michael J. Carey, Bruce G. Lindsay, Hamid Pirahesh, and Berthold Reinwald. Efficiently publishing relational data as XML documents. In Proceedings of VLDB, 2000.
Peter A. Tucker, David Maier, Tim Sheard, and Leonidas Fegaras. Exploiting punctuation semantics in continuous data streams. IEEE TKDE, 15(3), 2003.
Stratis Viglas, Jeffrey F. Naughton, and Josef Burger. Maximizing the output rate of multi-way join queries over streaming information sources. In Proceedings of VLDB, 2003.
Haixun Wang, Sanghyun Park, Wei Fan, and Philip S. Yu. ViST: a dynamic index method for querying XML data by tree structures. In Proceedings of ACM SIGMOD, 2003.
Masatoshi Yoshikawa, Toshiyuki Amagasa, Takeyuki Shimura, and Shunsuke Uemura. XRel: a path-based approach to storage and retrieval of XML documents using relational databases. ACM TOIT, 1(1), 2001.
Chun Zhang, Jeffrey F. Naughton, David J. DeWitt, Qiong Luo, and Guy M. Lohman. On supporting containment queries in relational database management systems. In Proceedings of SIGMOD, 2001.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2003 IEEE
About this chapter
Cite this chapter
Bruno, N., Gravano, L., Koudas, N., Srivastava, D. (2003). XML & Data Streams. In: Chaudhry, N.A., Shaw, K., Abdelguerfi, M. (eds) Stream Data Management. Advances in Database Systems, vol 30. Springer, Boston, MA. https://doi.org/10.1007/0-387-25229-0_4
Download citation
DOI: https://doi.org/10.1007/0-387-25229-0_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-24393-1
Online ISBN: 978-0-387-25229-2
eBook Packages: Computer ScienceComputer Science (R0)