XML & Data Streams | SpringerLink
Skip to main content

XML & Data Streams

  • Chapter
Stream Data Management

Part of the book series: Advances in Database Systems ((ADBS,volume 30))

Abstract

XQuery path queries form the basis of complex matching and processing of XML data. Most current XML query processing techniques can be divided in two groups. Navigation-based algorithms compute results by analyzing an input stream of documents one tag at a time. In contrast, index-based algorithms take advantage of (precomputed or computed-on-demand) numbering schemes over each input XML document in the stream. In this chapter, we present an index-based technique, Index-Filter, to answer multiple path queries. Index-Filter uses indexes built over the document tags to avoid processing large portions of an input document that are guaranteed not to be part of any match. We analyze Index-Filter, compare it against Y-Filter, a state-of-the-art navigation-based technique, and present the advantages of each technique.

Reprinted, with permission, from &qout;Navigation-vs. Index-based XML Multi-query Processing." Nicolas Bruno, Luis Gravano, Nick Koudas and Divesh Srivastava. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 139–150, 2003.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 11439
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 14299
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
JPY 14299
Price includes VAT (Japan)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  • Mehmet Altinel and Michael J. Franklin. Efficient filtering of XML documents for selective dissemination of information. In Proceedings of VLDB, 2000.

    Google Scholar 

  • Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, and Yuqing Wu. Structural joins: A primitive for efficient XML query pattern matching. In Proceedings of ICDE, 2002.

    Google Scholar 

  • Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, and Jennifer Widom. Models and issues in data stream systems. In Proceedings of ACM PODS, 2002.

    Google Scholar 

  • Scott Boag, Don Chamberlin, Mary F. Fernandez, Michael Kay, Jonathan Robie, and Jerome Simeon. XML path language (XPath) 2.0. W3C Working Draft. Available from http://www.w3.org/TR/xpath20, July 2004.

    Google Scholar 

  • Scott Boag, Don Chamberlin, Mary F. Fernandez, Daniela Florescu, Jonathan Robie, and Jerome Simeon. XQuery 1.0: An XML query language. W3C Working Draft. Available from http://www.w3.org/TR/xquery, July 2004.

    Google Scholar 

  • Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari, Marcus Fontoura, and Vanja Josifovski. Streaming XPath processing with forward and backward axes. In Proceedings of ICDE, 2003.

    Google Scholar 

  • Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic twig joins: Optimal XML pattern matching. In Proceedings of ACM SIGMOD, 2002.

    Google Scholar 

  • Nicolas Bruno, Luis Gravano, Nick Koudas, and Divesh Srivastava. Navigation-vs. index-based XML multi-query processing. In Proceedings of ICDE, 2003.

    Google Scholar 

  • Chee Yong Chan, Pascal Felber, Minos Garofalakis, and Rajeev Rastogi. Efficient filtering of XML documents with XPath expressions. In Proceedings of ICDE, 2002.

    Google Scholar 

  • Mariano P. Consens and Tova Milo. Optimizing queries on files. In Proceedings of ACM SIGMOD, 1994.

    Google Scholar 

  • Chin-Wan Chung, Jun-Ki Min, and Kyuseok Shim. APEX: An adaptive path index for XML data. In Proceedings of ACM SIGMOD, 2002.

    Google Scholar 

  • Brian Cooper, Neal Sample, Michael J. Franklin, Gisli R. Hjaltason, and Moshe Shadmon. A fast index for semistructured data. In Proceedings of VLDB, 2001.

    Google Scholar 

  • Shu-Yao Chien, Zografoula Vagena, Donghui Zhang, Vassilis J. Tsotras, and Carlo Zaniolo. Efficient structural joins on indexed XML documents. In Proceedings of VLDB, 2002.

    Google Scholar 

  • David DeHaan, David Toman, Mariano P. Consens, and M. Tamer Ozsu. A comprehensive XQuery to SQL translation using dynamic interval encoding. In Proceedings of ACM SIGMOD, 2003.

    Google Scholar 

  • Yanlei Diao, Mehmet Altinel, Michael J. Franklin, Hao Zhang, and Peter M. Fischer. Path sharing and predicate evaluation for high-performance XML filtering. ACM TODS, 28(4), 2003.

    Google Scholar 

  • Daniela Florescu and Donald Kossmann. Storing and querying XML data using an RDBMS. IEEE Data Engineering Bulletin, 22(3), 1999.

    Google Scholar 

  • Thorsten Fiebig and Guido Moerkotte. Evaluating queries on structure with extended access support relations. In Proceedings of WebDB, 2000.

    Google Scholar 

  • Mary F. Fernandez, Wang Chiew Tan, and Dan Suciu. SilkRoute: Trading between relations and XML. Computer Networks, 33(1–6), 2000.

    Google Scholar 

  • Torsten Grust. Accelerating XPath location steps. In Proceedings of ACM SIGMOD, 2002.

    Google Scholar 

  • Todd J. Green, Gerome Miklau, Makoto Onizuka, and Dan Suciu. Processing XML streams with deterministic automata. In Proceedings of ICDT, 2003.

    Google Scholar 

  • Lukasz Golab and M. Tamer Ozsu. Processing sliding window multi-joins in continuous queries over data streams. In Proceedings of VLDB, 2003.

    Google Scholar 

  • Ashish Kumar Gupta and Dan Suciu. Stream processing of XPath queries with predicates. In Proceedings of ACM SIGMOD, 2003.

    Google Scholar 

  • Zachary G. Ives, Alon Y. Halevy, and Daniel S. Weld. An XML query engine for network-bound data. VLDB Journal, 11(4), 2002.

    Google Scholar 

  • Haifeng Jiang, Hongjun Lu, Wei Wang, Beng Chin Ooi. XR-Tree: Indexing XML data for efficient structural joins. In Proceedings of ICDE, 2003.

    Google Scholar 

  • Haifeng Jiang, Wei Wang, Hongjun Lu, and Jeffrey Xu Yu. Holistic twig joins on indexed XML documents. In Proceedings of VLDB, 2003.

    Google Scholar 

  • Jaewoo Kang, Jeffrey F. Naughton, and Stratis Viglas. Evaluating window joins over unbounded streams. In Proceedings of lCDE, 2003.

    Google Scholar 

  • Nick Koudas and Divesh Srivastava. Data stream query processing: a tutorial. In Proceedings of VLDB, 2003.

    Google Scholar 

  • Laks V. S. Lakshmanan and Sailaja Parthasarathy. On efficient matching of streaming XML documents and queries. In Proceedings of EDBT, 2002.

    Google Scholar 

  • Quanzhong Li and Bongki Moon. Indexing and querying XML data for regular path expressions. In Proceedings of VLDB, 2001.

    Google Scholar 

  • Jason McHugh, Serge Abiteboul, Roy Goldman, Dalian Quass, and Jennifer Widom. Lore: A database management system for semistructured data. SIGMOD Record, 26(3), 1997.

    Google Scholar 

  • Jeffrey F. Naughton, David J. DeWitt, David Maier, Ashraf Aboulnaga, Jianjun Chen, Leonidas Galanis, Jaewoo Kang, Rajasekar Krishnamurthy, Qiong Luo, Naveen Prakash, Ravishankar Ramamurthy, Jayavel Shanmugasundaram, Feng Tian, Kristin Tufte, Stratis Viglas, Yuan Wang, Chun Zhang, Bruce Jackson, Anurag Gupta, and Rushan Chen. The Niagara Internet query system. IEEE Data Engineering Bulletin, 24(2), 2001.

    Google Scholar 

  • Feng Peng and Sudarshan S. Chawathe. XPath queries on streaming data. In Proceedings of ACM SIGMOD, 2003.

    Google Scholar 

  • Joao Pereira, Francoise Fabret, Hans-Arno Jacobsen, Francois Llirbat, and Dennis Shasha. WebFilter: A high throughput XML-based publish and subscribe system. Proceedings of VLDB, 2001.

    Google Scholar 

  • Praveen Rao and Bongki Moon. PRIX: Indexing and querying XML using Prufer sequences. In Proceedings of ICDE, 2004.

    Google Scholar 

  • Gerard Salton and Michael J. McGill. Introduction to modern information retrieval. McGraw-Hill, 1983.

    Google Scholar 

  • Jayavel Shanmugasundaram, Eugene J. Shekita, Rimon Barr, Michael J. Carey, Bruce G. Lindsay, Hamid Pirahesh, and Berthold Reinwald. Efficiently publishing relational data as XML documents. In Proceedings of VLDB, 2000.

    Google Scholar 

  • Peter A. Tucker, David Maier, Tim Sheard, and Leonidas Fegaras. Exploiting punctuation semantics in continuous data streams. IEEE TKDE, 15(3), 2003.

    Google Scholar 

  • Stratis Viglas, Jeffrey F. Naughton, and Josef Burger. Maximizing the output rate of multi-way join queries over streaming information sources. In Proceedings of VLDB, 2003.

    Google Scholar 

  • Haixun Wang, Sanghyun Park, Wei Fan, and Philip S. Yu. ViST: a dynamic index method for querying XML data by tree structures. In Proceedings of ACM SIGMOD, 2003.

    Google Scholar 

  • Masatoshi Yoshikawa, Toshiyuki Amagasa, Takeyuki Shimura, and Shunsuke Uemura. XRel: a path-based approach to storage and retrieval of XML documents using relational databases. ACM TOIT, 1(1), 2001.

    Google Scholar 

  • Chun Zhang, Jeffrey F. Naughton, David J. DeWitt, Qiong Luo, and Guy M. Lohman. On supporting containment queries in relational database management systems. In Proceedings of SIGMOD, 2001.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 IEEE

About this chapter

Cite this chapter

Bruno, N., Gravano, L., Koudas, N., Srivastava, D. (2003). XML & Data Streams. In: Chaudhry, N.A., Shaw, K., Abdelguerfi, M. (eds) Stream Data Management. Advances in Database Systems, vol 30. Springer, Boston, MA. https://doi.org/10.1007/0-387-25229-0_4

Download citation

  • DOI: https://doi.org/10.1007/0-387-25229-0_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24393-1

  • Online ISBN: 978-0-387-25229-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics