Abstract
Although the native (tree-like) storage of XML data is becoming more and more important, there will be an enduring demand to manage XML data in its textual representation, for instance in relational structures or file systems. XML data has to be well-formed by definition and, in many cases, it additionally has to be valid according to a given XML schema. Because XML column types are often derived from text types (e.g. CLOBs), guaranteeing well-formedness as well as validity is not trivial. Even worse, for frequently modified data it is usually too expensive to re-validate the whole XML data after each update, but waiving re-validation may lead to inconsistencies and malfunctions of applications. In this paper we present a schema-aware pushdown automaton (i.e. a stack machine) that validates an XML string/stream. Using an element/state index, the pushdown automaton is able to re-validate local modifications of the data while guaranteeing overall validity. Update operations (e.g. SQLXML, XQuery updates) are validated before they are executed.
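To make the idea of a schema-aware stack machine concrete, the following is a minimal sketch and not the construction from the paper. It assumes a hypothetical toy schema (`SCHEMA`, `ROOT`) in which every element has a small finite automaton over the names of its children, and it assumes the input is already tokenized into `("start", name)` and `("end", name)` events. The stack holds one entry per open element together with the current state of that element's content model, so well-formedness (matching tags) and validity (allowed children) are checked in a single pass.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical content model: (start state, transitions, accepting states).
# Transitions map (state, child element name) to the next state.
ContentModel = Tuple[int, Dict[Tuple[int, str], int], Set[int]]

# Hypothetical toy schema, invented for illustration only.
SCHEMA: Dict[str, ContentModel] = {
    # order := item+
    "order": (0, {(0, "item"): 1, (1, "item"): 1}, {1}),
    # item := name, price?
    "item": (0, {(0, "name"): 1, (1, "price"): 2}, {1, 2}),
    # name and price have no element children
    "name": (0, {}, {0}),
    "price": (0, {}, {0}),
}
ROOT = "order"


def validate(events: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the event stream is well-formed and valid for SCHEMA."""
    stack: List[List] = []  # entries are [element name, current state]
    seen_root = False
    for kind, name in events:
        if kind == "start":
            if name not in SCHEMA:
                return False  # element is not declared
            if stack:
                parent, state = stack[-1]
                nxt = SCHEMA[parent][1].get((state, name))
                if nxt is None:
                    return False  # child not allowed at this position
                stack[-1][1] = nxt  # advance the parent's content model
            else:
                if seen_root or name != ROOT:
                    return False  # wrong or second root element
                seen_root = True
            stack.append([name, SCHEMA[name][0]])
        elif kind == "end":
            if not stack:
                return False  # closing tag without an open element
            open_name, state = stack.pop()
            if open_name != name:
                return False  # mismatched tags: not well-formed
            if state not in SCHEMA[name][2]:
                return False  # content model incomplete
        else:
            return False  # unknown event kind
    return seen_root and not stack


if __name__ == "__main__":
    doc = [
        ("start", "order"),
        ("start", "item"), ("start", "name"), ("end", "name"), ("end", "item"),
        ("end", "order"),
    ]
    print(validate(doc))
```

Because each stack entry records the state reached inside an open element, storing such states alongside element positions is one plausible way to restart validation at a modified fragment instead of at the beginning of the document; how the paper's element/state index does this is not described in the abstract.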
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hammerschmidt, B.C., Werner, C., Brandt, Y., Linnemann, V., Groppe, S., Fischer, S. (2007). Incremental Validation of String-Based XML Data in Databases, File Systems, and Streams. In: Ioannidis, Y., Novikov, B., Rachev, B. (eds) Advances in Databases and Information Systems. ADBIS 2007. Lecture Notes in Computer Science, vol 4690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75185-4_23
DOI: https://doi.org/10.1007/978-3-540-75185-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75184-7
Online ISBN: 978-3-540-75185-4
eBook Packages: Computer Science, Computer Science (R0)