Abstract
This paper presents a method for developing limited-context grammar rules that mark up text automatically by assigning specific text segments to a small number of well-defined, application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used to support the iterative development of the rule set and to construct and manage the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic markup of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF), to facilitate their further exchange. The paper also describes the implemented mechanism for exporting the semantic markup into NITF.
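To make the idea of limited-context rules concrete, the following is a minimal sketch in Python, not the TATOE or XGrammar implementation. It assumes a tiny hand-made lexicon (`TITLES`, `LOCATIONS`), one rule that reads capitalized tokens after a title word as a person name, and illustrative inline tags (`person`, `location`) in the style of SGML markup. The function names, the lexicon entries, and the rule itself are hypothetical and chosen only for illustration.

```python
import re
from html import escape

# Hypothetical lexical resources; the actual lexicon is not given in the abstract.
TITLES = {"Bundeskanzler", "Minister", "Präsident"}
LOCATIONS = {"Bonn", "Berlin", "Hamburg"}


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def mark_up(tokens):
    """Return (category, tokens) segments; category is None for unmarked text."""
    segments = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Limited-context rule (assumed): a title word followed by
        # capitalized tokens is taken to introduce a person name.
        if tok in TITLES:
            j = i + 1
            while (j < len(tokens) and tokens[j][0].isupper()
                   and tokens[j] not in LOCATIONS):
                j += 1
            if j > i + 1:
                segments.append((None, [tok]))
                segments.append(("person", tokens[i + 1:j]))
                i = j
                continue
        # Lexicon rule (assumed): a known place name is marked as a location.
        if tok in LOCATIONS:
            segments.append(("location", [tok]))
        else:
            segments.append((None, [tok]))
        i += 1
    return segments


def export_sgml(segments):
    """Serialize segments as text with inline SGML-style tags."""
    parts = []
    for category, toks in segments:
        text = escape(" ".join(toks), quote=False)
        parts.append(f"<{category}>{text}</{category}>" if category else text)
    return " ".join(parts)


if __name__ == "__main__":
    sentence = "Bundeskanzler Helmut Kohl reiste nach Bonn."
    print(export_sgml(mark_up(tokenize(sentence))))
```

The rule looks only at a token and its immediate right neighbours, which is what keeps the context limited. A real rule set would be refined iteratively against a corpus, as the abstract describes.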
Rostek, L., Alexa, M. Marking up in TATOE and exporting to SGML. Computers and the Humanities 31, 311–326 (1997). https://doi.org/10.1023/A:1001070608920