Abstract
To remove the residual errors made by purely stochastic disambiguation models, we propose a hybrid system in which linguist users write high-level contextual rules that are applied in combination with a tagger based on a Hidden Markov Model. The design of these rules is inspired by the Constraint Grammar formalism. In this work, we review that formalism in order to propose a more intuitive syntax and semantics for the rules, and we develop a strategy for compiling the rules into Finite State Transducers, which guarantees an efficient execution framework.
This work has been partially supported by the Spanish Government (under projects TIC2000-0370-C02-01 and HP2001-0044), and by the Galician Government (under project PGIDT01PXI10506PN).
Galena stands for Generation of Natural Language Analyzers, and Corga stands for Reference Corpus of Current Galician. See http://coleweb.dc.fi.udc.es for more information on both projects.
References
Brill, E. (1994). Some advances in rule-based part of speech tagging. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94).
Graña, J.; Chappelier, J.-C.; Vilares, M. (2001). Integrating external dictionaries into part-of-speech taggers. In Proc. of the Euroconference on Recent Advances in Natural Language Processing (RANLP-2001), pp. 122–128.
Karlsson, F.; Voutilainen, A.; Heikkilä, J.; Anttila, A. (1995). Constraint grammar: a language-independent system for parsing unrestricted text. Mouton de Gruyter, Berlin.
Mohri, M. (1997). Finite-state transducers in language and speech processing. Computational Linguistics, vol. 23(2), pp. 269–311.
Padró, L. (1996). POS tagging using relaxation labelling. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96).
Viterbi, A.J. (1967). Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Trans. Information Theory, vol. IT-13 (April).
Voutilainen, A.; Heikkilä, J. (1994). An English constraint grammar (EngCG): a surface-syntactic parser of English. In Fries, Tottie and Schneider (eds.), Creating and using English language corpora, Rodopi.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Graña, J., Andrade, G., Vilares, J. (2003). Compilation of Constraint-Based Contextual Rules for Part-of-Speech Tagging into Finite State Transducers. In: Champarnaud, JM., Maurel, D. (eds) Implementation and Application of Automata. CIAA 2002. Lecture Notes in Computer Science, vol 2608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44977-9_12
DOI: https://doi.org/10.1007/3-540-44977-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40391-3
Online ISBN: 978-3-540-44977-5
eBook Packages: Springer Book Archive