Abstract
Much recent research has focused on automatically extracting linguistic information from on-line corpora. There is no question that great progress has been made in applying machine learning to computational linguistics. We believe that, now that the field has matured, it is time to look inward and carefully examine the basic tenets of the corpus-based learning paradigm. The goal of this paper is to raise a number of issues that challenge the paradigm, in the hope of stimulating introspection and discussion that will make the field even stronger.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Brill, E. (2000). A Closer Look at the Automatic Induction of Linguistic Knowledge. In: Cussens, J., Džeroski, S. (eds) Learning Language in Logic. LLL 1999. Lecture Notes in Computer Science, vol 1925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40030-3_3
DOI: https://doi.org/10.1007/3-540-40030-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41145-1
Online ISBN: 978-3-540-40030-1
eBook Packages: Springer Book Archive