Abstract
We present an adaptation of the Regularized Least-Squares algorithm for the rank learning problem and an application of the method to reranking of the parses produced by the Link Grammar (LG) dependency parser. We study the use of several grammatically motivated features extracted from parses and evaluate the ranker with individual features and the combination of all features on a set of biomedical sentences annotated for syntactic dependencies. Using a parse goodness function based on the F-score, we demonstrate that our method produces a statistically significant increase in rank correlation from 0.18 to 0.42 compared to the built-in ranking heuristics of the LG parser. Further, we analyze the performance of our ranker with respect to the number of sentences and parses per sentence used for training and illustrate that the method is applicable to sparse datasets, showing improved performance with as few as 100 training sentences.
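The two ingredients named in the abstract, a regularized least-squares scorer and a rank correlation measure, can be illustrated with a minimal sketch. This is a generic linear-kernel RLS regressor, not the authors' adapted algorithm, whose details the abstract does not give. The feature vectors, goodness values, regularization parameter `lam`, and function names are all hypothetical, and Kendall's tau-a is assumed as the correlation measure.

```python
import numpy as np

def rls_fit(X, y, lam=1.0):
    """Generic dual-form RLS with a linear kernel: solve (K + lam*I) a = y."""
    K = X @ X.T                      # kernel matrix between training parses
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def rls_predict(X_train, a, X_new):
    """Score new parses: f(x) = sum_i a_i * k(x_i, x)."""
    return (X_new @ X_train.T) @ a

def kendall_tau_a(u, v):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(u)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(u[i] - u[j]) * np.sign(v[i] - v[j])
    return s / (n * (n - 1) / 2)

# Hypothetical toy data: each row is a feature vector for one parse,
# each target a made-up goodness value (e.g. an F-score against a gold parse).
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_train = np.array([0.2, 0.1, 0.3, 0.5])

# Candidate parses of a new sentence and their (made-up) true goodness values.
X_new = np.array([[3.0, 0.0], [0.0, 2.0], [1.0, 3.0]])
y_new = np.array([0.6, 0.2, 0.5])

a = rls_fit(X_train, y_train, lam=1e-3)
scores = rls_predict(X_train, a, X_new)
tau = kendall_tau_a(scores, y_new)
print("predicted scores:", scores)
print("rank correlation:", tau)
```

Sorting the candidate parses of a sentence by the predicted score gives the reranked list, and the rank correlation compares that ordering with the ordering induced by the goodness function.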
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsivtsivadze, E., Pahikkala, T., Pyysalo, S., Boberg, J., Mylläri, A., Salakoski, T. (2005). Regularized Least-Squares for Parse Ranking. In: Famili, A.F., Kok, J.N., Peña, J.M., Siebes, A., Feelders, A. (eds) Advances in Intelligent Data Analysis VI. IDA 2005. Lecture Notes in Computer Science, vol 3646. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11552253_42
DOI: https://doi.org/10.1007/11552253_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28795-7
Online ISBN: 978-3-540-31926-9
eBook Packages: Computer Science (R0)