Abstract
In recent years, there has been considerable interest in using Natural Language Processing in Information Retrieval research, with specific implementations varying from word-level morphological analysis to syntactic parsing to conceptual-level semantic analysis. In particular, different degrees of phrase-level syntactic information have been incorporated in information retrieval systems working on English or on other Germanic languages such as Dutch. In this paper we study the impact of using such information, in the form of syntactic dependency pairs, on the performance of a text retrieval system for a Romance language, Spanish.
The research reported in this article has been supported in part by Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica (Grant TIC2000-0370-C02-01), Ministerio de Ciencia y Tecnología (Grant HP2001-0044) and Xunta de Galicia (Grant PGIDT01PXI10506PN).
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alonso, M.A., Vilares, J., Darriba, V.M. (2002). On the Usefulness of Extracting Syntactic Dependencies for Text Indexing. In: O’Neill, M., Sutcliffe, R.F.E., Ryan, C., Eaton, M., Griffith, N.J.L. (eds) Artificial Intelligence and Cognitive Science. AICS 2002. Lecture Notes in Computer Science, vol 2464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45750-X_1
DOI: https://doi.org/10.1007/3-540-45750-X_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44184-7
Online ISBN: 978-3-540-45750-3
eBook Packages: Springer Book Archive