Abstract
In this paper we describe our participation in INEX 2010 in the Ad Hoc Track and the Book Track. In the Ad Hoc Track we investigate the impact of propagated anchor-text on article-level precision and the impact of an element length prior on within-document precision and recall. Using the article ranking of a document-level run for both document and focused retrieval techniques, we find that focused retrieval techniques clearly outperform document retrieval, especially for the Focused and Restricted Relevant in Context Tasks, which limit the amount of text that can be returned per topic and per article, respectively. Somewhat surprisingly, an element length prior increases within-document precision even when we restrict the amount of retrieved text to only 1000 characters per topic. The query-independent evidence of the length prior can help locate elements with a large fraction of relevant text. For the Book Track we look at the relative impact of retrieval units based on whole books, individual pages and multiple pages.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kamps, J., Koolen, M. (2011). Focus and Element Length for Book and Wikipedia Retrieval. In: Geva, S., Kamps, J., Schenkel, R., Trotman, A. (eds) Comparative Evaluation of Focused Retrieval. INEX 2010. Lecture Notes in Computer Science, vol 6932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23577-1_12
DOI: https://doi.org/10.1007/978-3-642-23577-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23576-4
Online ISBN: 978-3-642-23577-1
eBook Packages: Computer Science, Computer Science (R0)