Are Semantically Related Links More Effective for Retrieval?

Koolen, Marijn; Kamps, Jaap

doi:10.1007/978-3-642-20161-5_11


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6611)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Why do links work? Link-based ranking algorithms rest on the often implicit assumption that linked documents are semantically related to each other, and that link information is therefore useful for retrieval. Although the benefits of link information are well researched, this underlying assumption about why link evidence works remains untested, and the main aim of this paper is to test it. Specifically, we use Wikipedia because it combines a dense link structure with a large category structure, which allows for an independent measurement of the semantic relatedness of linked documents. Our main findings are that: 1) global, query-independent link evidence is not affected by the semantic nature of the links, and 2) for local, query-dependent link evidence, the effectiveness of links increases as their semantic distance decreases. That is, we directly observe that links between semantically related pages are more effective for ad hoc retrieval than links between unrelated ones. These findings confirm and quantify the underlying assumption of existing link-based methods, shedding further light on the nature of link evidence. Such deeper understanding is instrumental for the development of novel link-based methods.
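
As a rough illustration of how a category structure can give a link-independent estimate of semantic distance, the sketch below measures the shortest path between two pages through a shared ancestor category and then groups links by that distance. The abstract does not specify the measure used in the paper, so the distance definition, the function names, and the toy data are all assumptions made for illustration only.

```python
from collections import deque


def category_distance(cats_a, cats_b, parents):
    """Length of the shortest path between two pages via a shared category.

    cats_a, cats_b: sets of categories assigned to each page (hypothetical input).
    parents: dict mapping a category to its parent categories (hypothetical input).
    Returns None when the two pages share no ancestor category.
    """
    def ancestor_depths(start):
        # Multi-source breadth-first search upward through the hierarchy.
        # The edge from a page to one of its own categories counts as 1.
        depth = {c: 1 for c in start}
        queue = deque(start)
        while queue:
            cat = queue.popleft()
            for parent in parents.get(cat, ()):
                if parent not in depth:
                    depth[parent] = depth[cat] + 1
                    queue.append(parent)
        return depth

    up_a, up_b = ancestor_depths(cats_a), ancestor_depths(cats_b)
    shared = up_a.keys() & up_b.keys()
    if not shared:
        return None
    return min(up_a[c] + up_b[c] for c in shared)


def links_by_distance(links, page_cats, parents):
    """Group (source, target) links by the category distance of their endpoints."""
    buckets = {}
    for src, dst in links:
        d = category_distance(page_cats[src], page_cats[dst], parents)
        buckets.setdefault(d, []).append((src, dst))
    return buckets


# Toy data, invented for this example (not taken from the paper).
parents = {
    "Search engines": ["Information retrieval"],
    "Information retrieval": ["Computer science"],
    "Databases": ["Computer science"],
    "Dutch painters": ["Painters"],
}
page_cats = {
    "PageRank": {"Search engines"},
    "Web crawler": {"Search engines"},
    "SQL": {"Databases"},
    "Rembrandt": {"Dutch painters"},
}
links = [("PageRank", "Web crawler"), ("PageRank", "SQL"), ("PageRank", "Rembrandt")]
buckets = links_by_distance(links, page_cats, parents)
```

With buckets like these, one could restrict local link evidence to links below a chosen distance and compare retrieval effectiveness across thresholds, which is the kind of comparison the second finding describes.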

Author information

Authors and Affiliations

Archives and Information Studies, University of Amsterdam, The Netherlands
Marijn Koolen & Jaap Kamps
ISLA, University of Amsterdam, The Netherlands
Jaap Kamps

Editor information

Editors and Affiliations

Information School, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK
Paul Clough
CLARITY: Centre for Sensor Web Technologies, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Colum Foley, Cathal Gurrin & Hyowon Lee
Centre for Next Generation Localisation, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Gareth J. F. Jones
TNO Human Factors, Brassersplein 2, 2612 CT, Delft, The Netherlands
Wessel Kraaij
Yahoo! Research, 177 Diagonal, 08018, Barcelona, Spain
Vanessa Murdock

About this paper

Cite this paper

Koolen, M., Kamps, J. (2011). Are Semantically Related Links More Effective for Retrieval? In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_11

DOI: https://doi.org/10.1007/978-3-642-20161-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science, Computer Science (R0)
