Abstract
It has been noted before in this book that patent retrieval is different from, and more complicated than, “standard” information retrieval. Evaluation of patent retrieval engines has also been shown to require specific attention. In this chapter, we continue making this point, but emphasize the efforts undertaken in one specific domain, namely chemistry. We approach the issue from two perspectives. The first is scalability: largely similar to the CLEF-IP efforts, this perspective targets the problem of handling a large number of documents and, potentially, a large number of queries. The second concerns the issues raised by the specific characteristics of chemistry documents. We describe here how we manually created a set of topics to reflect the kinds of information requests that a patent searcher, or a general researcher, might have. The results of the first year’s track are presented as well, together with directions and desiderata for the following years.
Notes
- 5. The MAREC DTDs are publicly available together with the MAREC data, conditioned on the signing of a license agreement.
- 6. Note that we refer to what we give the participants as a ‘topic’, and to what they actually put into their system to obtain results as a ‘query’.
Acknowledgements
The authors would like to thank the NIST TREC organizers for supporting this evaluation campaign, Matrixware Information Services GmbH for the patent corpus, Richard Kidd from the Royal Society of Chemistry for providing the initial collection of scientific articles, and all the other editors of the journals that provided articles for the second-year campaign. Last, but certainly not least, the authors express their gratitude to the domain experts who volunteered to provide the manual topics and to evaluate the results of the participants: Teresa Loughbrough, Henk Tomas, Monika Hanelt, Anthony Trippe, Madeleine Marley and her team, and Carlos Faerman.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lupu, M., Huang, J., Zhu, J. (2011). Evaluation of Chemical Information Retrieval Tools. In: Lupu, M., Mayer, K., Tait, J., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 29. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19231-9_5
DOI: https://doi.org/10.1007/978-3-642-19231-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19230-2
Online ISBN: 978-3-642-19231-9
eBook Packages: Computer Science (R0)