Abstract
During a two-day strategic workshop in February 2018, 22 information retrieval researchers met to discuss the future challenges and opportunities within the field. The outcome is a list of potential research directions, project ideas, and challenges. This report describes the major conclusions reached during the workshop. A key result is that we need to open our minds and embrace a broader IR field by rethinking the definitions of information, retrieval, user, system, and evaluation in IR. By providing detailed discussions of these topics, this report is intended to inspire IR researchers in both academia and industry and to support the future growth of the IR research community.
Acknowledgements
We would like to thank the Chinese Information Processing Society of China, CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, and the ACM SIGIR Beijing Chapter for supporting the strategic workshop. We also thank Professor Bo Zhang (Tsinghua University) and Ming Zhou (Microsoft Research Asia) for contributing the keynotes and valuable discussions in the workshop.
Author information
Zhumin Chen is an associate professor in the School of Computer Science and Technology, Shandong University, China. His research interests include information retrieval and natural language processing. His research is supported by, among others, the Natural Science Fund of China and the Key Science and Technology Innovation Project of Shandong Province.
Xueqi Cheng is a full professor and vice director of the Institute of Computing Technology, Chinese Academy of Sciences (CAS), and the director of the CAS Key Laboratory of Network Data Science and Technology, China. His research areas include Web search and data mining, data science, and social media analytics. He is the general secretary of the CCF Committee on Big Data, the vice-chair of the CIPS Committee on Information Retrieval, and a general co-chair of SIGIR’20 and WSDM’15. He is the principal investigator of more than 10 major research projects funded by NSFC and MOST. He was awarded the NSFC Distinguished Youth Scientist (2014), the National Prize for Progress in Science and Technology (2012), and the China Youth Science and Technology Award (2011).
Shoubin Dong received the PhD degree in electronic engineering from the University of Science and Technology of China (USTC), China in 1994. She was a visiting scholar at the School of Computer Science and Language Technology Institute, Carnegie Mellon University (CMU), Pittsburgh, USA from 2001 to 2002. She is now a professor with the School of Computer Science and Engineering, South China University of Technology (SCUT), China. Her research interests include information retrieval, natural language processing, high-performance computing, etc.
Zhicheng Dou is currently a professor at the School of Information, Renmin University of China, China. He received his PhD and BS degrees in computer science and technology from Nankai University, China in 2008 and 2003, respectively. He worked at Microsoft Research Asia from July 2008 to September 2014. His current research interests are information retrieval, natural language processing, and big data analytics.
Jiafeng Guo is a professor at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences, China. He has worked on a number of topics related to web search and data mining. His current research is focused on representation learning and neural models for information retrieval and filtering. He has published more than 80 papers in several top conferences and journals. His work on IR has received the Best Paper Award at ACM CIKM (2011), the Best Student Paper Award at ACM SIGIR (2012), and the Best Full Paper Runner-up Award at ACM CIKM (2017). Moreover, he has served as a PC member for prestigious conferences including SIGIR, WWW, KDD, WSDM, and ACL, and as an associate editor of TOIS.
Xuanjing Huang is a professor in the School of Computer Science, Fudan University, China. Her research interests include natural language processing, information retrieval, artificial intelligence, deep learning, and data-intensive computing. She has published more than 100 papers in major conferences including ACL, SIGIR, IJCAI, AAAI, NIPS, ICML, CIKM, EMNLP, WSDM, and COLING. In the research community, she has served as the PC Co-Chair of CCL 2019, NLPCC 2017, CCL 2016, SMP 2015, and SMP 2014, the organizer of WSDM 2015, the competition chair of CIKM 2014, the tutorial chair of COLING 2010, and an SPC or PC member of past WSDM, SIGIR, WWW, CIKM, ACL, IJCAI, KDD, EMNLP, COLING, and many other conferences.
Yanyan Lan is a professor at the Institute of Computing Technology, Chinese Academy of Sciences, China. She leads a research group working on big data and machine learning. Her current research interests include machine learning, information retrieval, and natural language processing. From April 2018 to March 2019, she was a visiting scholar in the Department of Statistics, UC Berkeley. She has published over 70 papers at top conferences including ICML, NIPS, SIGIR, and WWW. Her paper entitled “Top-k Learning to Rank: Labeling, Ranking, and Evaluation” won the Best Student Paper Award of SIGIR 2012, and her paper entitled “Learning Visual Features from Snapshots for Web Search” won the Best Paper Runner-Up Award of CIKM 2017.
Chenliang Li received his PhD from Nanyang Technological University, Singapore in 2013. Currently, he is an associate professor at the School of Cyber Science and Engineering, Wuhan University, China. His research interests include information retrieval, text/web mining, data mining, and natural language processing. He is a co-recipient of the Best Student Paper Award Honorable Mention at ACM SIGIR 2016, and serves as an editorial board member of JASIST and IPM.
Ru Li is a professor and PhD supervisor. She is the vice dean of the School of Computer and Information Technology and the School of Big Data of Shanxi University, a standing council member of the Chinese Information Processing Society (CIPS), and a member of the CIPS committees on Computational Linguistics, Information Retrieval, and Language and Knowledge Computing. Her research interests include Chinese information processing and information retrieval. She has published more than 70 papers in important international and national academic journals and conferences, including IEEE Transactions on Knowledge and Data Engineering, the Annual Meeting of the Association for Computational Linguistics, and the International Conference on Computational Linguistics. She has won three Second Prizes for scientific and technological progress in Shanxi.
Tie-Yan Liu is an assistant managing director of Microsoft Research Asia, a fellow of the IEEE, and a distinguished member of the ACM. He is an adjunct professor at Carnegie Mellon University (CMU) and Tsinghua University. His research interests include learning to rank, deep learning, reinforcement learning, and distributed learning. He has served as general chair, program committee chair, local chair, or area chair for a dozen international conferences including WWW/WebConf, SIGIR, KDD, ICML, NIPS, IJCAI, AAAI, ACL, and ICTIR, as well as an associate editor of ACM Transactions on Information Systems, ACM Transactions on the Web, and Neurocomputing.
Yiqun Liu is a professor and department co-chair at the Department of Computer Science and Technology, Tsinghua University, China. His major research interests are in Web search, user behavior analysis, and natural language processing. He also works as a visiting research professor at the National University of Singapore and a visiting professor at the National Institute of Informatics (NII) of Japan, and is a member of the Tiangong AI Research Center, which was founded by Tsinghua and Sogou Inc.
Jun Ma received his BSc, MSc, and PhD degrees from Shandong University in China, and Ibaraki University and Kyushu University in Japan, respectively. He is a full professor at Shandong University. He was a senior researcher in the Department of Computer Science at Ibaraki University in 1994 and at the German GMD and Fraunhofer from 1999 to 2003. His research interests include information retrieval, Web data mining, recommendation systems, and machine learning. He has published more than 150 papers in international journals and conferences, including SIGIR, WWW, MM, TOIS, and TKDE. He is a member of the ACM and IEEE.
Bing Qin is a professor and doctoral supervisor in the School of Computer Science and Technology at Harbin Institute of Technology, China. Her main research directions are natural language processing, information extraction, text mining, and sentiment analysis. She has published more than 80 papers in top international journals and conferences such as ACL, COLING, EMNLP, IEEE TKDE, and IEEE TASLP. She has led several projects funded by the National Natural Science Foundation of China, as well as key research and development projects of the National Science and Technology Ministry. She was awarded the first prize of the Qian Weichang Chinese Information Processing Science and Technology Award by the Chinese Information Processing Society and the second prize of the Heilongjiang Province Technical Invention Award.
Mingwen Wang is currently a professor of Jiangxi Normal University, China. He received the BS (1985) and MS (1988) degrees in mathematics from Jiangxi Normal University, China, and the PhD (2000) degree in computer science from Shanghai Jiaotong University, China. His research interests include information retrieval, natural language processing, and machine learning.
Jirong Wen is a professor and the dean of the School of Information, Renmin University of China (RUC), China. He is also the Executive Dean of the Gaoling School of Artificial Intelligence and the Director of the Beijing Key Laboratory of Big Data Research. He received his PhD degree in 1999 from the Institute of Computing Technology, Chinese Academy of Sciences, China. His main research interests include information retrieval, data mining, and machine learning. He worked at Microsoft Research Asia (MSRA) for 14 years and was formerly the group manager of the Web Search and Mining Group.
Jun Xu is a professor with the School of Information, Renmin University of China, China. His research interests include learning to rank and semantic matching in search. He has published more than 50 papers in international conferences (e.g., SIGIR, WSDM) and journals (e.g., ACM TOIS, IEEE TKDE). He has won the Best Paper Award in AIRS (2010), Best Paper Runner-up in CIKM (2017), and Test of Time Award Honorable Mention in SIGIR (2019). He is serving as SPC for SIGIR, WWW, AAAI, and ACML, editorial board member for JASIST, and associate editor for ACM TIST.
Min Zhang is a tenured associate professor in the Department of Computer Science & Technology, Tsinghua University, China. She specializes in Web search and recommendation, and user modeling. She is the vice director of the Intelligent Technology & Systems lab at the CS Dept. and the vice director of Intelligent Information Acquisition, AI Institute, Tsinghua University. She also serves as an ACM SIGIR Executive Committee Member, Associate Editor for the ACM Transactions on Information Systems (TOIS), Web Mining and Content Analysis Track Chair of TheWebConf 2020, Short Paper Chair of SIGIR 2018, and Program Chair of WSDM 2017, among other roles. She also holds 12 patents.
Peng Zhang is an associate professor and Vice Dean of the School of Computer Science and Technology, College of Computing and Intelligence, Tianjin University, China. He obtained his PhD at Robert Gordon University, United Kingdom in 2013. His research focuses on tensor space language models, explainable neural network design, and quantum theory inspired language models. He has published papers in refereed journals such as IEEE TNN, IEEE TKDE, ACM TIST, ACM TALIP, JASIST, and IP&M, and at top conferences such as NeurIPS, SIGIR, ACL, AAAI, IJCAI, CIKM, WWW, and EMNLP. He won the ECIR 2011 Best Poster Award and the SIGIR 2017 Best Paper Award Honorable Mention.
Qi Zhang received the PhD degree in computer science from Fudan University, China. He is a professor of computer science at Fudan University, China. His research interests include natural language processing and information retrieval.
Cite this article
Chen, Z., Cheng, X., Dong, S. et al. Information retrieval: a view from the Chinese IR community. Front. Comput. Sci. 15, 151601 (2021). https://doi.org/10.1007/s11704-020-9159-0
DOI: https://doi.org/10.1007/s11704-020-9159-0