Computer Science ›› 2016, Vol. 43 ›› Issue (9): 247-249. doi: 10.11896/j.issn.1002-137X.2016.09.049
胡博,蒋宗礼
HU Bo and JIANG Zong-li
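Abstract: Ranking document retrieval results and classifying text are core techniques for vertical search, personalized information retrieval, information filtering, and related problems. To improve retrieval system performance, an improvement to Lucene's basic ranking algorithm is proposed that combines position relevance with probabilistic ranking. Because both the positions at which query terms occur in a document and probabilistic ranking affect document relevance, the method modifies the scoring formula of Lucene's basic ranking algorithm using position-dependent query term weights and a document relevance probability obtained from the naive Bayes classification algorithm. Experiments show that the improved method effectively raises the accuracy of vertical search and gives users a better vertical search experience.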
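The abstract does not state the modified scoring formula, so the following is only a minimal sketch of how a position-dependent term weight and a naive Bayes relevance probability might be folded into a base score. Everything here is an assumption made for illustration: the function names, the linear position decay, the Laplace-smoothed multinomial naive Bayes model, and the multiplicative combination are hypothetical and are not the authors' formula or Lucene's API. `base_score` stands in for whatever score the underlying engine already produced.

```python
import math
from collections import Counter


def position_weight(tokens, term):
    """Hypothetical position weight: earlier occurrences count more.

    Returns the mean of (1 - index / length) over all occurrences of the
    term, or 0.0 if the term does not occur in the document.
    """
    n = len(tokens)
    positions = [i for i, t in enumerate(tokens) if t == term]
    if not positions:
        return 0.0
    return sum(1.0 - i / n for i in positions) / len(positions)


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, labeled_docs):
        # labeled_docs: iterable of (tokens, label) pairs
        self.class_counts = Counter()
        self.term_counts = {}
        self.vocab = set()
        for tokens, label in labeled_docs:
            self.class_counts[label] += 1
            self.term_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)

    def prob(self, tokens, label):
        """Posterior probability that the document belongs to `label`."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for c, n_c in self.class_counts.items():
            counts = self.term_counts[c]
            denom = sum(counts.values()) + len(self.vocab)
            lp = math.log(n_c / total_docs)
            for t in tokens:
                lp += math.log((counts[t] + 1) / denom)
            log_scores[c] = lp
        # Normalize in log space to avoid underflow on long documents.
        m = max(log_scores.values())
        exp = {c: math.exp(v - m) for c, v in log_scores.items()}
        return exp.get(label, 0.0) / sum(exp.values())


def combined_score(base_score, query_terms, doc_tokens, classifier, topic):
    """Assumed combination: boost the base score by both signals."""
    pos = sum(position_weight(doc_tokens, q) for q in query_terms)
    rel = classifier.prob(doc_tokens, topic)
    return base_score * (1.0 + pos) * (1.0 + rel)
```

Under these assumptions, two documents with the same base score are separated by where the query term appears and by how likely the classifier considers each document to belong to the target topic of the vertical search.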