{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,2]],"date-time":"2022-04-02T16:02:40Z","timestamp":1648915360948},"reference-count":24,"publisher":"World Scientific Pub Co Pte Ltd","issue":"11n12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Soft. Eng. Knowl. Eng."],"published-print":{"date-parts":[[2021,12]]},"abstract":" Formula retrieval is an important research topic in Mathematical Information Retrieval\u00a0(MIR). Most studies have focused on formula comparison to determine the similarity between mathematical documents. However, two similar formulae may appear in entirely different knowledge domains and have different meanings. Based on the N-ary Tree-based Formula Embedding Model\u00a0(NTFEM, our previous work in [Y. Dai, L. Chen, and Z. Zhang, An N-ary tree-based model for similarity evaluation on mathematical formulae, in Proc. 2020 IEEE Int. Conf. Systems, Man, and Cybernetics, 2020, pp. 2578\u20132584]), we introduce a new hybrid retrieval model, NTFEM-K, which combines formulae with their surrounding keywords for more accurate retrieval. Using keyword extraction, we extract keywords from the context of a formula to supplement its semantic information. We then obtain vector representations of the keywords with the FastText N-gram embedding model and vector representations of the formulae with NTFEM. Finally, documents are ranked by keyword similarity, and the ranking is then refined by formula similarity. For performance evaluation, NTFEM-K is compared not only with NTFEM but also with hybrid retrieval models that combine formulae with long text and with hybrid retrieval models that combine formulae with keywords extracted by other keyword extraction algorithms. Experimental results show that the accuracy of top-10 results of NTFEM-K is at least 20% higher than that of NTFEM and can be 50% in some specific topics. 
","DOI":"10.1142\/s0218194021400131","type":"journal-article","created":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T09:01:56Z","timestamp":1643101316000},"page":"1583-1602","source":"Crossref","is-referenced-by-count":0,"title":["A Hybrid Model Combining Formulae with Keywords for Mathematical Information Retrieval"],"prefix":"10.1142","volume":"31","author":[{"given":"Yuqi","family":"Shen","sequence":"first","affiliation":[{"name":"Engineering Research Center of Software\/Hardware Co-design Technology and Application, East China Normal University, Shanghai, P. R. China"},{"name":"Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, P. R. China"}]},{"given":"Cheng","family":"Chen","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, P. R. China"}]},{"given":"Yifan","family":"Dai","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, P. R. China"}]},{"given":"Jinfang","family":"Cai","sequence":"additional","affiliation":[{"name":"Institute of Vocational & Adult Education, East China Normal University, Shanghai, P. R. China"}]},{"given":"Liangyu","family":"Chen","sequence":"additional","affiliation":[{"name":"Engineering Research Center of Software\/Hardware Co-design Technology and Application, East China Normal University, Shanghai, P. R. China"},{"name":"Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, P. R. 
China"}]}],"member":"219","published-online":{"date-parts":[[2022,1,24]]},"reference":[{"key":"S0218194021400131BIB001","doi-asserted-by":"publisher","DOI":"10.1109\/SMC42975.2020.9283495"},{"key":"S0218194021400131BIB003","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33017394"},{"key":"S0218194021400131BIB004","doi-asserted-by":"publisher","DOI":"10.1002\/9780470689646.ch1"},{"key":"S0218194021400131BIB005","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"S0218194021400131BIB006","first-page":"1188","volume-title":"Proc. Int. Conf. Machine Learning","author":"Le Q.","year":"2014"},{"key":"S0218194021400131BIB008","first-page":"338","volume-title":"Project of NII Testbeds and Community for Information Access Research","author":"Davila K.","year":"2016"},{"key":"S0218194021400131BIB009","first-page":"1","volume-title":"Proc. 5th Int. Conf. Learning Representations","author":"Arora S.","year":"2017"},{"key":"S0218194021400131BIB010","first-page":"3111","volume-title":"Proc. Advances in Neural Information Processing Systems","author":"Mikolov T.","year":"2013"},{"key":"S0218194021400131BIB012","first-page":"7","volume-title":"Proc. Document Recognition and Retrieval XXII","author":"Stalnaker D.","year":"2015"},{"key":"S0218194021400131BIB013","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-73086-6_27"},{"key":"S0218194021400131BIB014","first-page":"127","volume-title":"Proc. 11th NTCIR Conf. Evaluation of Information Access Technologies","author":"Ru\u017eicka M.","year":"2014"},{"key":"S0218194021400131BIB015","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022967814992"},{"key":"S0218194021400131BIB016","first-page":"23","volume-title":"Int. 
Workshop on Multi-Disciplinary Trends in Artificial Intelligence","author":"Kumar P.","year":"2012"},{"key":"S0218194021400131BIB018","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2911512"},{"key":"S0218194021400131BIB019","first-page":"346","volume-title":"Project of NII Testbeds and Community for Information Access Research","author":"Thanda A.","year":"2016"},{"key":"S0218194021400131BIB020","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"S0218194021400131BIB021","first-page":"1130","volume-title":"Proc. Int. Joint Conf. Artificial Intelligence","author":"Soucy P.","year":"2005"},{"key":"S0218194021400131BIB022","first-page":"1367","volume-title":"Proc. 2016 Conf. North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Hill F.","year":"2016"},{"key":"S0218194021400131BIB023","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-59140-441-5.ch008"},{"key":"S0218194021400131BIB024","first-page":"434","volume-title":"Proc. 3th Int. Joint Conf. Artificial Intelligence","author":"Turney P. D.","year":"2003"},{"key":"S0218194021400131BIB025","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-77094-7_41"},{"key":"S0218194021400131BIB026","doi-asserted-by":"publisher","DOI":"10.3115\/1699648.1699678"},{"key":"S0218194021400131BIB027","first-page":"404","volume-title":"Proc. 2004 Conf. 
Empirical Methods in Natural Language Processing","author":"Mihalcea R.","year":"2004"},{"key":"S0218194021400131BIB029","doi-asserted-by":"publisher","DOI":"10.1016\/S0031-3203(00)00162-X"}],"container-title":["International Journal of Software Engineering and Knowledge Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218194021400131","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T09:02:20Z","timestamp":1643101340000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218194021400131"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12]]},"references-count":24,"journal-issue":{"issue":"11n12","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["10.1142\/S0218194021400131"],"URL":"https:\/\/doi.org\/10.1142\/s0218194021400131","relation":{},"ISSN":["0218-1940","1793-6403"],"issn-type":[{"value":"0218-1940","type":"print"},{"value":"1793-6403","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12]]}}}