{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,29]],"date-time":"2023-01-29T15:30:57Z","timestamp":1675006257664},"reference-count":24,"publisher":"IGI Global","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,7]]},"abstract":"With the rapid growth of digital information and user needs, it has become imperative to quickly retrieve relevant domain- or topic-specific documents for a user query. A focused crawler plays a vital role in digital libraries by crawling the web so that researchers can easily explore the domain-specific search results and find the desired content for a query. In this article, a focused crawler is proposed for online digital library search engines; it uses the meta-data of the query to retrieve the corresponding document or other relevant but missing information (e.g., paid publications from ACM, IEEE, etc.) for the user query. Different query strategies are built from the meta-data and submitted to different search engines in order to find relevant information that is missing. 
The results returned by these search engines are filtered and then used to crawl the Web further.","DOI":"10.4018\/ijirr.2019070103","type":"journal-article","created":{"date-parts":[[2019,5,28]],"date-time":"2019-05-28T10:53:44Z","timestamp":1559040824000},"page":"23-47","source":"Crossref","is-referenced-by-count":4,"title":["An Approach for Focused Crawler to Harvest Digital Academic Documents in Online Digital Libraries"],"prefix":"10.4018","volume":"9","author":[{"given":"Sumita","family":"Gupta","sequence":"first","affiliation":[{"name":"YMCAUST, Faridabad, India"}]},{"given":"Neelam","family":"Duhan","sequence":"additional","affiliation":[{"name":"J C Bose University of Science and Technology YMCA, Faridabad, India"}]},{"given":"Poonam","family":"Bansal","sequence":"additional","affiliation":[{"name":"MSIT, Delhi, India"}]}],"member":"2432","reference":[{"key":"IJIRR.2019070103-0","doi-asserted-by":"publisher","DOI":"10.1145\/1378889.1378891"},{"key":"IJIRR.2019070103-1","article-title":"Efficient crawling through URL ordering.","author":"J.Cho","year":"1998","journal-title":"Proceedings of the Seventh World-Wide Web Conference"},{"key":"IJIRR.2019070103-2","doi-asserted-by":"publisher","DOI":"10.1016\/S1389-1286(99)00052-3"},{"key":"IJIRR.2019070103-3","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45747-X_7"},{"key":"IJIRR.2019070103-4","doi-asserted-by":"publisher","DOI":"10.5120\/966-1343"},{"key":"IJIRR.2019070103-5","doi-asserted-by":"publisher","DOI":"10.1145\/1998076.1998099"},{"key":"IJIRR.2019070103-6","unstructured":"Gollapalli, S. D., Mitra, P., & Giles, C. L. (2011). Learning to Rank Homepages For Researcher-Name Queries. In EOS, SIGIR 2011 Workshop, Beijing, China (pp. 53-58)."},{"key":"IJIRR.2019070103-7","unstructured":"Gollapalli, S. D., Patel, K., & Caragea, C. (2016). A Search\/Crawl Framework for Automatically Acquiring Scientific Documents. 
arXiv:1604.05005"},{"key":"IJIRR.2019070103-8","doi-asserted-by":"publisher","DOI":"10.1007\/11871637_53"},{"key":"IJIRR.2019070103-9","doi-asserted-by":"crossref","unstructured":"Hati, D., & Kumar, A., & Mishra, L. (2010). Unvisited URL Relevancy Calculation in Focused Crawling Based on Na\u00efve Bayesian Classification. International Journal of Computer Applications, 3(9).","DOI":"10.5120\/767-1074"},{"key":"IJIRR.2019070103-10","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1145\/501516.501556","article-title":"Finding scientific papers with homepage search and MOPS.","author":"G.Hoff","year":"2001","journal-title":"Proceedings of the 19th annual international conference on Computer documentation"},{"key":"IJIRR.2019070103-11","doi-asserted-by":"publisher","DOI":"10.1145\/996350.996357"},{"key":"IJIRR.2019070103-12","unstructured":"Li, H., Councill, I. G., Bolelli, L., Zhou, D., Song, Y., Lee, W. C. \u2026 Giles, C. L. (2006). CiteSeer X- A Scalable Autonomous Scientific Digital Library. 
In Proceedings of the 1st international conference on Scalable information systems, Hong Kong."},{"key":"IJIRR.2019070103-13","doi-asserted-by":"publisher","DOI":"10.1108\/14666180010734030"},{"key":"IJIRR.2019070103-14","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30544-6_44"},{"issue":"1","key":"IJIRR.2019070103-15","article-title":"Effective Focused Crawling Based on Content and Link Structure Analysis.","volume":"2","author":"A.Pal","year":"2009","journal-title":"International Journal of Computer Science and Information Security"},{"issue":"2","key":"IJIRR.2019070103-16","first-page":"26","article-title":"Focused Crawling for Educational Materials from the Web.","volume":"1","author":"K.Premlatha","year":"2011","journal-title":"International Journal of Computer Science & Informatics"},{"key":"IJIRR.2019070103-17","doi-asserted-by":"publisher","DOI":"10.1145\/996350.996383"},{"key":"IJIRR.2019070103-18","first-page":"5","article-title":"A Component-Based Digital Library Service for Finding Missing Documents.","author":"R. L. T.Santos","year":"2007","journal-title":"Proceedings of the Brazilian Symposium on Databases"},{"key":"IJIRR.2019070103-19","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2008.12.006"},{"key":"IJIRR.2019070103-20","doi-asserted-by":"publisher","DOI":"10.1109\/IITSI.2010.30"},{"key":"IJIRR.2019070103-21","doi-asserted-by":"crossref","unstructured":"Wu, J., Teregowda, P., Ram, J. P. F., Mitra, P., Zheng, S., & Giles, C. L. (2012). The Evolution of a Crawling Strategy for an Academic Document Search Engine: Whitelists and Blacklists. In WebSci 2012, Evanston, IL, June 22\u201324.","DOI":"10.1145\/2380718.2380762"},{"key":"IJIRR.2019070103-22","doi-asserted-by":"crossref","unstructured":"Zhuang, Z., Wagle, R., & Giles, C. L. (2005). What\u2019s There and What\u2019s Not? Focused Crawling for Missing Documents in Digital Libraries. In Proceedings of fifth ACM\/IEEE \u2013CS joint conference on digital libraries, Denver, CO (pp. 
301-310).","DOI":"10.1145\/1065385.1065455"},{"key":"IJIRR.2019070103-23","doi-asserted-by":"publisher","DOI":"10.1109\/ICEBE.2008.61"}],"container-title":["International Journal of Information Retrieval Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=230325","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T00:14:57Z","timestamp":1651796097000},"score":1,"resource":{"primary":{"URL":"http:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJIRR.2019070103"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2019,7]]},"references-count":24,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.4018\/ijirr.2019070103","relation":{},"ISSN":["2155-6377","2155-6385"],"issn-type":[{"value":"2155-6377","type":"print"},{"value":"2155-6385","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,7]]}}}