{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T10:54:08Z","timestamp":1696848848547},"reference-count":39,"publisher":"Wiley","issue":"7","license":[{"start":{"date-parts":[[2013,5,22]],"date-time":"2013-05-22T00:00:00Z","timestamp":1369180800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J Am Soc Inf Sci Tec"],"published-print":{"date-parts":[[2013,7]]},"abstract":"Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre\u2010indexed with 4 different thesauri (AGROVOC [UN Food and Agriculture Organization], high\u2010energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and medical subject headings [MeSH]). We also introduce a thesaurus\u2010centric matching algorithm intended to improve the quality of candidate concepts. 
In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the 4 thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.","DOI":"10.1002\/asi.22853","type":"journal-article","created":{"date-parts":[[2013,5,22]],"date-time":"2013-05-22T15:24:52Z","timestamp":1369236292000},"page":"1330-1344","source":"Crossref","is-referenced-by-count":9,"title":["A random walk on an ontology: Using thesaurus structure for automatic subject indexing"],"prefix":"10.1002","volume":"64","author":[{"given":"Craig","family":"Willis","sequence":"first","affiliation":[{"name":"Graduate School of Library and Information Science University of Illinois at Urbana\u2010Champaign 501 E. 
Daniel Street Champaign IL 61820"}]},{"given":"Robert M.","family":"Losee","sequence":"additional","affiliation":[{"name":"School of Information and Library Science University of North Carolina 216 Lenoir Drive, 302 Manning Hall Chapel Hill NC"}]}],"member":"311","published-online":{"date-parts":[[2013,5,22]]},"reference":[{"issue":"1","key":"e_1_2_11_2_1","first-page":"368","article-title":"The NLM Indexing initiative's medical text indexer","volume":"11","author":"Aronson A.R.","year":"2004","journal-title":"Medinfo"},{"key":"e_1_2_11_3_1","volume-title":"Random walks in biology","author":"Berg H.C.","year":"1993"},{"key":"e_1_2_11_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1460833.1460865"},{"key":"e_1_2_11_5_1","first-page":"107","volume-title":"Seventh International World\u2010Wide Web Conference (WWW 1998)","author":"Brin S.","year":"1998"},{"key":"e_1_2_11_6_1","doi-asserted-by":"publisher","DOI":"10.1080\/13658810701626251"},{"key":"e_1_2_11_7_1","volume-title":"Theory of subject analysis: A sourcebook","author":"Chan L.M.","year":"1985"},{"key":"e_1_2_11_8_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb026605"},{"key":"e_1_2_11_9_1","volume-title":"The subject approach to information","author":"Foskett A.C.","year":"1996"},{"key":"e_1_2_11_10_1","volume-title":"Fundamentals of probability with stochastic processes","author":"Ghahramani S.","year":"2005"},{"key":"e_1_2_11_11_1","doi-asserted-by":"publisher","DOI":"10.1002\/bult.2011.1720370407"},{"issue":"8","key":"e_1_2_11_12_1","first-page":"22","article-title":"Automatic indexing","volume":"9","author":"Hlava M.M.K.","year":"2005","journal-title":"Information Outlook"},{"key":"e_1_2_11_13_1","volume-title":"AGRICOLA\u2013guide to subject indexing","author":"Hood 
M.W.","year":"1990"},{"key":"e_1_2_11_14_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(1999)50:8<661::AID-ASI4>3.0.CO;2-R"},{"key":"e_1_2_11_15_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(198705)38:3<184::AID-ASI7>3.0.CO;2-F"},{"key":"e_1_2_11_16_1","volume-title":"ISO 5963\u2013Documentation\u2013Methods for examining documents, determining their subjects, and selecting indexing terms","author":"International Organization for Standardization","year":"1985"},{"key":"e_1_2_11_17_1","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199234868.001.0001"},{"key":"e_1_2_11_18_1","doi-asserted-by":"crossref","DOI":"10.21236\/AD0696200","volume-title":"Machine\u2010aided indexing (Technical Report DDC\u2010TR\u201069\u20101)","author":"Klingbiel P.H.","year":"1969"},{"key":"e_1_2_11_19_1","volume-title":"Indexing and abstracting in theory and practice","author":"Lancaster F.W.","year":"2003"},{"key":"e_1_2_11_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2047296.2047298"},{"key":"e_1_2_11_21_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199701)48:1<55::AID-ASI7>3.0.CO;2-0"},{"issue":"4","key":"e_1_2_11_22_1","first-page":"245","article-title":"A performance model of the length and number of subject headings and index phrases","volume":"31","author":"Losee R.M.","year":"2004","journal-title":"Knowledge Organization"},{"key":"e_1_2_11_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2005.11.002"},{"key":"e_1_2_11_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2006.08.011"},{"key":"e_1_2_11_25_1","volume-title":"A random walk down Wall Street","author":"Malkiel B.G.","year":"1999"},{"key":"e_1_2_11_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/321075.321084"},{"key":"e_1_2_11_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2005.6"},{"key":"e_1_2_11_28_1","volume-title":"Human\u2010competitive automatic topic indexing","author":"Medelyan 
O.","year":"2009"},{"key":"e_1_2_11_29_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20790"},{"key":"e_1_2_11_30_1","first-page":"233","volume-title":"Proceedings of the ACM Conference on Information and Knowledge Management","author":"Mihalcea R.","year":"2007"},{"key":"e_1_2_11_31_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511976247"},{"key":"e_1_2_11_32_1","first-page":"404","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004)","author":"Mihalcea R.","year":"2004"},{"key":"e_1_2_11_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2008.12.007"},{"key":"e_1_2_11_34_1","doi-asserted-by":"publisher","DOI":"10.1080\/13658810701626236"},{"key":"e_1_2_11_35_1","doi-asserted-by":"publisher","DOI":"10.1038\/072294b0"},{"key":"e_1_2_11_36_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199808)49:10<888::AID-ASI5>3.0.CO;2-Z"},{"key":"e_1_2_11_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/505282.505283"},{"key":"e_1_2_11_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(94)90075-2"},{"key":"e_1_2_11_39_1","first-page":"563","volume-title":"Proceedings of the AFIPS \u201864","author":"Stevens M.E.","year":"1964"},{"key":"e_1_2_11_40_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(198707)38:4<269::AID-ASI8>3.0.CO;2-S"}],"container-title":["Journal of the American Society for Information Science and 
Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fasi.22853","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/asi.22853","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,6]],"date-time":"2023-10-06T12:45:28Z","timestamp":1696596328000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.22853"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,5,22]]},"references-count":39,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2013,7]]}},"alternative-id":["10.1002\/asi.22853"],"URL":"https:\/\/doi.org\/10.1002\/asi.22853","archive":["Portico"],"relation":{},"ISSN":["1532-2882","1532-2890"],"issn-type":[{"value":"1532-2882","type":"print"},{"value":"1532-2890","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,5,22]]}}}