{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,30]],"date-time":"2024-10-30T19:36:03Z","timestamp":1730316963470,"version":"3.28.0"},"publisher-location":"New York, NY, USA","reference-count":17,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2013,9,10]]},"DOI":"10.1145\/2494266.2494271","type":"proceedings-article","created":{"date-parts":[[2013,9,3]],"date-time":"2013-09-03T11:57:17Z","timestamp":1378209437000},"page":"177-180","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":59,"title":["PDFX"],"prefix":"10.1145","author":[{"given":"Alexandru","family":"Constantin","sequence":"first","affiliation":[{"name":"The University of Manchester, Manchester, United Kingdom"}]},{"given":"Steve","family":"Pettifer","sequence":"additional","affiliation":[{"name":"The University of Manchester, Manchester, United Kingdom"}]},{"given":"Andrei","family":"Voronkov","sequence":"additional","affiliation":[{"name":"The University of Manchester, Manchester, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2013,9,10]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"The arxiv e-print database - http:\/\/arxiv.org. The arxiv e-print database - http:\/\/arxiv.org."},{"key":"e_1_3_2_1_2_1","unstructured":"The poppler pdf library http:\/\/poppler.freedesktop.org\/. The poppler pdf library http:\/\/poppler.freedesktop.org\/."},{"key":"e_1_3_2_1_3_1","unstructured":"The pubmed central archive - http:\/\/www.ncbi.nlm.nih.gov\/pmc\/. The pubmed central archive - http:\/\/www.ncbi.nlm.nih.gov\/pmc\/."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq383"},{"volume-title":"MSc - The University of Oslo","year":"2011","author":"Berg \u00d8yvind Raddum","key":"e_1_3_2_1_5_1","unstructured":"\u00d8yvind Raddum Berg . High precision text extraction from pdf documents . MSc - The University of Oslo , 2011 . \u00d8yvind Raddum Berg. High precision text extraction from pdf documents. MSc - The University of Oslo, 2011."},{"key":"e_1_3_2_1_6_1","first-page":"98","volume-title":"Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries","author":"Berg \u00d8yvind Raddum","year":"2012","unstructured":"\u00d8yvind Raddum Berg , Stephan Oepen , and Jonathon Read . Towards high-quality text stream extraction from pdf: technical background to the acl 2012 contributed task . In Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries , pages 98 -- 103 . Assoc. for Computational Linguistics , 2012 . \u00d8yvind Raddum Berg, Stephan Oepen, and Jonathon Read. Towards high-quality text stream extraction from pdf: technical background to the acl 2012 contributed task. In Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries, pages 98--103. Assoc. for Computational Linguistics, 2012."},{"volume-title":"Scopus database: a review. Biomedical digital libraries, 3(1):1","year":"2006","author":"Burnham Judy F","key":"e_1_3_2_1_7_1","unstructured":"Judy F Burnham . Scopus database: a review. Biomedical digital libraries, 3(1):1 , 2006 . Judy F Burnham. Scopus database: a review. Biomedical digital libraries, 3(1):1, 2006."},{"key":"e_1_3_2_1_8_1","first-page":"661","volume-title":"Proceedings of LREC","volume":"2008","author":"Isaac G Council","year":"2008","unstructured":"Isaac G Council l, C Lee Giles , and Min-Yen Kan . Parscit : An open-source crf reference string parsing package . In Proceedings of LREC , volume 2008 , pages 661 -- 667 . European Language Resources Association (ELRA) , 2008 . Isaac G Councill, C Lee Giles, and Min-Yen Kan. Parscit: An open-source crf reference string parsing package. In Proceedings of LREC, volume 2008, pages 661--667. European Language Resources Association (ELRA), 2008."},{"key":"e_1_3_2_1_9_1","unstructured":"Herv\u00e9 D\u00e9jean. The pdf2xml project - http:\/\/sourceforge.net\/projects\/pdf2xml\/. Herv\u00e9 D\u00e9jean. The pdf2xml project - http:\/\/sourceforge.net\/projects\/pdf2xml\/."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/11669487_12"},{"key":"e_1_3_2_1_11_1","unstructured":"CrossRef Labs. The pdfextract project - https:\/\/github.com\/CrossRef\/pdfextract. CrossRef Labs. The pdfextract project - https:\/\/github.com\/CrossRef\/pdfextract."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687577"},{"volume-title":"Thuy Dung Nguyen, and Min-Yen Kan. Logical structure recovery in scholarly articles with rich document features","year":"2011","author":"Luong Minh-Thang","key":"e_1_3_2_1_13_1","unstructured":"Minh-Thang Luong , Thuy Dung Nguyen, and Min-Yen Kan. Logical structure recovery in scholarly articles with rich document features . J. of Digital Library Systems . Forthcoming, 2011 . Minh-Thang Luong, Thuy Dung Nguyen, and Min-Yen Kan. Logical structure recovery in scholarly articles with rich document features. J. of Digital Library Systems. Forthcoming, 2011."},{"volume-title":"Layout-aware text extraction from full-text pdf of scientific articles. Source code for biology and medicine, 7(1):1--10","year":"2012","author":"Ramakrishnan C.","key":"e_1_3_2_1_14_1","unstructured":"C. Ramakrishnan , A. Patnia , E. Hovy , Layout-aware text extraction from full-text pdf of scientific articles. Source code for biology and medicine, 7(1):1--10 , 2012 . C. Ramakrishnan, A. Patnia, E. Hovy, et al. Layout-aware text extraction from full-text pdf of scientific articles. Source code for biology and medicine, 7(1):1--10, 2012."},{"volume-title":"Ratcliff and David Metzener. Pattern matching: The gestalt approach. Dr. Dobb's Journal, page 46","year":"1988","author":"John","key":"e_1_3_2_1_15_1","unstructured":"John W. Ratcliff and David Metzener. Pattern matching: The gestalt approach. Dr. Dobb's Journal, page 46 , 1988 . John W. Ratcliff and David Metzener. Pattern matching: The gestalt approach. Dr. Dobb's Journal, page 46, 1988."},{"key":"e_1_3_2_1_16_1","unstructured":"Matthew Talbert. Mobipocket.com pdf2xml - https:\/\/launchpad.net\/pdf2xml. Matthew Talbert. Mobipocket.com pdf2xml - https:\/\/launchpad.net\/pdf2xml."},{"key":"e_1_3_2_1_17_1","unstructured":"Lu Wang. The pdf2htmlex project - http:\/\/coolwanglu.github.io\/pdf2htmlEX\/. Lu Wang. The pdf2htmlex project - http:\/\/coolwanglu.github.io\/pdf2htmlEX\/."}],"event":{"name":"DocEng '13: ACM Symposium on Document Engineering 2013","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"],"location":"Florence Italy","acronym":"DocEng '13"},"container-title":["Proceedings of the 2013 ACM symposium on Document engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2494266.2494271","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,8]],"date-time":"2023-01-08T19:50:02Z","timestamp":1673207402000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2494266.2494271"}},"subtitle":["fully-automated PDF-to-XML conversion of scientific literature"],"short-title":[],"issued":{"date-parts":[[2013,9,10]]},"references-count":17,"alternative-id":["10.1145\/2494266.2494271","10.1145\/2494266"],"URL":"https:\/\/doi.org\/10.1145\/2494266.2494271","relation":{},"subject":[],"published":{"date-parts":[[2013,9,10]]},"assertion":[{"value":"2013-09-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}