{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T16:21:49Z","timestamp":1726762909458},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"Key information extraction (KIE) from document images requires understanding the contextual and spatial semantics of texts in two-dimensional (2D) space.\nMany recent studies try to solve the task by developing pre-trained language models focusing on combining visual features from document images with texts and their layout.\nOn the other hand, this paper tackles the problem by going back to the basic: effective combination of text and layout. \nSpecifically, we propose a pre-trained language model, named BROS (BERT Relying On Spatiality), that encodes relative positions of texts in 2D space and learns from unlabeled documents with area-masking strategy.\nWith this optimized training scheme for understanding texts in 2D space, BROS shows comparable or better performance compared to previous methods on four KIE benchmarks (FUNSD, SROIE*, CORD, and SciTSR) without relying on visual features.\nThis paper also reveals two real-world challenges in KIE tasks--(1) minimizing the error from incorrect text ordering and (2) efficient learning from fewer downstream examples--and demonstrates the superiority of BROS over previous methods.","DOI":"10.1609\/aaai.v36i10.21322","type":"journal-article","created":{"date-parts":[[2022,7,4]],"date-time":"2022-07-04T11:38:38Z","timestamp":1656934718000},"page":"10767-10775","source":"Crossref","is-referenced-by-count":63,"title":["BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents"],"prefix":"10.1609","volume":"36","author":[{"given":"Teakgyu","family":"Hong","sequence":"first","affiliation":[]},{"given":"DongHyun","family":"Kim","sequence":"additional","affiliation":[]},{"given":"Mingi","family":"Ji","sequence":"additional","affiliation":[]},{"given":"Wonseok","family":"Hwang","sequence":"additional","affiliation":[]},{"given":"Daehyun","family":"Nam","sequence":"additional","affiliation":[]},{"given":"Sungrae","family":"Park","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2022,6,28]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/21322\/21071","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/21322\/21071","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,4]],"date-time":"2022-07-04T11:38:38Z","timestamp":1656934718000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/21322"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,28]]},"references-count":0,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,6,30]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v36i10.21322","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2022,6,28]]}}}