{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T23:29:28Z","timestamp":1740180568794,"version":"3.37.3"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,4,1]],"date-time":"2021-04-01T00:00:00Z","timestamp":1617235200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,4,1]],"date-time":"2021-04-01T00:00:00Z","timestamp":1617235200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecur"],"published-print":{"date-parts":[[2021,12]]},"abstract":"Reading text in images automatically has become an attractive research topic in computer vision. Specifically, end-to-end spotting of scene text has attracted significant research attention, and relatively ideal accuracy has been achieved on several datasets. However, most of the existing works overlooked the semantic connection between the scene text instances, and had limitations in situations such as occlusion, blurring, and unseen characters, which result in some semantic information lost in the text regions. The relevance between texts generally lies in the scene images. From the perspective of cognitive psychology, humans often combine the nearby easy-to-recognize texts to infer the unidentifiable text. In this paper, we propose a novel graph-based method for intermediate semantic features enhancement, called Text Relation Networks. Specifically, we model the co-occurrence relationship of scene texts as a graph. The nodes in the graph represent the text instances in a scene image, and the corresponding semantic features are defined as representations of the nodes. 
The relative positions between text instances are measured as the weights of edges in the established graph. Then, a convolution operation is performed on the graph to aggregate semantic information and enhance the intermediate features corresponding to text instances. We evaluate the proposed method through comprehensive experiments on several mainstream benchmarks, and get highly competitive results. For example, on the , our method surpasses the previous top works by 2.1% on the word spotting task.","DOI":"10.1186\/s42400-021-00073-x","type":"journal-article","created":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T23:04:29Z","timestamp":1617231869000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["An end-to-end text spotter with text relation networks"],"prefix":"10.1186","volume":"4","author":[{"given":"Jianguo","family":"Jiang","sequence":"first","affiliation":[]},{"given":"Baole","family":"Wei","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4371-7864","authenticated-orcid":false,"given":"Min","family":"Yu","sequence":"additional","affiliation":[]},{"given":"Gang","family":"Li","sequence":"additional","affiliation":[]},{"given":"Boquan","family":"Li","sequence":"additional","affiliation":[]},{"given":"Chao","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Min","family":"Li","sequence":"additional","affiliation":[]},{"given":"Weiqing","family":"Huang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,4,1]]},"reference":[{"issue":"4","key":"73_CR1","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/MSP.2017.2693418","volume":"34","author":"MM Bronstein","year":"2017","unstructured":"Bronstein, MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond euclidean data. 
IEEE Signal Process Mag 34(4):18\u201342.","journal-title":"IEEE Signal Process Mag"},{"key":"73_CR2","doi-asserted-by":"publisher","first-page":"935","DOI":"10.1109\/ICDAR.2017.157","volume-title":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1","author":"CK Ch\u2019ng","year":"2017","unstructured":"Ch\u2019ng, CK, Chan CS (2017) Total-text: A comprehensive dataset for scene text detection and recognition In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, 935\u2013942.. IEEE, Los Alamitos."},{"key":"73_CR3","doi-asserted-by":"crossref","unstructured":"Cho, K, Van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. Comput ence abs\/1406.107. http:\/\/arxiv.org\/abs\/1406.1078.","DOI":"10.3115\/v1\/D14-1179"},{"key":"73_CR4","first-page":"3844","volume-title":"Advances in Neural Information Processing Systems","author":"M Defferrard","year":"2016","unstructured":"Defferrard, M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering In: Advances in Neural Information Processing Systems, 3844\u20133852.. Curran Associates Inc., Red Hook."},{"key":"73_CR5","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1109\/CVPR.2009.5206848","volume-title":"2009 IEEE Conference on Computer Vision and Pattern Recognition","author":"J Deng","year":"2009","unstructured":"Deng, J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248\u2013255.. 
IEEE Computer Society, Los Alamitos."},{"key":"73_CR6","first-page":"9076","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"W Feng","year":"2019","unstructured":"Feng, W, He W, Yin F, Zhang X-Y, Liu C-L (2019) Textdragon: An end-to-end framework for arbitrary shaped text spotting In: Proceedings of the IEEE International Conference on Computer Vision, 9076\u20139085.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR7","doi-asserted-by":"crossref","unstructured":"Gupta, A, Vedaldi A, Zisserman A (2016) Synthetic data for text localisation in natural images In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2315\u20132324.","DOI":"10.1109\/CVPR.2016.254"},{"key":"73_CR8","first-page":"2961","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"K He","year":"2017","unstructured":"He, K, Gkioxari G, Doll\u00e1r P, Girshick R (2017) Mask r-cnn In: Proceedings of the IEEE International Conference on Computer Vision, 2961\u20132969.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR9","first-page":"5020","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"T He","year":"2018","unstructured":"He, T, Tian Z, Huang W, Shen C, Qiao Y, Sun C (2018) An end-to-end textspotter with explicit alignment and attention In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5020\u20135029.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR10","first-page":"770","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"K He","year":"2016","unstructured":"He, K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770\u2013778.. 
IEEE Computer Society, Los Alamitos."},{"key":"73_CR11","doi-asserted-by":"publisher","first-page":"1156","DOI":"10.1109\/ICDAR.2015.7333942","volume-title":"2015 13th International Conference on Document Analysis and Recognition (ICDAR)","author":"D Karatzas","year":"2015","unstructured":"Karatzas, D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, Matas J, Neumann L, Chandrasekhar VR, Lu S, et al (2015) Icdar 2015 competition on robust reading In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 1156\u20131160.. IEEE, Los Alamitos."},{"key":"73_CR12","unstructured":"Kipf, TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907."},{"issue":"8","key":"73_CR13","doi-asserted-by":"publisher","first-page":"3676","DOI":"10.1109\/TIP.2018.2825107","volume":"27","author":"M Liao","year":"2018","unstructured":"Liao, M, Shi B, Bai X (2018) Textboxes++: A single-shot oriented scene text detector. IEEE Trans Image Process 27(8):3676\u20133690.","journal-title":"IEEE Trans Image Process"},{"key":"73_CR14","volume-title":"Thirty-First AAAI Conference on Artificial Intelligence","author":"M Liao","year":"2017","unstructured":"Liao, M, Shi B, Bai X, Wang X, Liu W (2017) Textboxes: A fast text detector with a single deep neural network In: Thirty-First AAAI Conference on Artificial Intelligence.. IEEE, New York City."},{"key":"73_CR15","first-page":"2117","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"T-Y Lin","year":"2017","unstructured":"Lin, T-Y, Doll\u00e1r P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2117\u20132125.. 
IEEE Computer Society, Los Alamitos."},{"key":"73_CR16","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1016\/j.patcog.2019.02.002","volume":"90","author":"Y Liu","year":"2019","unstructured":"Liu, Y, Jin L, Zhang S, Luo C, Zhang S (2019) Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit 90:337\u2013345. https:\/\/doi.org\/10.1016\/j.patcog.2019.02.002.","journal-title":"Pattern Recognit"},{"key":"73_CR17","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1016\/j.patcog.2019.02.002","volume":"90","author":"Y Liu","year":"2019","unstructured":"Liu, Y, Jin L, Zhang S, Luo C, Zhang S (2019) Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recog 90:337\u2013345.","journal-title":"Pattern Recog"},{"key":"73_CR18","first-page":"5676","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"X Liu","year":"2018","unstructured":"Liu, X, Liang D, Yan S, Chen D, Qiao Y, Yan J (2018) Fots: Fast oriented text spotting with a unified network In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5676\u20135685.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR19","doi-asserted-by":"publisher","unstructured":"Luong, T, Pham H, Manning CD (2015) Effective Approaches to Attention-based Neural Machine Translation. In: M\u00e0rquez L, Callison-Burch C, Su J, Pighin D, Marton Y (eds)Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, 1412\u20131421.. The Association for Computational Linguistics. 
https:\/\/doi.org\/10.18653\/v1\/d15-1166.","DOI":"10.18653\/v1\/d15-1166"},{"key":"73_CR20","doi-asserted-by":"crossref","unstructured":"Lyu, P, Liao M, Yao C, Wu W, Bai X (2018) Mask textspotter: An end-to-end trainable neural network for spotting text with arbitrary shapes In: Proceedings of the European Conference on Computer Vision (ECCV), 67\u201383.","DOI":"10.1007\/978-3-030-01264-9_5"},{"key":"73_CR21","first-page":"5115","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"F Monti","year":"2017","unstructured":"Monti, F, Boscaini D, Masci J, Rodola E, Svoboda J, Bronstein MM (2017) Geometric deep learning on graphs and manifolds using mixture model cnns In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5115\u20135124.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR22","first-page":"401","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV)","author":"S Qi","year":"2018","unstructured":"Qi, S, Wang W, Jia B, Shen J, Zhu S-C (2018) Learning human-object interactions by graph parsing neural networks In: Proceedings of the European Conference on Computer Vision (ECCV), 401\u2013417.. Springer, Cham."},{"issue":"7","key":"73_CR23","first-page":"11899","volume":"34","author":"L Qiao","year":"2020","unstructured":"Qiao, L, Tang S, Cheng Z, Xu Y, Wu F (2020) Text perceptron: Towards end-to-end arbitrary-shaped text spotting. Proc AAAI Conf Artif Intell 34(7):11899\u201311907.","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"73_CR24","first-page":"4704","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"S Qin","year":"2019","unstructured":"Qin, S, Bissacco A, Raptis M, Fujii Y, Xiao Y (2019) Towards unconstrained end-to-end text spotting In: Proceedings of the IEEE International Conference on Computer Vision, 4704\u20134714.. 
IEEE Computer Society, Los Alamitos."},{"key":"73_CR25","first-page":"91","volume-title":"Advances in Neural Information Processing Systems","author":"S Ren","year":"2015","unstructured":"Ren, S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks In: Advances in Neural Information Processing Systems, 91\u201399.. Curran Associates Inc., Red Hook."},{"issue":"11","key":"73_CR26","doi-asserted-by":"publisher","first-page":"2298","DOI":"10.1109\/TPAMI.2016.2646371","volume":"39","author":"B Shi","year":"2016","unstructured":"Shi, B, Bai X, Yao C (2016) An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans Pattern Anal Mach Intell 39(11):2298\u20132304.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"9","key":"73_CR27","doi-asserted-by":"publisher","first-page":"2035","DOI":"10.1109\/TPAMI.2018.2848939","volume":"41","author":"B Shi","year":"2018","unstructured":"Shi, B, Yang M, Wang X, Lyu P, Yao C, Bai X (2018) Aster: An attentional scene text recognizer with flexible rectification. IEEE Trans Pattern Anal Mach Intell 41(9):2035\u20132048.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"73_CR28","first-page":"83","volume-title":"Asian Conference on Computer Vision","author":"Y Sun","year":"2018","unstructured":"Sun, Y, Zhang C, Huang Z, Liu J, Han J, Ding E (2018) Textnet: Irregular text reading from images with an end-to-end trainable network In: Asian Conference on Computer Vision, 83\u201399.. Springer, Cham."},{"key":"73_CR29","first-page":"1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"D Teney","year":"2017","unstructured":"Teney, D, Liu L, van Den Hengel A (2017) Graph-structured representations for visual question answering In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1\u20139.. 
IEEE Computer Society, Los Alamitos."},{"key":"73_CR30","unstructured":"Veli\u010dkovi\u0107, P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903."},{"issue":"7","key":"73_CR31","first-page":"12160","volume":"34","author":"H Wang","year":"2020","unstructured":"Wang, H, Lu P, Zhang H, Yang M, Liu W (2020) All you need is boundary: Toward arbitrary-shaped text spotting. Proc AAAI Conf Artif Intell 34(7):12160\u201312167.","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"73_CR32","first-page":"9126","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"L Xing","year":"2019","unstructured":"Xing, L, Tian Z, Huang W, Scott MR (2019) Convolutional character networks In: Proceedings of the IEEE International Conference on Computer Vision, 9126\u20139136.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR33","first-page":"9298","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"H Xu","year":"2019","unstructured":"Xu, H, Jiang C, Liang X, Li Z (2019) Spatial-aware graph relation network for large-scale object detection In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 9298\u20139307.. IEEE Computer Society, Los Alamitos."},{"key":"73_CR34","first-page":"5551","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"X Zhou","year":"2017","unstructured":"Zhou, X, Yao C, Wen H, Wang Y, Zhou S, He W, Liang J (2017) East: an efficient and accurate scene text detector In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5551\u20135560.. 
IEEE Computer Society, Los Alamitos."}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-021-00073-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s42400-021-00073-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-021-00073-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T23:06:12Z","timestamp":1617231972000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-021-00073-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,1]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["73"],"URL":"https:\/\/doi.org\/10.1186\/s42400-021-00073-x","relation":{},"ISSN":["2523-3246"],"issn-type":[{"type":"electronic","value":"2523-3246"}],"subject":[],"published":{"date-parts":[[2021,4,1]]},"assertion":[{"value":"2 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 April 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"7"}}