Abstract
Automatic chart description is a challenging task. A chart contains many more relationships between objects than a typical computer vision scene, and its visual characteristics differ substantially from natural-scene images, so commonly used methods do not perform well. To tackle these problems, we propose a process consisting of three sub-tasks: (1) chart classification, (2) detection of a chart’s essential elements, and (3) generation of a text description.
Due to the lack of plot datasets dedicated to text generation, we prepared a new dataset, ChaTa+, which contains real-world figures. Additionally, we adapted the publicly available FigureQA and PlotQA datasets to our tasks and tested our method on them. We compared our results with those of the Adobe team [3], which we treated as a benchmark. Our models achieved comparable performance, even though we trained them on a more complex dataset (semi-synthetic PlotQA) and built a less resource-intensive infrastructure.
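To make the three-stage pipeline concrete, the following is a minimal, hypothetical sketch of the sub-tasks wired together with PyTorch/torchvision. The specific model choices (a ResNet-50 classifier, a Faster R-CNN element detector, and a simple template in place of a learned text generator), the label sets, and the confidence threshold are illustrative assumptions, not the configuration reported in the paper.

# Hypothetical sketch of the three-stage chart-description pipeline.
# All class names, label sets, and thresholds are assumptions for illustration.
import torch
import torchvision
from torchvision.models.detection import fasterrcnn_resnet50_fpn

CHART_TYPES = ["line", "bar", "pie", "scatter"]  # assumed chart-type labels
ELEMENT_NAMES = ["title", "x_axis_label", "y_axis_label", "legend", "data_series"]  # assumed elements

class ChartDescriptionPipeline:
    def __init__(self):
        # Stage 1: chart-type classifier (ResNet-50 backbone with a new head).
        self.classifier = torchvision.models.resnet50(weights=None)
        self.classifier.fc = torch.nn.Linear(self.classifier.fc.in_features, len(CHART_TYPES))
        # Stage 2: detector of essential chart elements (label 0 is background).
        self.detector = fasterrcnn_resnet50_fpn(weights=None, num_classes=len(ELEMENT_NAMES) + 1)
        self.classifier.eval()
        self.detector.eval()

    @torch.no_grad()
    def describe(self, image: torch.Tensor) -> str:
        # image: float tensor of shape (3, H, W) with values in [0, 1]
        chart_type = CHART_TYPES[self.classifier(image.unsqueeze(0)).argmax(dim=1).item()]
        detections = self.detector([image])[0]
        kept = [ELEMENT_NAMES[int(label) - 1]
                for label, score in zip(detections["labels"], detections["scores"])
                if score > 0.5]
        # Stage 3: a plain template stands in for the learned description generator.
        return f"A {chart_type} chart containing: {', '.join(kept) or 'no detected elements'}."

if __name__ == "__main__":
    pipeline = ChartDescriptionPipeline()
    print(pipeline.describe(torch.rand(3, 600, 800)))

In practice each stage would be trained separately on the annotated chart data; the sketch only shows how their outputs compose into a final description.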
Research was funded by the Centre for Priority Research Area Artificial Intelligence and Robotics of Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme (grant no 1820/27/Z01/POB2/2021).
References
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. In: ICLR (2015)
Behzadian, M., Otaghsara, S., Yazdani, M., Ignatius, J.: A state-of-the-art survey of TOPSIS applications. Expert Syst. Appl. 39, 13051–13069 (2012)
Chen, C., Zhang, R., et al.: Figure captioning with relation maps for reasoning. In: WACV (2020)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Jobin, K.V., Mondal, A., Jawahar, C.V.: DocFigure: a dataset for scientific document figure classification. In: ICDAR (2019)
Kafle, K., Price, B., Cohen, S., Kanan, C.: DVQA: understanding data visualizations via question answering. In: CVPR (2018)
Kahou, S.E., Michalski, V., Atkinson, A., Kadar, A., Trischler, A., Bengio, Y.: FigureQA: an annotated figure dataset for visual reasoning. In: ICLR (2018)
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV (2017)
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Methani, N., Ganguly, P., Khapra, M.M., Kumar, P.: PlotQA: reasoning over scientific plots. In: WACV (2020)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
Savva, M., Kong, N., Chhajta, A., Fei-Fei, L., Agrawala, M., Heer, J.: ReVision: automated classification, analysis and redesign of chart images. In: UIST (2011)
Siegel, N., Horvitz, Z., Levin, R., Divvala, S., Farhadi, A.: FigureSeer: parsing result-figures in research papers. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 664–680. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_41
Acknowledgements
We would like to thank Przemysław Biecek and Tomasz Stanisławek for their work on the original concept of the ChaTa dataset of charts and tables, including the annotation of their elements and the preliminary design of the annotation system. We are also grateful to the many students from the Faculty of Mathematics and Information Science who contributed to the annotation tool and to gathering the preliminary ChaTa dataset, which we subsequently modified.