Abstract
In recent years, many approaches to natural language generation have been based on deep recurrent neural networks. The most widely used sequence-to-sequence neural methods are word-based: as such, they require a pre-processing step called delexicalization (and a corresponding post-processing step, relexicalization) to handle uncommon or unknown words. This processing, however, produces models that depend on the chosen vocabulary and are not fully neural.
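To make the contrast concrete, the following is a minimal, hypothetical sketch of the delexicalization/relexicalization round trip that word-based systems typically rely on; the slot names and the example sentence are illustrative and not taken from the paper.

```python
# Minimal illustration of delexicalization/relexicalization as typically used by
# word-based data-to-text systems (slot names and example sentence are hypothetical).

def delexicalize(text: str, slots: dict) -> str:
    """Replace slot values with placeholder tokens so rare words leave the vocabulary."""
    for slot_name, value in slots.items():
        text = text.replace(value, f"<{slot_name}>")
    return text

def relexicalize(text: str, slots: dict) -> str:
    """Put the original values back into the generated template."""
    for slot_name, value in slots.items():
        text = text.replace(f"<{slot_name}>", value)
    return text

slots = {"name": "The Eagle", "food": "Japanese"}
template = delexicalize("The Eagle serves Japanese food.", slots)  # "<name> serves <food> food."
restored = relexicalize(template, slots)                           # original sentence recovered
```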
In this work, we present an end-to-end, attention-based sequence-to-sequence model that reads and generates text at the character level, removing the need for delexicalization, tokenization, and even lowercasing. Moreover, since characters are the common building blocks of every text, this choice supports a more general approach to text generation and makes it possible to exploit transfer learning during training. These capabilities stem from two key features: (i) the ability to alternate between the standard generation mechanism and a copy mechanism, which allows the model to reproduce input facts directly in its outputs, and (ii) an original training pipeline that further improves the quality of the generated texts.
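As a rough illustration of how a generation distribution and a copy distribution can be blended at the character level (in the spirit of pointer-generator models), here is a minimal PyTorch sketch; the function name, tensor shapes, and the way the copy probability is produced are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def mix_generation_and_copy(gen_logits, attn_weights, src_char_ids, p_copy):
    """Blend the decoder's generation distribution with a copy distribution
    obtained by scattering the attention weights onto the source characters
    (a pointer-generator-style sketch; names and shapes are illustrative).

    gen_logits:   (batch, vocab)   unnormalised scores over the character vocabulary
    attn_weights: (batch, src_len) attention weights over source positions (sum to 1)
    src_char_ids: (batch, src_len) character ids of the input sequence
    p_copy:       (batch, 1)       probability of copying at the current step
    """
    gen_dist = F.softmax(gen_logits, dim=-1)
    copy_dist = torch.zeros_like(gen_dist).scatter_add_(1, src_char_ids, attn_weights)
    return (1 - p_copy) * gen_dist + p_copy * copy_dist

# Toy usage: batch of 2, a 50-character vocabulary, source length 7.
batch, vocab, src_len = 2, 50, 7
mixture = mix_generation_and_copy(
    torch.randn(batch, vocab),
    F.softmax(torch.randn(batch, src_len), dim=-1),
    torch.randint(0, vocab, (batch, src_len)),
    torch.sigmoid(torch.randn(batch, 1)),
)
assert torch.allclose(mixture.sum(dim=-1), torch.ones(batch))  # still a valid distribution
```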
We also introduce E2E+, a new dataset designed to highlight the copying capabilities of character-based models; it is a modified version of the well-known E2E dataset used in the E2E Challenge. We evaluated our model on five broadly accepted metrics (including the widely used BLEU), showing that it achieves performance competitive with both character-based and word-based approaches.
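For reference, a corpus-level BLEU score over generated texts can be computed, for example, with NLTK as sketched below; the reference and hypothesis sentences are made up, and this is not necessarily the exact scoring setup used in the paper.

```python
# A hedged example of corpus-level BLEU scoring with NLTK; sentences are illustrative.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One hypothesis per input, each paired with a list of reference texts.
references = [["The Eagle is a Japanese restaurant in the city centre .".split()]]
hypotheses = ["The Eagle is a Japanese restaurant located in the city centre .".split()]

score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```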
Notes
1. https://en.wikipedia.org/wiki/List_of_adjectival_and_demonymic_forms_for_countries_and_nations, consulted on August 30, 2018.
2. Code and datasets are publicly available at https://github.com/marco-roberti/char-data-to-text-gen.
Acknowledgements
This activity was partially carried out in the context of the Visiting Professor Program of the Italian Istituto Nazionale di Alta Matematica (INdAM).
Cite this paper
Roberti, M., Bonetta, G., Cancelliere, R., Gallinari, P. (2020). Copy Mechanism and Tailored Training for Character-Based Data-to-Text Generation. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science, vol 11907. Springer, Cham. https://doi.org/10.1007/978-3-030-46147-8_39