SmartEDU: Accelerating Slide Deck Production with Natural Language Processing

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2023)

Abstract

Slide decks are a common medium for presenting a topic. To reduce the time required for their preparation, we present SmartEDU, a platform for drafting slides from a textual document, and the research that led to its development. Drafts are PowerPoint files generated in three steps: pre-processing, for acquiring or discovering section titles; summarization, for compressing the contents of each section; and slide composition, for organizing the summaries into slides. The resulting file may be further edited by the user. Several summarization methods were tested on public datasets of presentations and on Wikipedia articles. Based on automatic evaluation measures and collected human opinions, we conclude that a DistilBART model is preferred to unsupervised summarization, especially when it comes to overall draft quality.
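
To make the three steps concrete, the following is a minimal sketch of such a pipeline, not the SmartEDU implementation. All function and class names are hypothetical. It assumes section titles are marked by lines starting with "# ", it uses a naive leading-sentences summarizer as a stand-in for the summarization models the paper evaluates, and it returns in-memory slide objects instead of writing a PowerPoint file.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Slide:
    """One draft slide: a title and a few bullet points (hypothetical structure)."""
    title: str
    bullets: list[str] = field(default_factory=list)


def split_sections(text: str) -> list[tuple[str, str]]:
    """Step 1, pre-processing: acquire section titles.

    Assumption: a section title is a line starting with '# '.
    Text before the first title is filed under 'Introduction'.
    """
    sections: list[tuple[str, str]] = []
    title, body = "Introduction", []
    for line in text.splitlines():
        if line.startswith("# "):
            if body:
                sections.append((title, " ".join(body)))
            title, body = line[2:].strip(), []
        elif line.strip():
            body.append(line.strip())
    if body:
        sections.append((title, " ".join(body)))
    return sections


def summarize(body: str, max_sentences: int = 2) -> list[str]:
    """Step 2, summarization: compress one section.

    Stand-in only: keeps the first sentences of the section.
    """
    sentences = re.split(r"(?<=[.!?])\s+", body.strip())
    return [s for s in sentences if s][:max_sentences]


def compose_slides(summaries: list[tuple[str, list[str]]],
                   bullets_per_slide: int = 3) -> list[Slide]:
    """Step 3, slide composition: spread each summary over one or more slides."""
    slides: list[Slide] = []
    for title, bullets in summaries:
        for start in range(0, len(bullets), bullets_per_slide):
            slides.append(Slide(title, bullets[start:start + bullets_per_slide]))
    return slides


def draft_deck(text: str, max_sentences: int = 2,
               bullets_per_slide: int = 3) -> list[Slide]:
    """Chain the three steps into a draft deck."""
    sections = split_sections(text)
    summaries = [(title, summarize(body, max_sentences)) for title, body in sections]
    return compose_slides(summaries, bullets_per_slide)


if __name__ == "__main__":
    document = (
        "# Background\n"
        "Slides are common. They take time. Tools can help.\n"
        "# Method\n"
        "We summarize each section.\n"
    )
    for slide in draft_deck(document):
        print(slide.title, slide.bullets)
```

Keeping the summarizer behind a single function means a different method, supervised or unsupervised, could be swapped in without touching the other two steps.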

Notes

  1. https://www.microsoft.com/microsoft-365/powerpoint.
  2. https://prezi.com.
  3. https://mindflow.pt/.
  4. https://github.com/kermitt2/grobid.
  5. https://github.com/pymupdf/PyMuPDF.
  6. https://python-pillow.org/.
  7. https://spacy.io/.
  8. https://github.com/kermitt2/grobid.
  9. https://huggingface.co/sshleifer/distilbart-cnn-12-6.
  10. https://huggingface.co/google/pegasus-cnn_dailymail.
  11. https://huggingface.co/csebuetnlp/mT5_multilingual_XLSum.
  12. For English, roberta-large; for other languages, bert-base-multilingual-cased.
  13. Carnation_Revolution, Cristiano_Ronaldo, Coimbra, Europe, Luís_de_Camões, Programming_language, Pythagorean_theorem, University_of_Coimbra, Queen_(band), Star_Wars.

Acknowledgements

This work was funded by: project SmartEDU (CENTRO-01-0247-FEDER-072620), co-financed by FEDER, through PT2020, and by the Regional Operational Programme Centro 2020; and through the FCT – Foundation for Science and Technology, I.P., within the scope of the project CISUC – UID/CEC/00326/2020 and by the European Social Fund, through the Regional Operational Program Centro 2020.

Author information

Corresponding author

Correspondence to Maria João Costa.

A Appendix: Example Drafts

Fig. 3. Presentation draft generated with DistilBART for the Wikipedia article “SWOT analysis” (https://en.wikipedia.org/wiki/SWOT_analysis), as of June 2022, considering the section titles. Read from left to right.

Fig. 4. Presentation draft generated with DistilBART for the Wikipedia article “SWOT analysis” (https://en.wikipedia.org/wiki/SWOT_analysis), as of June 2022, with sections discovered automatically and generated titles. Read from left to right.

Fig. 5. Slides generated with TextRank for the article “Europe” in the English Wikipedia (https://en.wikipedia.org/wiki/Europe), as of June 19, 2022. Read from left to right.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Costa, M.J., Amaro, H., Gonçalo Oliveira, H. (2023). SmartEDU: Accelerating Slide Deck Production with Natural Language Processing. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_8

  • DOI: https://doi.org/10.1007/978-3-031-35320-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35319-2

  • Online ISBN: 978-3-031-35320-8

  • eBook Packages: Computer Science, Computer Science (R0)
