On the Evaluation of Generated Stylised Lyrics Using Deep Generative Models: A Preliminary Study

  • Conference paper
Intelligent Human Computer Interaction (IHCI 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13741))

Abstract

Deep generative models such as the GPT family have exhibited super-human performance in natural language generation. However, evaluation of the generated text lacks automated solutions and mostly relies on manual experiments involving human raters. This paper explores the possibility of evaluating generated content computationally, in an automated way. In particular, we experimented with stylised lyrics, which require careful consideration in evaluation because lyrics generation takes into account the individual characteristics of artists. To this end, we first generated lyrics by fine-tuning KoGPT-2 on K-Pop songs in three different genres, so as to effectively transfer each artist's persona and style. We then evaluated the stylised lyrics with another deep generative model, BERT, measuring the similarity between the generated lyrics and those in the training data, both within and between artists. The results showed that similarity was highest between the generated and original lyrics of the same artist and lower between different artists, a pattern that was not captured by a typical evaluation metric such as BLEU. Although this is a preliminary approach, it shows that generated content infused with individual characteristics could be evaluated automatically, without human effort.
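
As a rough illustration of the within-artist versus between-artist comparison described in the abstract, the following Python sketch averages pairwise cosine similarities between generated and original lines for every pair of artists. It is not the authors' implementation: `embed` is a toy character-bigram stand-in for a BERT embedding, and the function names, artist labels, and sample lines are all invented for illustration.

```python
import math
from collections import Counter


def embed(text, n=2):
    # ASSUMPTION: toy stand-in for a sentence embedding, NOT BERT.
    # Represents a line as a sparse vector of character n-gram counts.
    chars = "".join(text.lower().split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_similarity(generated, originals):
    # Average similarity over every (generated line, original line) pair.
    sims = [cosine(embed(g), embed(o)) for g in generated for o in originals]
    return sum(sims) / len(sims)


def artist_similarity(generated_by_artist, original_by_artist):
    # One score per (artist of generated lyrics, artist of original lyrics).
    return {
        (gen_artist, orig_artist): mean_similarity(gen_lines, orig_lines)
        for gen_artist, gen_lines in generated_by_artist.items()
        for orig_artist, orig_lines in original_by_artist.items()
    }


# Hypothetical sample lines, invented for this sketch (not from the paper's data).
original = {
    "artist_a": ["time goes by, baby baby", "time after time, baby"],
    "artist_b": ["run run run into the night", "run with me tonight"],
}
generated = {
    "artist_a": ["time goes on, baby baby"],
    "artist_b": ["run run into the light"],
}

scores = artist_similarity(generated, original)
for (gen_artist, orig_artist), score in sorted(scores.items()):
    print(f"generated {gen_artist} vs original {orig_artist}: {score:.2f}")
```

In this sketch, the diagonal entries (same artist) play the role of the within-artist similarity and the off-diagonal entries the between-artist similarity. To move closer to the setting described in the abstract, `embed` and `cosine` would be replaced by similarities computed from BERT representations; the specific model and scoring details are not given in this section, so they are left open here.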

H.-J. Hong and S.-H. Kim contributed equally.

References

  1. Shawar, B.A., Atwell, E.: Different measurement metrics to evaluate a chatbot system. In: Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies, pp. 89–96 (2007)

  2. Nagarhalli, T.P., Vaze, V., Rana, N.K.: A review of current trends in the development of chatbot systems. In: 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), pp. 706–710. IEEE (2020)

  3. Report of chatbot market size. https://www.grandviewresearch.com/industry-analysis/chatbot-market

  4. Chandel, S., Yuying, Y., Yujie, G., Razaque, A., Yang, G.: Chatbot: efficient and utility-based platform. In: Arai, K., Kapoor, S., Bhatia, R. (eds.) Intelligent Computing, vol. 858, pp. 109–122. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-01174-1_9

  5. Pradhan, A., Lazar, A.: Hey Google, do you have a personality? Designing personality and personas for conversational agents. In: CUI 2021–3rd Conference on Conversational User Interfaces, pp. 1–4 (2021)

  6. Zheng, Y., et al.: A pre-training based personalized dialogue generation model with persona-sparse data. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 9693–9700 (2020)

  7. Lee, S.K., Yun, J.Y.: A convergence study on chatbot persona and user experience of financial service - focused on loan service. Korean Soc. Sci. Art 37(4), 257–267 (2019)

  8. KoGPT2. https://github.com/SKT-AI/KoGPT2

  9. Radford, A., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)

  10. Devlin, J., et al.: Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint https://arxiv.org/abs/1810.04805 (2018)

  11. Hong, H.-J., Kim, S.-H., Lee, J.H.: Engineering a deep-generative model for lyric writing based upon a style transfer of song writers. In: Proceedings of the Korea Information Processing Society Conference. Korea Information Processing Society, pp. 741–744 (2021)

  12. Papineni, K., et al.: Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp. 311–318 (2002)

  13. Zhang, T., et al.: Bertscore: evaluating text generation with bert. arXiv preprint https://arxiv.org/abs/1904.09675 (2019)

Acknowledgement

This research was supported by (i) the Samsung Research Funding Center of Samsung Electronics under Project Number No. SRFC-TC1603-52, and (ii) the National Research Foundation of Korea (NRF) grant funded by the Korean government (No. 2020R1G1A1102683).

Corresponding author

Correspondence to Jee-Hang Lee.

Appendix

Table 2. Examples of generated and original lyrics that start with the same word, “시간” (“time”).

Comparing the original and generated lyrics shows that the two are structurally similar (see Table 2). For Sunwoojunga, both share a structure in which English words are inserted mid-line and the same English sentences are repeated. For IU, lyrics with the same ‘-요’ ending are generated. For Monsta X, both the original and the generated lyrics likewise include English words mid-line and repeat the same English sentence structure.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hong, HJ., Kim, SH., Lee, JH. (2023). On the Evaluation of Generated Stylised Lyrics Using Deep Generative Models: A Preliminary Study. In: Zaynidinov, H., Singh, M., Tiwary, U.S., Singh, D. (eds) Intelligent Human Computer Interaction. IHCI 2022. Lecture Notes in Computer Science, vol 13741. Springer, Cham. https://doi.org/10.1007/978-3-031-27199-1_14

  • DOI: https://doi.org/10.1007/978-3-031-27199-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-27198-4

  • Online ISBN: 978-3-031-27199-1

  • eBook Packages: Computer Science, Computer Science (R0)
