Abstract
We present a novel music generation framework for music infilling with a user-friendly interface. Infilling refers to the task of generating musical sections given the surrounding multi-track music. The proposed transformer-based framework is extensible to new control tokens, as demonstrated by the music control tokens added in this work, such as per-bar tonal tension and track polyphony level. We explore the effects of including several musically meaningful control tokens and evaluate the results using objective metrics related to pitch and rhythm. Our results demonstrate that the additional control tokens help generate music with stronger stylistic similarity to the original piece. They also give the user finer control over properties such as musical texture and per-bar tonal tension, whereas previous research offered control only over track density. We present the model in a Google Colab notebook to enable interactive generation.
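To make the control-token idea concrete, the minimal Python sketch below shows one plausible encoding: per-bar control tokens are prepended to each bar's note tokens, and a bar selected for infilling keeps its (user-editable) control tokens while its note content is masked. All token names (tension_*, polyphony_*, <mask>) and the masking scheme are illustrative assumptions, not the paper's actual vocabulary or training procedure.

# Minimal sketch, assuming a bar-wise token encoding; token names and the
# masking scheme are illustrative, not the paper's actual vocabulary.

def add_control_tokens(bar_tokens, tension, polyphony):
    """Prepend per-bar control tokens (tonal tension, polyphony level)
    to the note tokens of one bar."""
    return [f"tension_{tension}", f"polyphony_{polyphony}"] + bar_tokens

def mask_bar(bars, bar_index, n_controls=2):
    """Flatten bars into one sequence; the chosen bar keeps its control
    tokens but its note tokens are replaced by <mask>, so the model must
    infill content matching both the context and the user-set controls."""
    out = []
    for i, bar in enumerate(bars):
        out.extend(bar[:n_controls] + ["<mask>"] if i == bar_index else bar)
    return out

# Toy example: two bars of a single track (pitch/duration tokens).
bars = [
    add_control_tokens(["pitch_60", "dur_4", "pitch_64", "dur_4"], tension=2, polyphony=1),
    add_control_tokens(["pitch_67", "dur_8", "pitch_65", "dur_8"], tension=4, polyphony=2),
]

# Mask bar 1: a trained infilling model would regenerate its notes from the
# surrounding music and the (possibly user-edited) control tokens.
print(mask_bar(bars, 1))
# ['tension_2', 'polyphony_1', 'pitch_60', 'dur_4', 'pitch_64', 'dur_4',
#  'tension_4', 'polyphony_2', '<mask>']

Keeping the control tokens outside the masked span is what would let a user steer the infilled bar, e.g. lowering tension_4 to tension_1 before regeneration.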
Acknowledgement
This work is funded by the China Scholarship Council and the Singapore Ministry of Education grant no. MOE2018-T2-2-161.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Guo, R., Simpson, I., Kiefer, C., Magnusson, T., Herremans, D. (2022). MusIAC: An Extensible Generative Framework for Music Infilling Applications with Multi-level Control. In: Martins, T., Rodríguez-Fernández, N., Rebelo, S.M. (eds) Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2022. Lecture Notes in Computer Science, vol 13221. Springer, Cham. https://doi.org/10.1007/978-3-031-03789-4_22
DOI: https://doi.org/10.1007/978-3-031-03789-4_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-03788-7
Online ISBN: 978-3-031-03789-4