Abstract
Automatic music mood classification is an important and challenging problem in music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to improve classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. Then, we project the paired low-level features of the two modalities into a learned common discriminative latent space, which both eliminates the heterogeneity between modalities and increases the discriminability of the resulting descriptions. On the basis of this latent representation, we employ a graph-learning-based multimodal classification model for music mood that exploits the correlations between modalities through the cross-modal similarity between local audio and lyrics descriptions of music. The resulting mood predictions for each sentence of a song are then aggregated by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset comprising more than 3,000 minutes of music and the corresponding lyrics.
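As a rough illustration of the last two steps, the sketch below implements graph-based label propagation with local and global consistency (Zhou et al., 2004) over sentence-level feature vectors, followed by majority voting across the sentences of a song. It is only a minimal sketch under stated assumptions: it takes the latent-space projections as already computed, uses a plain RBF affinity in place of the paper's cross-modal similarity graph, and the function names (`propagate_labels`, `song_mood`) and parameter values are illustrative rather than taken from the paper.

```python
import numpy as np

def propagate_labels(X, y, alpha=0.99, sigma=1.0, n_iter=50):
    """Label propagation with local and global consistency (Zhou et al., 2004).

    X: (n, d) sentence-level features in the common latent space.
    y: (n,) integer labels in {0..C-1} for labelled sentences, -1 for unlabelled.
    Returns a predicted mood label for every sentence.
    """
    n = X.shape[0]
    # Affinity matrix with an RBF kernel; the paper's cross-modal similarity
    # graph between audio and lyrics descriptions would be built here instead.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetrically normalise the graph: S = D^{-1/2} W D^{-1/2}.
    D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1) + 1e-12))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    # One-hot label matrix for the labelled sentences.
    C = int(y.max()) + 1
    Y = np.zeros((n, C))
    labelled = np.arange(n)[y >= 0]
    Y[labelled, y[labelled]] = 1.0
    # Iterate F <- alpha * S * F + (1 - alpha) * Y until (approximate) convergence.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(axis=1)

def song_mood(sentence_preds):
    """Aggregate per-sentence mood predictions into one song label by majority vote."""
    return int(np.bincount(sentence_preds).argmax())
```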
Acknowledgments
Research supported by the National Natural Science Foundation of China under Grant Nos. 61003113, 61672273 and 61321491.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Su, F., Xue, H. (2017). Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol. 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_13
DOI: https://doi.org/10.1007/978-3-319-51811-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4
eBook Packages: Computer Science (R0)