
Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

  • Conference paper
MultiMedia Modeling (MMM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10132)


Abstract

Automatic music mood classification is an important and challenging problem in music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to enhance classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. We then project the paired low-level features of the two modalities into a learned common discriminative latent space, which both eliminates between-modality heterogeneity and increases the discriminability of the resulting descriptions. On the basis of this latent representation, we employ a graph-based multimodal classification model for music mood that takes the cross-modality similarity between local audio and lyrics descriptions of music into account, effectively exploiting the correlations between the two modalities. The predicted mood categories for every sentence of a song are then aggregated by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset comprising more than 3,000 minutes of music and the corresponding lyrics.
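To make the pipeline concrete, here is a minimal, self-contained Python sketch of the three stages the abstract describes: projecting paired sentence-level features into a shared latent space, propagating mood labels over a similarity graph that contains both audio and lyrics nodes, and aggregating per-sentence predictions by voting. This is an illustration under stated assumptions, not the paper's implementation: plain CCA stands in for the learned discriminative latent projection, an RBF affinity graph with Zhou et al.-style local-and-global-consistency propagation stands in for the paper's graph learning model, and all feature matrices, labels, and song assignments below are synthetic placeholders.

```python
# Hedged sketch of the abstract's pipeline. Assumptions (not from the paper):
# CCA replaces the learned discriminative latent space, an RBF graph with
# closed-form label propagation replaces the paper's graph learning model,
# and all data below are synthetic placeholders.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Synthetic paired sentence-level features: n sentences drawn from 20 songs.
n, d_audio, d_lyrics, n_moods = 200, 40, 60, 4
X_audio = rng.normal(size=(n, d_audio))
X_lyrics = rng.normal(size=(n, d_lyrics))
song_id = rng.integers(0, 20, size=n)           # song each sentence belongs to
y = np.full(n, -1)                              # -1 = unlabeled sentence
labeled = rng.choice(n, size=40, replace=False)
y[labeled] = rng.integers(0, n_moods, size=labeled.size)

# Step 1: project both modalities into a common latent space (CCA stand-in).
cca = CCA(n_components=10)
Z_audio, Z_lyrics = cca.fit_transform(X_audio, X_lyrics)

# Step 2: label propagation over a graph containing BOTH sets of latent
# nodes, so cross-modal similarities between audio and lyrics descriptions
# contribute edges alongside within-modality ones.
Z = np.vstack([Z_audio, Z_lyrics])              # 2n nodes: audio, then lyrics
d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * np.median(d2)))           # RBF affinity, median bandwidth
np.fill_diagonal(W, 0.0)
D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(1)))
S = D_inv_sqrt @ W @ D_inv_sqrt                 # symmetrically normalized affinity

Y0 = np.zeros((2 * n, n_moods))                 # one-hot seeds on labeled nodes
for i in labeled:
    Y0[i, y[i]] = Y0[i + n, y[i]] = 1.0

alpha = 0.9
F = np.linalg.solve(np.eye(2 * n) - alpha * S, Y0)   # closed-form propagation

# Step 3: fuse the two modality scores per sentence, then vote per song.
sentence_pred = (F[:n] + F[n:]).argmax(1)
song_pred = {s: np.bincount(sentence_pred[song_id == s],
                            minlength=n_moods).argmax()
             for s in np.unique(song_id)}
print(song_pred)
```

Solving the linear system gives the closed-form fixed point of the iterative propagation F(t+1) = alpha * S * F(t) + (1 - alpha) * Y0, and the per-song majority vote over sentence predictions mirrors the "simple voting scheme" the abstract describes.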



Acknowledgments

Research supported by the National Natural Science Foundation of China under Grant Nos. 61003113, 61672273 and 61321491.

Author information

Correspondence to Feng Su.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Su, F., Xue, H. (2017). Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-51811-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science, Computer Science (R0)
