A Combination of Hand-Crafted and Hierarchical High-Level Learnt Feature Extraction for Music Genre Classification

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2013 (ICANN 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8131)

Abstract

In this paper, we propose a new approach to automatic music genre classification that learns a feature hierarchy with a deep learning architecture over hand-crafted features extracted from an audio signal. Unlike state-of-the-art approaches, our scheme uses an unsupervised learning algorithm based on Deep Belief Networks (DBNs) trained on block-wise MFCCs (which we treat as 2D images), followed by a supervised learning algorithm that fine-tunes the extracted features. Experiments on the GTZAN dataset show that the proposed scheme clearly outperforms the state-of-the-art approaches.
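
The block-wise view of the MFCCs mentioned in the abstract can be illustrated with a minimal sketch. It assumes the MFCCs are already available as a frames-by-coefficients matrix and only shows how such a matrix might be cut into fixed-length blocks, each of which can be read as a small 2D image and flattened into one input vector. The function names, the block length, and the hop size are hypothetical choices for illustration; the abstract does not state the values used in the paper, and the DBN pre-training and supervised fine-tuning stages are not shown.

```python
from typing import List

Matrix = List[List[float]]  # frames x coefficients


def split_into_blocks(mfcc: Matrix, block_len: int, hop: int) -> List[Matrix]:
    """Cut a (frames x coefficients) MFCC matrix into fixed-length blocks.

    Each block is a block_len x n_coeffs patch, i.e. a small 2D "image".
    block_len and hop are illustrative parameters, not values from the paper.
    """
    if block_len <= 0 or hop <= 0:
        raise ValueError("block_len and hop must be positive")
    blocks = []
    # Only full blocks are kept; trailing frames that do not fill a block are dropped.
    for start in range(0, len(mfcc) - block_len + 1, hop):
        blocks.append([row[:] for row in mfcc[start:start + block_len]])
    return blocks


def flatten_block(block: Matrix) -> List[float]:
    """Flatten a block row by row into one input vector."""
    return [value for row in block for value in row]


# Toy stand-in for real MFCCs (assumption): 10 frames of 3 coefficients each.
toy_mfcc = [[float(frame * 3 + k) for k in range(3)] for frame in range(10)]
blocks = split_into_blocks(toy_mfcc, block_len=4, hop=2)
vectors = [flatten_block(b) for b in blocks]
```

With a hop smaller than the block length, consecutive blocks overlap, so each frame contributes to several input vectors. Whether the paper uses overlapping blocks is not stated in the abstract.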

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Martel, J., Nakashika, T., Garcia, C., Idrissi, K. (2013). A Combination of Hand-Crafted and Hierarchical High-Level Learnt Feature Extraction for Music Genre Classification. In: Mladenov, V., Koprinkova-Hristova, P., Palm, G., Villa, A.E.P., Appollini, B., Kasabov, N. (eds) Artificial Neural Networks and Machine Learning – ICANN 2013. ICANN 2013. Lecture Notes in Computer Science, vol 8131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40728-4_50

  • DOI: https://doi.org/10.1007/978-3-642-40728-4_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40727-7

  • Online ISBN: 978-3-642-40728-4

  • eBook Packages: Computer Science, Computer Science (R0)
