Deep Support Vector Machines for Speech Emotion Recognition | SpringerLink

Deep Support Vector Machines for Speech Emotion Recognition

  • Conference paper
Intelligent Systems Design and Applications (ISDA 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1181)

Abstract

Speech emotion recognition has become an active research theme in speech processing and in applications based on human-machine interaction. In this work, we propose a two-stage system consisting of feature extraction and a classification engine. First, two feature sets are investigated: the first extracts only 13 Mel-frequency cepstral coefficients (MFCC) from the emotional speech samples, while the second fuses the MFCC features with three additional features: the zero crossing rate (ZCR), the Teager energy operator (TEO), and the harmonics-to-noise ratio (HNR). Second, we compare the performance of two classification techniques: support vector machines (SVM) and the k-nearest neighbor (k-NN) classifier. In addition, we investigate recent advances in machine learning, including deep kernel learning. A large set of experiments is conducted on the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset for seven emotions. Our experimental results show good accuracy compared with previous studies.
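The two-stage pipeline described in the abstract (hand-crafted acoustic features fed to an SVM or k-NN classifier) can be sketched as follows. This is a minimal illustration on synthetic tone frames, not the authors' implementation: the ZCR and TEO formulas are the standard ones, while the MFCC and HNR extraction steps are omitted (they would typically come from an audio library), and the two "emotion classes" here are hypothetical low- vs. high-pitch signals.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def zero_crossing_rate(x):
    """Fraction of consecutive samples whose sign differs."""
    return float(np.mean(np.sign(x[:-1]) != np.sign(x[1:])))

def teager_energy(x):
    """Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def frame_features(frame):
    # Fused feature vector for one frame: ZCR plus mean Teager energy.
    # (A full system would append the 13 MFCCs and the HNR here.)
    return np.array([zero_crossing_rate(frame), teager_energy(frame).mean()])

# Synthetic stand-ins for two emotion classes: low- vs. high-frequency tones.
rng = np.random.default_rng(0)
t = np.arange(400) / 16000.0  # 25 ms frames at 16 kHz
frames, labels = [], []
for _ in range(40):
    f0 = rng.uniform(100, 200)    # "class 0": low pitch
    frames.append(np.sin(2 * np.pi * f0 * t) + 0.05 * rng.standard_normal(t.size))
    labels.append(0)
    f1 = rng.uniform(1000, 2000)  # "class 1": high pitch
    frames.append(np.sin(2 * np.pi * f1 * t) + 0.05 * rng.standard_normal(t.size))
    labels.append(1)

X = np.stack([frame_features(f) for f in frames])
y = np.array(labels)

# Stage two: train both classifiers on the fused features and compare.
svm = SVC(kernel="rbf").fit(X, y)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(svm.score(X, y), knn.score(X, y))
```

Both features grow with signal frequency (a high-pitch tone crosses zero more often and has larger instantaneous Teager energy), so even this two-dimensional fused vector separates the toy classes; the paper's point is that fusing several such complementary features improves on MFCCs alone.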



Author information

Corresponding author

Correspondence to Yassine Ben Ayed.



Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Aouani, H., Ben Ayed, Y. (2021). Deep Support Vector Machines for Speech Emotion Recognition. In: Abraham, A., Siarry, P., Ma, K., Kaklauskas, A. (eds) Intelligent Systems Design and Applications. ISDA 2019. Advances in Intelligent Systems and Computing, vol 1181. Springer, Cham. https://doi.org/10.1007/978-3-030-49342-4_39

