Abstract
Recognizing emotions from facial expressions is an active research topic in computer vision. The system presented in this article follows a three-step pipeline: face detection, feature extraction, and classification. A photo or video is captured, and the face region is located in the image. Face detection uses the Viola-Jones algorithm to locate salient regions (eyes, mouth, nose, and temples) of a given face. For face extraction, we built a database of frontal face images. We propose two systems: the first performs facial emotion detection by classifying raw facial images, and the second extracts histogram of oriented gradients (HOG) features from the facial images. For the classification phase, we use three classifiers: support vector machines (SVM), a convolutional neural network (CNN), and a hybrid CNN-SVM. To improve the performance of our facial emotion recognition system, we propose merging the CNN outputs of the two systems into deep features that serve as inputs to two classifiers (MLP and SVM). Experiments are performed on the Ryerson Multimedia Laboratory (RML) dataset. The objective is to compare the performance of these methods and to identify the most suitable approach. Our experimental results show good accuracy compared to previous studies.
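The HOG feature-extraction step of the second system can be sketched as follows. This is a minimal illustration in NumPy, not the authors' implementation: the cell size, bin count, and global normalization are illustrative assumptions (standard HOG, as in Dalal and Triggs, normalizes per overlapping block, and the paper's parameters are not given in the abstract).

```python
# Minimal sketch of HOG feature extraction on a face crop.
# Cell size, bin count, and global L2 normalization are assumptions
# for illustration; they are not the paper's stated parameters.
import numpy as np

def hog_features(image, cell=8, bins=9):
    """Compute a simple per-cell histogram-of-oriented-gradients descriptor."""
    img = image.astype(np.float64)
    # Image gradients via central differences (axis 0 = rows, axis 1 = cols).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in classic HOG.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    h, w = img.shape
    features = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            mag = magnitude[y:y + cell, x:x + cell].ravel()
            ori = orientation[y:y + cell, x:x + cell].ravel()
            # Magnitude-weighted orientation histogram for this cell.
            hist, _ = np.histogram(ori, bins=bins, range=(0.0, 180.0),
                                   weights=mag)
            features.append(hist)
    vec = np.concatenate(features)
    # Global L2 normalization (real HOG normalizes per overlapping block).
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Example: a 48x48 face crop yields (48/8)^2 = 36 cells of 9 bins each.
face = np.random.default_rng(0).random((48, 48))
desc = hog_features(face)
print(desc.shape)  # (324,)
```

The resulting fixed-length vector is what a classifier such as an SVM consumes; in practice, library implementations (e.g. OpenCV's `HOGDescriptor` or `skimage.feature.hog`) would be used rather than hand-rolled code.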
Availability of data and materials
Not applicable.
Funding
This research received no external funding.
Author information
Authors and Affiliations
Contributions
All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Aouani, H., Ben Ayed, Y. Deep facial expression detection using Viola-Jones algorithm, CNN-MLP and CNN-SVM. Soc. Netw. Anal. Min. 14, 65 (2024). https://doi.org/10.1007/s13278-024-01231-y