A novel multi-scale facial expression recognition algorithm based on improved Res2Net for classroom scenes

Gu, Meihua; Feng, Jing; Chu, Yalu

doi:10.1007/s11042-023-16115-0


Published: 15 July 2023

Volume 83, pages 16525–16542 (2024)

Multimedia Tools and Applications


Abstract

Facial expression recognition in classroom scenes can help teachers understand students’ learning status and improve teaching effectiveness. To address the low accuracy of expression recognition in classroom scenarios, a novel multi-scale facial expression recognition algorithm based on an improved Res2Net is proposed. First, a bi-directional residual module, BiRes2Net, is proposed to extract bi-directional multi-scale expression features at a fine-grained level; a short directed connection path is also introduced to give the network a self-closing capability and to avoid extracting redundant expression information. Second, a Fine-Grained Coordinate Attention (FGCA) mechanism is embedded to extract the spatial location features and channel features of expressions at a fine-grained level, making full use of prior knowledge of facial expressions. Finally, a multi-class focal loss function is used to alleviate the imbalance of expression data: expression samples of different recognition difficulty are assigned different weights, so that the network is biased towards extracting features from difficult samples. Experimental results show that the proposed method achieves recognition accuracies of 79.47%, 94.06%, and 96.67% on the RAF-DB, JAFFE, and CK+ datasets, respectively, and up to 72.71% in real classroom scenes, significantly outperforming the compared algorithms.
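To illustrate how a focal loss biases training towards difficult samples, the following is a minimal sketch of the standard multi-class form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), from Lin et al. (2017). It is not the authors' implementation: the exact loss used in the paper is not given on this page, and the function names, the default γ = 2, and the optional per-class weights `alpha` are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Multi-class focal loss for one sample (illustrative sketch).

    logits: raw scores, one per expression class
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 reduces to cross-entropy
    alpha:  optional per-class weights (hypothetical values chosen by the
            caller, e.g. larger for under-represented classes)
    """
    # Clamp to avoid log(0) if the softmax underflows for extreme logits.
    p_t = max(softmax(logits)[target], 1e-12)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


if __name__ == "__main__":
    easy = [4.0, 0.0, 0.0]  # true class 0 is predicted confidently
    hard = [0.0, 4.0, 0.0]  # true class 0 is predicted poorly
    for name, logits in (("easy", easy), ("hard", hard)):
        ce = cross_entropy(logits, 0)
        fl = focal_loss(logits, 0)
        print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

In this sketch the modulating factor (1 - p_t)^γ leaves the loss of the poorly classified sample almost unchanged while scaling the loss of the confidently classified sample down by several orders of magnitude, which is the mechanism by which difficult samples come to dominate the gradient.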


Data availability statements

The classroom scenario dataset we created is available from the corresponding author on reasonable request. The publicly available datasets can be accessed at the following links:

RAF-DB: http://www.whdeng.cn/RAF/model1.html

JAFFE: http://www.kasrl.org/jaffe.html

CK+: http://www.consortium.ri.cmu.edu/ckagree/

Abbreviations

BiRes2Net: Bi-directional Residual Res2Net Module
FGCA: Fine-Grained Coordinate Attention
RAF-DB: Real-world Affective Faces Database
JAFFE: The Japanese Female Facial Expression Database
CK+: The Extended Cohn-Kanade Dataset
EMFACS: Emotional Facial Action Coding System
SCN: Self-Cure Convolutional Neural Network
ICID: Inter-Domain Facial Expression Recognition Feature Fusion Network
IC: Intra-category Common feature
ID: Inter-category Distinction feature
FDRL: Feature Decomposition and Reconstruction Learning
FDN: Feature Decomposition Network
FRN: Feature Reconstruction Network
DLP-CNN: Deep Locality-Preserving CNN
DMFA-ResNet: Deep Multi-scale Fusion Attention Residual Network
CERT: The Computer Expression Recognition Toolbox
SE: Squeeze-and-Excitation
CA: Coordinate Attention
NE: Natural
DI: Disgust
FE: Fear
AN: Anger
HA: Happiness


Acknowledgements

This work was supported in part by the Postgraduate Innovation Fund Project of Xi’an Polytechnic University (chx2022012).

Author information

Authors and Affiliations

School of Electronics Information, Xi’an Polytechnic University, Xi’an, 710048, China
Meihua Gu, Jing Feng & Yalu Chu


Contributions

Meihua Gu and Jing Feng designed and performed the research; Yalu Chu analyzed the data; all authors contributed to the writing and revisions.

Corresponding author

Correspondence to Meihua Gu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Compliance with ethical standards

Informed consent

Consent for publication

Not applicable.

The ethics agreement

Code of Ethics for Socio-Economic Research and Declaration of Helsinki.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Gu, M., Feng, J. & Chu, Y. A novel multi-scale facial expression recognition algorithm based on improved Res2Net for classroom scenes. Multimed Tools Appl 83, 16525–16542 (2024). https://doi.org/10.1007/s11042-023-16115-0


Received: 05 October 2022
Revised: 20 May 2023
Accepted: 26 June 2023
Published: 15 July 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11042-023-16115-0
