Abstract
Today, due to the widespread outbreak of the deadly coronavirus, popularly known as COVID-19, traditional classroom education has shifted to computer-based learning. Students of varied cognitive and psychological abilities participate in the learning process. However, most students hesitate to provide regular and honest feedback on how well they comprehend the course, making it difficult for the instructor to ensure that all students are grasping the material at the same rate. A student's understanding of the course and their emotional engagement, as indicated by facial expressions, are intertwined. This paper presents a three-dimensional DenseNet self-attention neural network (DenseAttNet) for identifying and evaluating student participation in modern and traditional educational programs. On the Dataset for Affective States in E-Environments (DAiSEE), the proposed DenseAttNet model outperformed all existing methods, achieving baseline accuracies of 63.59% for engagement classification and 54.27% for boredom classification. Furthermore, DenseAttNet trained on all four labels, namely boredom, engagement, confusion, and frustration, registered accuracies of 81.17%, 94.85%, 90.96%, and 95.85%, respectively. In addition, we performed a regression experiment on DAiSEE and obtained the lowest Mean Square Error (MSE) of 0.0347. Finally, the proposed approach achieves a competitive MSE of 0.0877 when validated on the Emotion Recognition in the Wild Engagement Prediction (EmotiW-EP) dataset.
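The core idea of DenseAttNet is to let a self-attention layer re-weight the spatio-temporal features produced by a 3D DenseNet backbone, so that each position in the flattened time-space grid can aggregate context from every other position. The following is a minimal NumPy sketch of that attention step only; the shapes, the projection matrices `Wq`/`Wk`/`Wv`, and the function names are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feats, Wq, Wk, Wv):
    """Scaled dot-product self-attention over N spatio-temporal positions.

    feats: (N, C) feature vectors, e.g. a flattened T*H*W grid of
    C-channel 3D-CNN activations. Wq/Wk/Wv are learned projections
    to query, key, and value spaces (hypothetical shapes).
    """
    Q, K, V = feats @ Wq, feats @ Wk, feats @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])      # (N, N) pairwise affinities
    attn = softmax(scores, axis=-1)             # rows sum to 1
    return attn @ V                             # context-aggregated features

rng = np.random.default_rng(0)
N, C, D = 8, 16, 16          # positions, input channels, projection dim
feats = rng.standard_normal((N, C))
Wq, Wk, Wv = (rng.standard_normal((C, D)) for _ in range(3))
out = self_attention(feats, Wq, Wk, Wv)
print(out.shape)
```

In a full model, the attended output would typically be added back to the input features (a residual connection) before the classification or regression head; this sketch keeps only the attention computation itself.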














Notes
https://en.unesco.org/covid19/educationresponse, accessed on 22/06/2021
Acknowledgements
The authors would like to thank the Director, CSIR-CEERI, Pilani, India for supporting and encouraging research activities at CSIR-CEERI, Pilani. We also thank Kashish Sapra of CSIR-CEERI, Pilani for proofreading.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Artificial Intelligence Applications for COVID-19, Detection, Control, Prediction, and Diagnosis
About this article
Cite this article
Mehta, N.K., Prasad, S.S., Saurav, S. et al. Three-dimensional DenseNet self-attention neural network for automatic detection of student’s engagement. Appl Intell 52, 13803–13823 (2022). https://doi.org/10.1007/s10489-022-03200-4