Real emotion seeker: recalibrating annotation for facial expression recognition

  • Regular Paper
  • Published:
Multimedia Systems

Abstract

Facial expression recognition (FER) is a challenging classification task. Owing to the subjectivity and ambiguity of both performers and observers, compound facial expressions are hard to represent with a one-hot label. In this paper, a simple but efficient method, named real emotion seeker (RES), is proposed to recalibrate each sample's annotation into a latent expression distribution alongside the one-hot label. In particular, subjective implicit knowledge is transformed through Bayesian inference into a posterior distribution specific to each FER data set, thus enhancing universality and authenticity. The posterior distribution is then combined with the one-hot label to form the recalibrated annotation, which serves as additional supervision guiding predictions toward more realistic outputs. The proposed method is independent of the backbone network and improves accuracy significantly, by an average of 3.16%, with no extra burden at training or inference time. Extensive experiments show that RES yields predictions consistent with human subjective intuition. Results on three in-the-wild data sets demonstrate that our approach achieves advanced results: 90.38% on RAF-DB, 90.34% on FERPlus and 62.63% on AffectNet.
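The core idea of combining a posterior expression distribution with the original one-hot label can be sketched as a simple convex mixture. The mixing weight `lam`, the seven-class expression setup, and the renormalization step below are illustrative assumptions, not the paper's exact recalibration rule:

```python
import numpy as np

def recalibrate_label(one_hot, posterior, lam=0.7):
    """Blend a one-hot annotation with a latent expression posterior.

    `lam` is an assumed mixing hyperparameter for illustration; the
    paper's actual combination rule may differ.
    """
    one_hot = np.asarray(one_hot, dtype=float)
    posterior = np.asarray(posterior, dtype=float)
    recalibrated = lam * one_hot + (1.0 - lam) * posterior
    # Renormalize so the recalibrated annotation is a valid distribution.
    return recalibrated / recalibrated.sum()

# Example: a face annotated "happy" that also carries some "sadness" cues.
one_hot = [0, 1, 0, 0, 0, 0, 0]  # 7 basic expressions, "happy" at index 1
posterior = [0.05, 0.60, 0.25, 0.02, 0.02, 0.03, 0.03]
print(recalibrate_label(one_hot, posterior).round(3))
```

A recalibrated target of this form keeps the annotated class dominant while assigning nonzero mass to plausible secondary expressions, so it can supervise the network with an ordinary cross-entropy or KL-divergence loss instead of the hard one-hot target alone.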



Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grants 62071216 and U1936202.

Author information


Corresponding author

Correspondence to Qiu Shen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lin, Z., She, J. & Shen, Q. Real emotion seeker: recalibrating annotation for facial expression recognition. Multimedia Systems 29, 139–151 (2023). https://doi.org/10.1007/s00530-022-00986-8

