Abstract
Gaze interaction is essential for social communication in many scenarios; therefore, interpreting people's gaze direction is helpful for natural human-robot interaction and interaction with virtual characters. In this study, we first adopt a residual neural network (ResNet) structure with an embedding layer of personal identity (ID-ResNet) that outperformed the current best result of 2.51° on MPIIGaze, a benchmark dataset for gaze estimation. To avoid using manually labelled data, we used UnityEyes synthetic images, with and without style transformation, as the training data. We exceeded the previously reported best results on MPIIGaze (from 2.76° to 2.55°) and UT-Multiview (from 4.01° to 3.40°). In addition, the model only needs to be fine-tuned with a few "calibration" examples for a new person to yield significant performance gains. Finally, we present the KLBS-eye dataset, which contains 15,350 images collected from 12 participants looking in nine known directions, and achieved a state-of-the-art result of 0.59 ± 1.69°.
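The core idea behind ID-ResNet — conditioning the gaze regressor on a learned personal-identity embedding alongside the image features — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the layer sizes, the linear regression head, and the names (`gaze_head`, `id_embedding`) are hypothetical stand-ins for the ResNet feature extractor and its output head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: number of known identities, embedding width,
# and the dimensionality of the CNN eye-image features.
N_IDS, EMB_DIM, FEAT_DIM = 12, 8, 128

# Hypothetical parameters: a per-person embedding table and a linear head
# that regresses a 2-D gaze angle (yaw, pitch).
id_embedding = rng.normal(size=(N_IDS, EMB_DIM))
W = rng.normal(size=(FEAT_DIM + EMB_DIM, 2)) * 0.01
b = np.zeros(2)

def gaze_head(eye_features: np.ndarray, person_id: int) -> np.ndarray:
    """Concatenate image features with the person's identity embedding,
    then regress the gaze angle with a single linear layer."""
    z = np.concatenate([eye_features, id_embedding[person_id]])
    return z @ W + b

feat = rng.normal(size=FEAT_DIM)   # stand-in for ResNet eye features
angles = gaze_head(feat, person_id=3)
print(angles.shape)                # (2,)
```

Under this scheme, "calibrating" for a new person corresponds to fitting only that person's embedding row from a handful of labelled examples while the shared feature extractor stays fixed, which is consistent with the few-example fine-tuning described in the abstract.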
Acknowledgements
The research was supported by the Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences; the Key Laboratory of Biomedical Spectroscopy of Xi'an; the Outstanding Award for Talent Project of the Chinese Academy of Sciences; the "From 0 to 1" Original Innovation Project of the Basic Frontier Scientific Research Program of the Chinese Academy of Sciences; the Institute Supported Project of the Xi'an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences under grant numbers Y855W31213 and Y955061213; Dongguan Dongquan Intelligent Technology Co., Ltd.; and Dongguan Entrepreneur Leadership 2018. We thank Li-Yao Song, Chi Gao, Xin-Ming Zhang, Shao-Kang Yin, and Chao Li for helpful discussions and for editing the paper.
Ethics declarations
The authors declare no conflicts of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Quan Wang and Hui Wang contributed equally to this work.
Cite this article
Wang, Q., Wang, H., Dang, RC. et al. Style transformed synthetic images for real world gaze estimation by using residual neural network with embedded personal identities. Appl Intell 53, 2026–2041 (2023). https://doi.org/10.1007/s10489-022-03481-9