Abstract
Robots need to identify environmental cues, as humans do, for effective human-robot interaction (HRI). Human attention models simulate how humans process visual information, making them useful for identifying important regions in images and videos. In this paper, we explore the use of human attention models to enable intuitive and anthropomorphic HRI. Our approach combines a saliency model with a moving object detection model. The framework is implemented with ROS on Pepper, a humanoid robot. To evaluate the effectiveness of our system, we conducted both subjective and quantitative evaluations: subjective rating measures of intuitiveness, trust, engagement, and user satisfaction, and quantitative comparisons of our human attention subsystem against state-of-the-art models. Extensive experiments demonstrate the significant impact of our framework in enabling intuitive and anthropomorphic human-robot interaction.
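The abstract describes the attention pipeline only at a high level (a saliency model combined with a moving object detector, running via ROS on Pepper) and does not give the fusion rule. The sketch below is a minimal, hypothetical illustration of how two such cues could be fused to pick a gaze target; the function names, the weighted-sum fusion, and the 0.6 motion weight are assumptions for illustration, not the paper's reported method.

import numpy as np

def _normalize(m: np.ndarray) -> np.ndarray:
    # Rescale a cue map to [0, 1] so neither cue dominates by scale alone.
    m = m.astype(np.float32)
    rng = float(m.max() - m.min())
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m, dtype=np.float32)

def fuse_attention(saliency_map, motion_mask, motion_weight=0.6):
    """Weighted fusion of a static saliency map and a moving-object mask.
    Returns ((row, col) of the peak, combined map). The 0.6 weight is a
    placeholder choice, not a value reported in the paper."""
    combined = ((1.0 - motion_weight) * _normalize(saliency_map)
                + motion_weight * _normalize(motion_mask))
    # The peak of the combined map is the candidate attention target.
    target = np.unravel_index(int(np.argmax(combined)), combined.shape)
    return target, combined

if __name__ == "__main__":
    h, w = 120, 160
    rng = np.random.default_rng(0)
    saliency = rng.random((h, w))                 # stand-in for a saliency model's output
    motion = np.zeros((h, w), dtype=np.float32)   # stand-in moving-object mask
    motion[40:60, 80:100] = 1.0                   # a hypothetical "moving person" region
    (row, col), _ = fuse_attention(saliency, motion)
    print(f"attend to pixel (row={row}, col={col})")  # peak falls inside the moving region

In the robot system itself, the selected pixel would still have to be converted into pan/tilt head commands for Pepper through ROS; that step is omitted here.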
Notes
- 1. Pepper paying attention to moving bodies.
- 2. Pepper interacting with a temporarily salient body.
- 3. Pepper acting humanly in a still environment.
- 4. Pepper attending to a very dynamic environment while on the move.
- 5.
- 6. Intel Atom™ E3845 @ 1.91 GHz × 4.
Acknowledgment
This work would not have been possible without the financial support of the Brittany region administration, the French Embassy in Ethiopia, and the Ethiopian Ministry of Education (MoE). We are also indebted to the Brest National School of Engineering (ENIB), and specifically LAB-STICC, for creating such a conducive research environment.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wondimu, N., Neau, M., Dizet, A., Visser, U., Buche, C. (2024). Anthropomorphic Human-Robot Interaction Framework: Attention Based Approach. In: Buche, C., Rossi, A., Simões, M., Visser, U. (eds.) RoboCup 2023: Robot World Cup XXVI. Lecture Notes in Computer Science, vol. 14140. Springer, Cham. https://doi.org/10.1007/978-3-031-55015-7_22