Abstract
The ability to detect a human’s contingent response is an essential skill for a social robot attempting to engage new interaction partners or maintain ongoing turn-taking interactions. Prior work on contingency detection focuses on single cues from isolated channels, such as changes in gaze, motion, or sound. We propose a framework that integrates multiple cues for detecting contingency from multimodal sensor data in human-robot interaction scenarios. We describe three levels of integration and discuss our method for performing sensor fusion at each level. We conduct a Wizard-of-Oz data collection experiment in which our humanoid robot plays the turn-taking imitation game “Simon says” with human partners. Using this data set, which includes motion and body pose cues from depth and color images and audio cues from a microphone, we evaluate our contingency detection module with the proposed integration mechanisms and show that our multi-cue approach is more accurate than single-cue contingency detection. We also show the importance of selecting the appropriate level of cue integration, as well as the implications of varying the referent event parameter.
Additional information
ONR YIP N000140810842.
Cite this article
Lee, J., Chao, C., Bobick, A.F. et al. Multi-cue Contingency Detection. Int J of Soc Robotics 4, 147–161 (2012). https://doi.org/10.1007/s12369-011-0136-5
DOI: https://doi.org/10.1007/s12369-011-0136-5