Abstract
This paper investigates facial communicative signals (head gestures, eye gaze, and facial expressions) as nonverbal feedback in human-robot interaction. Motivated by a discussion of the literature, we suggest scenario-specific investigations due to the complex nature of these signals, and present an object-teaching scenario in which subjects teach the names of objects to a robot, which is then expected to name these objects correctly. The robot’s verbal answers are intended to elicit facial communicative signals from its interaction partners. We investigated both the human ability to recognize this spontaneous facial feedback and the performance of two automatic recognition approaches: a static approach that yields baseline results, and a second approach that considers the temporal dynamics and achieves classification rates comparable to human performance.
Notes
BIelefeld Robot CompaniON.
Thus, this definition of ground truth circumvents typical problems of the common approaches mentioned above: There is no dependence on the necessarily subjective impressions of human raters, subjects do not need to remember the intended meaning of their various FCS displays during an interview after the experiment, and the displayed FCSs are spontaneous and not posed on request.
The start and end points of the test sequence are given by the manual annotation of the database (Sect. 3.1), as this paper focuses on the principal investigation of FCSs in the described scenario. Nevertheless, an automatic determination of these segment borders, as needed for online classification on the robot, is also possible using the robot’s system state and simple heuristics. Approximate segment borders are sufficient, as a search for the best-matching subsequences is performed anyway. An incremental evaluation without a fixed end point can also be carried out efficiently. However, the details of such a robot system are beyond the scope of this paper.
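The tolerance to approximate segment borders mentioned above can be illustrated with open-begin/open-end dynamic time warping, where a query sequence is allowed to start and end at any frame of a longer feature stream. This is a minimal sketch on a 1-D feature signal with illustrative variable names, not the implementation used in the paper:

```python
import numpy as np

def subsequence_dtw_cost(query, stream):
    """Open-begin/open-end DTW: cost of the best-matching subsequence
    of `stream` for `query`. Rows index the query, columns index the
    stream; the match may begin and end at any stream frame."""
    Q, S = len(query), len(stream)
    # Local (per-frame) absolute distances between query and stream.
    d = np.abs(np.subtract.outer(query, stream))
    D = np.full((Q, S), np.inf)
    D[0, :] = d[0, :]                # open begin: any start column
    for i in range(1, Q):
        for j in range(1, S):
            D[i, j] = d[i, j] + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    end = int(np.argmin(D[-1, :]))   # open end: cheapest end column
    return D[-1, end], end

# A short query embedded (with slight noise) inside a longer stream:
query = np.array([0.0, 1.0, 2.0, 1.0])
stream = np.concatenate([np.zeros(5), query + 0.05, np.zeros(5)])
cost, end = subsequence_dtw_cost(query, stream)
# The match is found at the embedded position despite the padding,
# which is why exact segment borders are not required.
```

Because the accumulated cost is minimized over all start and end columns, generous or imprecise segment borders only enlarge the search space rather than corrupting the match.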
However, it is not clear to what degree the humans performed a person-dependent or person-independent classification. At least some adaptation to the people shown took place in the course of the experiment.
Acknowledgements
Christian Lang gratefully acknowledges the financial support from Honda Research Institute Europe for the project “Facial Expressions in Communication”. The authors thank the anonymous reviewers for their helpful comments on an earlier draft of this paper.
Additional information
This work has been supported by the Honda Research Institute Europe, Offenbach, Germany.
Lang, C., Wachsmuth, S., Hanheide, M. et al. Facial Communicative Signals. Int J of Soc Robotics 4, 249–262 (2012). https://doi.org/10.1007/s12369-012-0145-z