Abstract
Past attempts to model emotions for speech synthesis have focused on extreme, “basic” emotion categories. The present paper suggests an alternative representation of emotional states, by means of emotion dimensions, and explains how this approach can contribute to making speech synthesis a useful component of affective dialogue systems.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schröder, M. (2004). Dimensional Emotion Representation as a Basis for Speech Synthesis with Non-extreme Emotions. In: André, E., Dybkjær, L., Minker, W., Heisterkamp, P. (eds) Affective Dialogue Systems. ADS 2004. Lecture Notes in Computer Science, vol 3068. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24842-2_21
DOI: https://doi.org/10.1007/978-3-540-24842-2_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22143-2
Online ISBN: 978-3-540-24842-2