Abstract
In this study, we investigate the problem of apparent personality recognition from a person's voice, or more precisely, the way he or she speaks. Based on the style transfer idea from deep neural network image processing, we developed a system that extracts speaking style from recorded speech utterances and then uses this information to estimate the so-called Big Five personality traits. The latent speaking style space is represented by the Gram matrix of convolved acoustic features. We used a database labeled with personality traits as perceived by other people (first impressions). The experimental results show that the proposed system achieves state-of-the-art performance on the task of audio-based apparent personality recognition.
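The sketch below illustrates the Gram-matrix style representation mentioned in the abstract, assuming a hypothetical 1-D convolutional front end in PyTorch; the layer sizes, feature dimensions, and the final trait regressor are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

# Minimal sketch (hypothetical shapes and names): a 1-D convolution maps a
# sequence of acoustic feature vectors (e.g., MFCC or openSMILE frames) to
# C feature maps, and the Gram matrix of those maps serves as a fixed-size
# "speaking style" representation, analogous to the style features used in
# image style transfer (Gatys et al., 2015).

def gram_matrix(feature_maps: torch.Tensor) -> torch.Tensor:
    """feature_maps: (batch, channels, time) -> (batch, channels, channels)."""
    b, c, t = feature_maps.shape
    g = torch.bmm(feature_maps, feature_maps.transpose(1, 2))  # channel correlations
    return g / t  # normalize by the number of time frames

# Hypothetical front end: 40-dim acoustic frames -> 64 convolutional channels.
conv = nn.Conv1d(in_channels=40, out_channels=64, kernel_size=5, padding=2)

x = torch.randn(8, 40, 300)               # batch of 8 utterances, 300 frames each
style = gram_matrix(torch.relu(conv(x)))  # (8, 64, 64) style representation

# A regressor could then map the flattened Gram matrix to the five trait scores.
regressor = nn.Linear(64 * 64, 5)
traits = torch.sigmoid(regressor(style.flatten(1)))  # (8, 5) predicted traits
print(style.shape, traits.shape)
```

Because the Gram matrix is independent of utterance length, it yields a fixed-size descriptor regardless of how long the recording is, which is one practical reason to use it as the style representation.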
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Yu, J., Markov, K., Karpov, A. (2019). Speaking Style Based Apparent Personality Recognition. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol. 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_55
DOI: https://doi.org/10.1007/978-3-030-26061-3_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3