Abstract
This chapter focuses on the automatic identification of demographic traits and identity in both speech and writing. We address language use in the virtual world of online games and text entry on mobile devices in the form of chat, email and nicknames, and demonstrate text factors that correlate with user traits such as age, gender, personality, and interaction style. We also present work on speaker identification in spontaneous language use, describing the state of the art in verification, feature extraction, modeling and calibration across multiple environmental conditions. Finally, we bring speech and writing together to explore approaches to user authentication that span language in general. We discuss how speech-specific factors such as intonation, and writing-specific features such as spelling, punctuation, and typing correction, correlate with and predict one another as a function of users’ sociolinguistic characteristics.
Acknowledgments
This material is based on work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract D10PC20024. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of DARPA or its contracting agent, the U.S. Department of the Interior, National Business Center, Acquisition & Property Management Division, Southwest Branch. This material is also based on work supported by the United States Air Force and DARPA under Contract No. FA8750-13-C-0280. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and DARPA. Approved for Public Release, Distribution Unlimited.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Lawson, A., Ferrer, L., Wang, W., Murray, J. (2015). Detection of Demographics and Identity in Spontaneous Speech and Writing. In: Baughman, A., Gao, J., Pan, JY., Petrushin, V. (eds) Multimedia Data Mining and Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-14998-1_9
DOI: https://doi.org/10.1007/978-3-319-14998-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14997-4
Online ISBN: 978-3-319-14998-1
eBook Packages: Computer Science, Computer Science (R0)