Detection of Demographics and Identity in Spontaneous Speech and Writing

  • Chapter
Multimedia Data Mining and Analytics

Abstract

This chapter focuses on the automatic identification of demographic traits and identity in both speech and writing. We address language use in the virtual world of online games and text entry on mobile devices in the form of chat, email, and nicknames, and demonstrate text factors that correlate with demographics, such as age, gender, personality, and interaction style. Also presented here is work on speaker identification in spontaneous language use, where we describe the state of the art in verification, feature extraction, modeling, and calibration across multiple environmental conditions. Finally, we bring speech and writing together to explore approaches to user authentication that span language in general. We discuss how speech-specific factors such as intonation, and writing-specific features such as spelling, punctuation, and typing correction, correlate with and predict one another as a function of users’ sociolinguistic characteristics.

Acknowledgments

This material is based on work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract D10PC20024. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of DARPA or its contracting agent, the U.S. Department of the Interior, National Business Center, Acquisition & Property Management Division, Southwest Branch. This material is based on work supported by United States Air Force and DARPA under Contract No. FA8750-13-C-0280. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and DARPA. Approved for Public Release, Distribution Unlimited.

Corresponding author

Correspondence to Aaron Lawson.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Lawson, A., Ferrer, L., Wang, W., Murray, J. (2015). Detection of Demographics and Identity in Spontaneous Speech and Writing. In: Baughman, A., Gao, J., Pan, JY., Petrushin, V. (eds) Multimedia Data Mining and Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-14998-1_9

  • DOI: https://doi.org/10.1007/978-3-319-14998-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14997-4

  • Online ISBN: 978-3-319-14998-1

  • eBook Packages: Computer Science, Computer Science (R0)
