Sociolinguistic Extension of the ORD Corpus of Russian Everyday Speech

  • Conference paper
Speech and Computer (SPECOM 2016)

Abstract

The ORD corpus is one of the largest resources of contemporary spoken Russian. By 2014, its collection comprised about 400 h of recordings made by 40 respondents (20 men and 20 women of different ages and professions) who volunteered to spend a whole day with a switched-on voice recorder, recording all their verbal communication. The corpus presents unique linguistic material recorded in natural communicative situations, allowing spoken Russian and everyday discourse to be studied in many aspects. However, the original sample of respondents was not sufficient to study sociolinguistic variation in speech. It was therefore decided to launch a large project aimed at the sociolinguistic extension of the ORD, supported by the Russian Science Foundation. The paper describes the general principles of this extension. It defines the social groups that should be represented in the corpus in adequate numbers, sets criteria for selecting participants, describes the “recorder’s kit” for the respondents, and outlines the principles for adapting the ORD annotation and structure. The ORD collection now exceeds 1200 h of recordings, representing the speech of 127 respondents and hundreds of their interlocutors. So far, 2450 macro episodes of everyday spoken communication have been annotated, and the speech transcripts total 1 million words.

Acknowledgement

The research is supported by the Russian Science Foundation, project # 14-18-02070 “Everyday Russian Language in Different Social Groups”.

Author information

Corresponding author

Correspondence to Tatiana Sherstinova.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bogdanova-Beglarian, N. et al. (2016). Sociolinguistic Extension of the ORD Corpus of Russian Everyday Speech. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_80

  • DOI: https://doi.org/10.1007/978-3-319-43958-7_80

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43957-0

  • Online ISBN: 978-3-319-43958-7

  • eBook Packages: Computer Science, Computer Science (R0)
