
Language Models with RNNs for Rescoring Hypotheses of Russian ASR

  • Conference paper

Advances in Neural Networks – ISNN 2016 (ISNN 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9719)

Abstract

In this paper, we describe research on recurrent neural networks (RNNs) for language modeling in large-vocabulary continuous speech recognition of Russian. We experimented with recurrent neural networks with different numbers of units in the hidden layer. RNN-based and 3-gram language models (LMs) were trained on a text corpus of 350M words. The resulting RNN-based language models were used for N-best list rescoring in automatic continuous Russian speech recognition. We also tested linear interpolation of the RNN LMs with the baseline 3-gram LM and achieved a 14% relative reduction in word error rate (WER) with respect to the baseline 3-gram model.
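The rescoring step described in the abstract lends itself to a short illustration. Below is a minimal Python sketch of N-best list rescoring with a linearly interpolated language model, P(W) = λ·P_RNN(W) + (1 − λ)·P_3gram(W). All function names, score fields, and numeric values are hypothetical and only illustrate the general technique, not the authors' actual implementation.

```python
import math

def rescore_nbest(hypotheses, lam=0.5, lm_weight=10.0):
    """Re-rank an N-best list with an interpolated language model.

    Each hypothesis is a dict with hypothetical fields:
      'words'    : the recognized word sequence,
      'am_score' : first-pass acoustic log10-score,
      'rnn_lp'   : log10 probability under the RNN LM,
      'ngram_lp' : log10 probability under the baseline 3-gram LM.

    The two LMs are combined in the probability domain:
      P(W) = lam * P_rnn(W) + (1 - lam) * P_3gram(W)
    """
    best_words, best_score = None, -math.inf
    for hyp in hypotheses:
        # Interpolate the two LM probabilities, then return to log scale.
        p_lm = lam * 10.0 ** hyp['rnn_lp'] + (1.0 - lam) * 10.0 ** hyp['ngram_lp']
        score = hyp['am_score'] + lm_weight * math.log10(p_lm)
        if score > best_score:
            best_words, best_score = hyp['words'], score
    return best_words

# Toy example with made-up scores: the second hypothesis has the better
# acoustic score, but the interpolated LM strongly prefers the first one.
nbest = [
    {'words': 'hypothesis one', 'am_score': -120.0, 'rnn_lp': -4.1, 'ngram_lp': -4.8},
    {'words': 'hypothesis two', 'am_score': -119.8, 'rnn_lp': -7.3, 'ngram_lp': -6.9},
]
print(rescore_nbest(nbest))
```

In practice the interpolation weight λ and the LM scaling factor are tuned on a development set; the sketch applies interpolation to sentence-level probabilities for brevity, whereas it is commonly applied per word.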



Acknowledgments

This research is partially supported by the Council for Grants of the President of Russia (projects No. MK-5209.2015.8 and MD-3035.2015.8), by the Russian Foundation for Basic Research (projects No. 15-07-04415 and 15-07-04322), and by the Government of the Russian Federation (grant No. 074-U01).

Author information

Correspondence to Irina Kipyatkova.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kipyatkova, I., Karpov, A. (2016). Language Models with RNNs for Rescoring Hypotheses of Russian ASR. In: Cheng, L., Liu, Q., Ronzhin, A. (eds) Advances in Neural Networks – ISNN 2016. ISNN 2016. Lecture Notes in Computer Science, vol 9719. Springer, Cham. https://doi.org/10.1007/978-3-319-40663-3_48

  • DOI: https://doi.org/10.1007/978-3-319-40663-3_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40662-6

  • Online ISBN: 978-3-319-40663-3

  • eBook Packages: Computer Science, Computer Science (R0)
