Abstract
In this paper (This is a revised and extended version of the article A Comparison of Polish Taggers in the Application for Automatic Speech Recognition that appeared in the Proceedings of Language and Tools Conference, Poznan, 2013.) we investigate the performance of Polish taggers in the context of automatic speech recognition (ASR). We use a morphosyntactic language model to improve speech recognition in an ASR system and seek the best Polish tagger for our needs. Polish is an inflectional language and an n-gram model using morphosyntactic features, which reduces data sparsity seems to be a good choice. We investigate the difference between the morphosyntactic taggers in that context. We compare the results of tagging with respect to the reduction of word error rate as well as speed of tagging. As it turns out at present the taggers using conditional random fields (CRF) models perform the best in the context of ASR. A broader audience might be also interested in the other discussed features of the taggers such as easiness of installation and usage, which are usually not covered in the papers describing such systems.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
We use the terms part-of-speech and grammatical class interchangeably in this document, due to the way they are used in the literature regarding Polish tagsets and taggers.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
We have not included the results for WMBT since it was impossible to obtain its results when these tests were performed. Moreover its behaviour was the worst in all the other tests, so we have not expected to see any improvement.
References
Acedański, S.: A morphosyntactic brill tagger for inflectional languages. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds.) IceTAL 2010. LNCS, vol. 6233, pp. 3–14. Springer, Heidelberg (2010)
Brants, T.: TnT: a statistical part-of-speech tagger. In: Proceedings of the Sixth Conference on Applied Natural Language Processing, pp. 224–231. Association for Computational Linguistics (2000)
Brill, E.: A simple rule-based part of speech tagger. In: Proceedings of the Workshop on Speech and Natural Language, pp. 112–116. Association for Computational Linguistics (1992)
Brown, P.F., Desouza, P.V., Mercer, R.L., Pietra, V.J.D., Lai, J.C.: Class-based n-gram models of natural language. Comput. Linguist. 18(4), 467–479 (1992)
Daelemans, W., Zavrel, J., van der Sloot, K., van den Bosch, A.: TiMBL: Tilburg Memory-Based Learner (2010)
Daelemans, W., Van den Bosch, A.: Memory-Based Language Processing. Cambridge University Press, New York (2005)
Gauvain, J.L., Lee, C.H.: Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. Speech Audio Process. 2(2), 291–298 (1994)
Kneser, R., Ney, H.: Improved backing-off for m-gram language modeling. In: International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 181–184. IEEE (1995)
Lafferty, J., McCallum, A., Pereira, F.C.: Conditional random fields: probabilistic models for segmenting and labeling sequence data (2001)
Marciniak, M.: Anotowany korpus dialogów telefonicznych. Akademicka Oficyna Wydawnicza EXIT, Warszawa (2011)
Mohri, M., Pereira, F., Riley, M.: Weighted finite-state transducers in speech recognition. Comput. Speech Lang. 16(1), 69–88 (2002)
Piasecki, M.: Polish tagger TaKIPI: Rule based construction and optimisation. Task Q. 11(1–2), 151–167 (2007)
Pohl, A., Ziółko, B.: A comparison of polish taggers in the application for automatic speech recognition. In: Proceedings of the 6th Language & Technology Conference, pp. 294–298 (2013)
Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlíček, P., Qian, Y., Schwarz, P., et al.: The kaldi speech recognition toolkit. In: Proceedings of Automatic Speech Recognition and Understanding (2011)
Przepiórkowski, A., Bańko, M., Górski, R.L., Lewandowska-Tomaszczyk, B.: Narodowy Korpus Jȩzyka Polskiego. Wydawnictwo Naukowe PWN, Warsaw (2012)
Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–286 (1989)
Radziszewski, A., Śniatowski, T.: A memory-based tagger for Polish. In: Proceedings of the 5th Language & Technology Conference, Poznań, pp. 29–36 (2011)
Radziszewski, A.: A tiered CRF tagger for Polish. In: Bembenik, R., Skonieczny, Ł., Rybiński, H., Kryszkiewicz, M., Niezgódka, M. (eds.) Intelligent Tools for Building a Scientific Information Platform. SCI, vol. 467, pp. 215–230. Springer, Heidelberg (2013)
Radziszewski, A., Wardyński, A., Śniatowski, T.: WCCL: a morpho-syntactic feature toolkit. In: Habernal, I., Matoušek, V. (eds.) TSD 2011. LNCS, vol. 6836, pp. 434–441. Springer, Heidelberg (2011)
Stolcke, A.: SRILM-an extensible language modeling toolkit. In: Proceedings of the International Conference on Spoken Language Processing, vol. 2, pp. 901–904 (2002)
Sutton, C., McCallum, A.: An introduction to conditional random fields for relational learning. In: Introduction to Statistical Relational Learning, pp. 93–128 (2006)
Tufis, D.: Tiered tagging and combined language models classifiers. In: Matoušek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds.) TSD 1999. LNCS (LNAI), vol. 1692, pp. 28–33. Springer, Heidelberg (1999)
Vu, N.T., Kraus, F., Schultz, T.: Multilingual a-stabil: a new confidence score for multilingual unsupervised training. In: 2010 IEEE Spoken Language Technology Workshop (SLT), pp. 183–188. IEEE (2010)
Waszczuk, J.: Harnessing the CRF complexity with domain-specific constraints. The case of morphosyntactic tagging of a highly inflected language. In: Kay, M., Boitet, C. (eds.) Proceedings of COLING, pp. 2789–2804 (2012)
Witten, I., Bell, T.: The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Trans. Inf. Theory 37(4), 1085–1094 (1991)
Woliński, M.: Morfeusz—a practical tool for the morphological analysis of Polish. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds.) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol. 35, pp. 511–520. Springer, Heidelberg (2003)
Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: HTK Book. Cambridge University Engineering Department, UK (2005)
Żelasko, P., Ziółko, B., Jadczyk, T., Skurzok, D.: AGH corpus of Polish speech. In: Language Resources and Evaluation, pp. 1–17 (2015)
Ziółko, B., Ziółko, M.: Przetwarzanie mowy. Wydawnictwo AGH, Kraków (2011)
Acknowledgement
This work was supported by LIDER/37/69/L-3/11/NCBR/2012 grant.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Smywiński-Pohl, A., Ziółko, B. (2016). A Revised Comparison of Polish Taggers in the Application for Automatic Speech Recognition. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science(), vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_6
Download citation
DOI: https://doi.org/10.1007/978-3-319-43808-5_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43807-8
Online ISBN: 978-3-319-43808-5
eBook Packages: Computer ScienceComputer Science (R0)