Abstract
This paper investigates building and fusing multiple phone recognizers in a phonotactic system for language recognition. The phone recognizers are built using both phonetic and acoustic diversification. Phonetic diversification is achieved by training multiple phone recognizers on speech corpora of different languages, while acoustic diversification is implemented in several ways, including different acoustic features, phone modeling techniques, and training paradigms. Because some phone recognizers are highly correlated with each other, we propose a performance optimization (PO) criterion to select a set of complementary phone recognizers for fusion. Experimental results on the NIST 2007 Language Recognition Evaluation (LRE) 30-s test set show the effectiveness of the proposed approach.
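The abstract does not define the PO criterion, so the following is only an illustrative sketch of the general idea of selecting complementary recognizers by their fused performance. It assumes greedy forward selection, equal-weight score averaging as the fusion rule, and plain classification error on a development set as the performance measure; all three are assumptions made here, not the authors' method, and the names `fuse`, `error_rate`, and `greedy_select` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: recognizer name -> one row of per-language scores per utterance.
Scores = Dict[str, List[List[float]]]


def fuse(scores: Scores, chosen: Sequence[str]) -> List[List[float]]:
    """Equal-weight average of the chosen recognizers' scores (assumed fusion rule)."""
    n_utt = len(scores[chosen[0]])
    n_lang = len(scores[chosen[0]][0])
    return [
        [sum(scores[name][u][k] for name in chosen) / len(chosen) for k in range(n_lang)]
        for u in range(n_utt)
    ]


def error_rate(fused: List[List[float]], labels: Sequence[int]) -> float:
    """Fraction of utterances whose top-scoring language differs from the label."""
    wrong = 0
    for row, label in zip(fused, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        wrong += predicted != label
    return wrong / len(labels)


def greedy_select(scores: Scores, labels: Sequence[int], max_size: int) -> Tuple[List[str], float]:
    """Add, one at a time, the recognizer that most lowers the fused error.

    Stops when no remaining recognizer improves the fused error, so a
    recognizer that merely duplicates an already chosen one is left out.
    """
    chosen: List[str] = []
    remaining = sorted(scores)
    best_err = float("inf")
    while remaining and len(chosen) < max_size:
        cand_err, cand = min(
            (error_rate(fuse(scores, chosen + [name]), labels), name) for name in remaining
        )
        if cand_err >= best_err:
            break
        chosen.append(cand)
        remaining.remove(cand)
        best_err = cand_err
    return chosen, best_err


# Toy development set (made-up numbers): A and B make the same mistake, C a different one.
toy_scores: Scores = {
    "A": [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]],
    "B": [[0.8, 0.2], [0.3, 0.7], [0.45, 0.55], [0.2, 0.8]],
    "C": [[0.6, 0.4], [0.55, 0.45], [0.9, 0.1], [0.4, 0.6]],
}
toy_labels = [0, 1, 0, 1]
selected, dev_error = greedy_select(toy_scores, toy_labels, max_size=3)
```

On this toy data the selection keeps A and C and skips B, because B repeats A's error and adds nothing to the fused result. A real system would more likely optimize a detection metric such as equal error rate and use trained fusion weights; classification error and averaging are stand-ins chosen to keep the sketch short.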
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Deng, Y., Zhang, W., Qian, Y., Liu, J. (2010). Integration of Complementary Phone Recognizers for Phonotactic Language Recognition. In: Zhu, R., Zhang, Y., Liu, B., Liu, C. (eds) Information Computing and Applications. ICICA 2010. Lecture Notes in Computer Science, vol 6377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16167-4_31
DOI: https://doi.org/10.1007/978-3-642-16167-4_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16166-7
Online ISBN: 978-3-642-16167-4
eBook Packages: Computer Science, Computer Science (R0)