ISCA Archive Interspeech 2010

Brazilian Portuguese acoustic model training based on data borrowing from other language

Kazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura

This paper presents an acoustic modeling method for Portuguese speech recognizers. To improve the acoustic model, data from another language are used to offset the shortage of model training data. In this data-borrowing approach, we select the training data with consideration given to the influence of the other language. A simple solution is to minimize the volume of data borrowed. We developed a data selection strategy based on two principles: the Phonetic Frequency Principle and the Maximum Entropy Principle. Refining the acoustic model with this strategy improves word accuracy, especially for words that contain a low-frequency phoneme.
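
The abstract does not spell out how the two principles are applied, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes each candidate utterance from the other language is available as a list of phoneme labels already mapped to the target phoneme set, and it greedily borrows the utterances that most increase the entropy of the pooled phoneme distribution, stopping at a fixed budget so that little data is borrowed. The names `entropy`, `select_borrowed`, `target_counts`, `candidates`, and `budget` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a phoneme count distribution."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def select_borrowed(target_counts, candidates, budget):
    """Greedily pick at most `budget` candidate utterances.

    target_counts: phoneme -> count in the target-language training data.
    candidates: list of utterances, each a list of phoneme labels
                (assumed to be mapped to the target phoneme set).
    Returns the indices of the selected candidates, in selection order.
    This is an illustrative assumption, not the method from the paper.
    """
    pool = Counter(target_counts)
    remaining = list(range(len(candidates)))
    chosen = []
    while remaining and len(chosen) < budget:
        base = entropy(pool)
        best_i, best_gain = None, 0.0
        for i in remaining:
            gain = entropy(pool + Counter(candidates[i])) - base
            if gain > best_gain + 1e-12:
                best_i, best_gain = i, gain
        if best_i is None:
            # No remaining utterance raises the entropy, so stop borrowing.
            break
        pool.update(candidates[best_i])
        chosen.append(best_i)
        remaining.remove(best_i)
    return chosen


# Toy data: phoneme "b" is rare in the target-language data.
target_counts = {"a": 10, "b": 1}
candidates = [["a", "a"], ["b", "b"], ["b"]]
selected = select_borrowed(target_counts, candidates, budget=3)
```

In this toy setup, utterances rich in the rare phoneme are selected first and the utterance that contains only the already frequent phoneme is never borrowed, which is in the spirit of favoring low-frequency phonemes while keeping the borrowed volume small.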