Abstract
This paper proposes a new mapping method that combines GMM and codebook mapping to transform the spectral envelope in a voice conversion system. After analyzing the over-smoothing problem of the GMM mapping method in detail, we propose converting the basic spectral envelope with the GMM method and converting the envelope-subtracted spectral details with GMM and phone-tied codebook mapping. Objective evaluations based on performance indices show that the proposed mapping method improves on the GMM mapping method by 27.2017% on average, and listening tests show that it effectively reduces the over-smoothing problem of the GMM method while avoiding the discontinuity problem of the codebook mapping method.
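The split described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation. It assumes the common posterior-weighted linear regression form of GMM conversion for the smooth envelope, and it replaces the paper's GMM and phone-tied codebook mapping of the details with a plain nearest-neighbour codebook lookup. It also assumes the converted envelope and details are recombined by addition. All function names, parameter names, and array shapes are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal distribution evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Posterior-weighted linear regression from source to target features.

    Averaging over components is what tends to smooth the converted spectrum.
    """
    likes = np.array([w * gaussian_pdf(x, m, c)
                      for w, m, c in zip(weights, mu_x, cov_xx)])
    post = likes / likes.sum()  # posterior probability of each component
    y = np.zeros_like(mu_y[0])
    for p, mx, my, cxx, cyx in zip(post, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y

def codebook_convert(d, src_codebook, tgt_codebook):
    """Return the target codeword paired with the nearest source codeword.

    Assumption: a plain nearest-neighbour lookup, not the phone-tied
    weighting used in the paper.
    """
    idx = int(np.argmin(np.linalg.norm(src_codebook - d, axis=1)))
    return tgt_codebook[idx]

def hybrid_convert(envelope, detail, gmm, src_codebook, tgt_codebook):
    """Convert envelope and detail separately, then add them back together
    (additive recombination is an assumption of this sketch)."""
    smooth = gmm_convert(envelope, **gmm)
    fine = codebook_convert(detail, src_codebook, tgt_codebook)
    return smooth + fine
```

Here `gmm` would be a dictionary holding `weights`, `mu_x`, `mu_y`, `cov_xx`, and `cov_yx` estimated from time-aligned source and target frames. The codebook lookup returns an actual target vector rather than an average, which is why it can restore detail that the regression smooths away.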
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kang, Y., Shuang, Z., Tao, J., Zhang, W., Xu, B. (2005). A Hybrid GMM and Codebook Mapping Method for Spectral Conversion. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_39
DOI: https://doi.org/10.1007/11573548_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29621-8
Online ISBN: 978-3-540-32273-3
eBook Packages: Computer Science (R0)