Abstract
The sinusoidal model has proven useful for the representation and modification of speech and audio signals. One drawback, however, is that a sinusoidal model is typically derived using a fixed analysis frame size, which cannot guarantee optimal spectral resolution for each sinusoidal parameter. In this paper, we propose a sinusoidal model that uses wavelet packet analysis to obtain better frequency resolution at low frequencies and better time resolution at high frequencies, and thereby to estimate the sinusoidal parameters more accurately. Experiments show that the proposed model achieves better performance than the conventional model.
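The resolution trade-off behind a fixed analysis frame can be illustrated with a little arithmetic: a frame of N samples at sampling rate fs gives a frequency bin spacing of fs/N and spans N/fs seconds. The sketch below is not the method of the paper. It assumes a hypothetical octave-style band split (one possible wavelet packet tree) in which the frame length doubles for each lower band. The function names, the 8 kHz sampling rate, and the frame lengths are illustrative assumptions.

```python
def frame_resolution(sample_rate_hz, frame_len):
    """Return (bin spacing in Hz, frame duration in ms) for one analysis frame."""
    bin_hz = sample_rate_hz / frame_len
    frame_ms = 1000.0 * frame_len / sample_rate_hz
    return bin_hz, frame_ms


def octave_band_frames(sample_rate_hz, n_bands, shortest_frame):
    """Hypothetical layout: split 0..fs/2 into octave bands, highest band first.

    The highest band gets the shortest frame (best time resolution); each
    lower band doubles the frame length (finer frequency resolution).
    """
    bands = []
    hi = sample_rate_hz / 2.0  # Nyquist frequency
    for i in range(n_bands):
        # The lowest band extends down to 0 Hz; the others cover one octave.
        lo = 0.0 if i == n_bands - 1 else hi / 2.0
        frame_len = shortest_frame * 2 ** i
        bin_hz, frame_ms = frame_resolution(sample_rate_hz, frame_len)
        bands.append({
            "lo_hz": lo,
            "hi_hz": hi,
            "frame_len": frame_len,
            "bin_hz": bin_hz,
            "frame_ms": frame_ms,
        })
        hi = lo
    return bands


if __name__ == "__main__":
    # Assumed example values: 8 kHz speech, three bands, 64-sample shortest frame.
    for band in octave_band_frames(8000, 3, 64):
        print(band)
```

Under these assumed values, the band nearest 0 Hz uses a 256-sample frame (finer bin spacing, longer duration), while the band nearest the Nyquist frequency uses a 64-sample frame (coarser bin spacing, shorter duration). A single fixed frame size would have to pick one point on that trade-off for all frequencies.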
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, K., Hong, J., Lim, J. (2005). Sinusoidal Modeling Using Wavelet Packet Transform Applied to the Analysis and Synthesis of Speech Signals. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_31
DOI: https://doi.org/10.1007/11551874_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0
eBook Packages: Computer Science, Computer Science (R0)