Abstract
This paper demonstrates the presence of speaker-specific suprasegmental information in the Linear Prediction (LP) residual signal, which is obtained by removing the predictable part of the speech signal. Adding this information to existing speaker recognition systems based on segmental and subsegmental features can yield a better-performing combined system. The speaker-specific suprasegmental information can not only be perceived by listening to the residual, but can also be seen as excitation peaks in the residual waveform. The challenge lies in capturing this information from the residual signal: standard signal processing and statistical techniques are not known to capture the higher-order correlations among samples of the residual. The Hilbert envelope of the residual is shown to further enhance the excitation peaks present in the residual signal. A speaker-specific pattern is also observed in the autocorrelation sequence of the Hilbert envelope, and further in the statistics of this autocorrelation sequence, which indicates the presence of speaker-specific suprasegmental information in the residual signal. In this work, no distinction is made between voiced and unvoiced sounds when extracting these features. A Support Vector Machine (SVM) is used to classify the patterns in the variance of the autocorrelation sequence for the speaker recognition task.
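The processing chain named in the abstract (LP residual, Hilbert envelope, autocorrelation, variance) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation. The function names, the LP order of 10, the 40-lag limit, and the reading of "variance of the autocorrelation sequence" as a per-lag variance across frames are all assumptions, since the abstract does not specify them. The SVM classification step is omitted.

```python
import cmath
import math


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a real sequence (no normalisation)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lp_coefficients(frame, order):
    """Levinson-Durbin recursion: a[1..order] such that s[n] ~ sum_k a[k] * s[n-k]."""
    r = autocorr(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:  # silent or perfectly predicted frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, order=10):
    """Prediction error e[n] = s[n] - sum_k a[k] * s[n-k]."""
    a = lp_coefficients(frame, order)
    return [frame[n] - sum(a[k] * frame[n - 1 - k] for k in range(min(order, n)))
            for n in range(len(frame))]


def dft(x, inverse=False):
    """Plain O(N^2) DFT, adequate for short illustrative frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * f * t / n) for t in range(n))
           for f in range(n)]
    return [v / n for v in out] if inverse else out


def hilbert_envelope(x):
    """Magnitude of the analytic signal of x."""
    n = len(x)
    spec = dft(x)
    for f in range(1, n):
        if f < (n + 1) // 2:
            spec[f] *= 2            # double positive frequencies
        elif not (n % 2 == 0 and f == n // 2):
            spec[f] = 0             # zero negative frequencies, keep Nyquist bin
    return [abs(v) for v in dft(spec, inverse=True)]


def envelope_autocorr(frame, order=10, max_lag=40):
    """Normalised autocorrelation of the Hilbert envelope of the LP residual."""
    env = hilbert_envelope(lp_residual(frame, order))
    r = autocorr(env, max_lag)
    return [v / r[0] for v in r] if r[0] > 0 else [0.0] * (max_lag + 1)


def lagwise_variance(frames, order=10, max_lag=40):
    """Assumed feature: variance of each autocorrelation lag across frames."""
    seqs = [envelope_autocorr(f, order, max_lag) for f in frames]
    means = [sum(col) / len(seqs) for col in zip(*seqs)]
    return [sum((v - m) ** 2 for v in col) / len(seqs)
            for col, m in zip(zip(*seqs), means)]
```

Under these assumptions, the vector returned by `lagwise_variance` for a set of frames from one utterance would serve as the input pattern for an SVM classifier. The quadratic-time DFT is used only to keep the sketch free of dependencies; an FFT routine would normally replace it.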
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bajpai, A., Pathangay, V. (2009). Text and Language-Independent Speaker Recognition Using Suprasegmental Features and Support Vector Machines. In: Ranka, S., et al. Contemporary Computing. IC3 2009. Communications in Computer and Information Science, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03547-0_29
DOI: https://doi.org/10.1007/978-3-642-03547-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03546-3
Online ISBN: 978-3-642-03547-0
eBook Packages: Computer Science (R0)