Abstract
This paper proposes a wavelet packet based speech enhancement system for noise reduction. The proposed method uses modulation channel selection as the thresholding function for de-noising. The signal is decomposed by a three-level wavelet packet transform into eight sub-bands, and every sub-band is passed to the thresholding function for noise suppression. The novel modulation channel selection is based on the true signal-to-noise ratio (SNR): channels are selected by thresholding the local SNR at −7 dB. The method is applied to noise suppression in single-channel speech. Its performance is evaluated with both objective and subjective measures and compared with spectral subtraction, mband, mmse, test-psc, idbm, klt, and pklt. The proposed method gives the highest intelligibility and quality among the compared methods. Simulations were carried out in MATLAB 7.14.
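The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the mother wavelet or the framing, so the sketch assumes a Haar wavelet, treats each whole sub-band as one channel instead of working frame by frame in the modulation domain, and assumes that channels whose true SNR exceeds −7 dB are retained. All function names are hypothetical, and Python is used here only for illustration (the paper reports MATLAB simulations).

```python
import math

SNR_THRESHOLD_DB = -7.0  # local SNR criterion quoted in the abstract


def haar_split(x):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def wp_decompose(x, levels=3):
    """Full wavelet packet tree: 2**levels sub-bands in filter-path order.

    len(x) must be a multiple of 2**levels.
    """
    bands = [list(x)]
    for _ in range(levels):
        nxt = []
        for b in bands:
            nxt.extend(haar_split(b))  # split approximation AND detail
        bands = nxt
    return bands


def wp_reconstruct(bands):
    """Merge sibling sub-bands pairwise until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]


def band_snr_db(clean_band, noise_band, eps=1e-12):
    """True SNR of one sub-band in dB; eps guards against log(0)."""
    ps = sum(c * c for c in clean_band)
    pn = sum(n * n for n in noise_band)
    return 10.0 * math.log10((ps + eps) / (pn + eps))


def enhance(clean, noise, threshold_db=SNR_THRESHOLD_DB):
    """Keep noisy sub-bands whose true SNR exceeds the threshold (assumed rule).

    The true SNR needs the clean and noise signals separately, so this is
    an ideal (oracle) selection, usable only in simulation.
    """
    noisy = [c + n for c, n in zip(clean, noise)]
    kept, out_bands = [], []
    for cb, nb, yb in zip(wp_decompose(clean), wp_decompose(noise),
                          wp_decompose(noisy)):
        keep = band_snr_db(cb, nb) > threshold_db
        kept.append(keep)
        out_bands.append(yb if keep else [0.0] * len(yb))
    return wp_reconstruct(out_bands), kept


# Toy usage: constant "speech" plus an alternating-sign "noise".
clean = [1.0] * 16
noise = [0.5, -0.5] * 8
enhanced, kept = enhance(clean, noise)
```

In this toy case the clean signal and the noise fall into different sub-bands, so zeroing the single low-SNR channel removes the noise. Real speech and noise overlap in most channels, which is why the selection would normally be applied to short frames, not to entire sub-bands.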








Cite this article
Singh, S., Tripathy, M. & Anand, R.S. A Wavelet Packet Based Approach for Speech Enhancement Using Modulation Channel Selection. Wireless Pers Commun 95, 4441–4456 (2017). https://doi.org/10.1007/s11277-017-4094-6
DOI: https://doi.org/10.1007/s11277-017-4094-6