{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,12,5]],"date-time":"2024-12-05T05:15:30Z","timestamp":1733375730567,"version":"3.30.1"},"reference-count":20,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["KES"],"published-print":{"date-parts":[[2021,11,10]]},"abstract":"In Speech Enhancement (SE) techniques, the major challenging task is to suppress non-stationary noises including white noise in real-time application scenarios. Many techniques have been developed for enhancing the vocal signals; however, those were not effective for suppressing non-stationary noises very well. Also, those have high time and resource consumption. As a result, Sliding Window Empirical Mode Decomposition and Hurst (SWEMDH)-based SE method where the speech signal was decomposed into Intrinsic Mode Functions (IMFs) based on the sliding window and the noise factor in each IMF was chosen based on the Hurst exponent data. Also, the least corrupted IMFs were utilized to restore the vocal signal. However, this technique was not suitable for white noise scenarios. Therefore in this paper, a Variant of Variational Mode Decomposition (VVMD) with SWEMDH technique is proposed to reduce the complexity in real-time applications. The key objective of this proposed SWEMD-VVMDH technique is to decide the IMFs based on Hurst exponent and then apply the VVMD technique to suppress both low- and high-frequency noisy factors from the vocal signals. Originally, the noisy vocal signal is decomposed into many IMFs using SWEMDH technique. Then, Hurst exponent is computed to decide the IMFs with low-frequency noisy factors and Narrow-Band Components (NBC) is computed to decide the IMFs with high-frequency noisy factors. Moreover, VVMD is applied on the addition of all chosen IMF to remove both low- and high-frequency noisy factors. 
Thus, the speech signal quality is improved under non-stationary noises including additive white Gaussian noise. Finally, the experimental outcomes demonstrate the significant speech signal improvement under both non-stationary and white noise surroundings.","DOI":"10.3233\/kes-210072","type":"journal-article","created":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T17:26:29Z","timestamp":1637083589000},"page":"299-308","source":"Crossref","is-referenced-by-count":0,"title":["A variant of SWEMDH technique based on variational mode decomposition for speech enhancement"],"prefix":"10.1177","volume":"25","author":[{"given":"Poovarasan","family":"Selvaraj","sequence":"first","affiliation":[]},{"given":"E.","family":"Chandra","sequence":"additional","affiliation":[]}],"member":"179","reference":[{"key":"10.3233\/KES-210072_ref1","doi-asserted-by":"crossref","unstructured":"D.S. Kulkarni et al., A review of speech signal enhancement techniques, International Journal of Computer Applications 139(14) (2016).","DOI":"10.5120\/ijca2016909507"},{"issue":"2","key":"10.3233\/KES-210072_ref2","first-page":"527","article-title":"A unified approach to speech enhancement and voice activity detection","volume":"21","author":"Kasap","year":"2013","journal-title":"Turkish Journal of Electrical Engineering & Computer Sciences"},{"key":"10.3233\/KES-210072_ref3","doi-asserted-by":"crossref","unstructured":"Y. 
Zhang et al., A hierarchical framework approach for voice activity detection and speech enhancement, The Scientific World Journal, (2014).","DOI":"10.1155\/2014\/723643"},{"issue":"4","key":"10.3233\/KES-210072_ref4","doi-asserted-by":"crossref","first-page":"670","DOI":"10.1109\/TASLP.2015.2401426","article-title":"Robust estimation of non-stationary noise power spectrum for speech enhancement","volume":"23","author":"Mai","year":"2015","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"4","key":"10.3233\/KES-210072_ref5","doi-asserted-by":"crossref","first-page":"1080","DOI":"10.1016\/j.compeleceng.2013.12.007","article-title":"Speech enhancement method based on sparse reconstruction of power spectral density","volume":"40","author":"Zhao","year":"2014","journal-title":"Computers & Electrical Engineering"},{"issue":"3","key":"10.3233\/KES-210072_ref6","doi-asserted-by":"crossref","first-page":"EL228","DOI":"10.1121\/1.4977098","article-title":"Decision-directed speech power spectral density matrix estimation for multichannel speech enhancement","volume":"141","author":"Jin","year":"2017","journal-title":"The Journal of the Acoustical Society of America"},{"key":"10.3233\/KES-210072_ref7","first-page":"5039","article-title":"Time-frequency masking-based speech enhancement using generative adversarial network","author":"Soni","journal-title":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)"},{"key":"10.3233\/KES-210072_ref8","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.dsp.2014.06.006","article-title":"Detrended fluctuation thresholding for empirical mode decomposition based denoising","volume":"32","author":"Mert","year":"2014","journal-title":"Digital Signal Processing"},{"issue":"6","key":"10.3233\/KES-210072_ref9","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1109\/MSP.2013.2267931","article-title":"Empirical mode decomposition-based time-frequency 
analysis of multivariate signals: The power of adaptive data analysis","volume":"30","author":"Mandic","year":"2013","journal-title":"IEEE Signal Processing Magazine"},{"issue":"5","key":"10.3233\/KES-210072_ref10","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1109\/TASLP.2014.2312541","article-title":"Speech enhancement with EMD and hurst-based mode selection","volume":"22","author":"Zao","year":"2014","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"3","key":"10.3233\/KES-210072_ref11","first-page":"429","article-title":"Speech enhancement using sliding window empirical mode decomposition and hurst-based technique","volume":"44","author":"Poovarasan","year":"2019","journal-title":"Archives of Acoustics"},{"key":"10.3233\/KES-210072_ref12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.specom.2015.02.007","article-title":"Speech enhancement by noise driven adaptation of perceptual scales and thresholds of continuous wavelet transform coefficients","volume":"70","author":"Swami","year":"2015","journal-title":"Speech Communication"},{"issue":"1","key":"10.3233\/KES-210072_ref13","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/LSP.2015.2495102","article-title":"Speech enhancement with nonstationary acoustic noise detection in time domain","volume":"23","author":"Tavares","year":"2016","journal-title":"IEEE Signal Processing Letters"},{"issue":"1","key":"10.3233\/KES-210072_ref14","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1186\/s13636-017-0122-4","article-title":"Robust noise power spectral density estimation for binaural speech enhancement in time-varying diffuse noise field","volume":"2017","author":"Ji","year":"2017","journal-title":"EURASIP Journal on Audio, Speech, and Music Processing"},{"issue":"5","key":"10.3233\/KES-210072_ref15","doi-asserted-by":"crossref","first-page":"1912","DOI":"10.1007\/s00034-016-0384-6","article-title":"Sparse representations for 
single channel speech enhancement based on voiced\/unvoiced classification","volume":"36","author":"Messaoud","year":"2017","journal-title":"Circuits, Systems, and Signal Processing"},{"key":"10.3233\/KES-210072_ref16","unstructured":"O. Ghahabi et al., A robust voice activity detection for real-time automatic speech recognition, in: Proceedings of ESSV 2018, 2018."},{"key":"10.3233\/KES-210072_ref17","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/s13636-018-0135-7","article-title":"Enhancement of speech dynamics for voice activity detection using DNN","volume":"1","author":"Dwijayanti","year":"2018","journal-title":"EURASIP Journal on Audio, Speech, and Music Processing"},{"issue":"2","key":"10.3233\/KES-210072_ref18","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/s10772-018-9500-2","article-title":"Low rank sparse decomposition model based speech enhancement using gammatone filter bank and kullback\u2013leibler divergence","volume":"21","author":"Saleem","year":"2018","journal-title":"International Journal of Speech Technology"},{"key":"10.3233\/KES-210072_ref19","first-page":"93","article-title":"DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1\u20131.1","author":"Garofolo","year":"1993","journal-title":"NASA STI\/Recon Technical Report N"},{"issue":"3","key":"10.3233\/KES-210072_ref20","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/0167-6393(93)90095-3","article-title":"Assessment for automatic speech recognition: II. 
NOISEX-92: A Database and an experiment to study the effect of additive noise on speech recognition systems","volume":"12","author":"Varga","year":"1993","journal-title":"Speech Communication"}],"container-title":["International Journal of Knowledge-based and Intelligent Engineering Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/KES-210072","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,4]],"date-time":"2024-12-04T07:04:59Z","timestamp":1733295899000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/KES-210072"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,10]]},"references-count":20,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/kes-210072","relation":{},"ISSN":["1327-2314","1875-8827"],"issn-type":[{"type":"print","value":"1327-2314"},{"type":"electronic","value":"1875-8827"}],"subject":[],"published":{"date-parts":[[2021,11,10]]}}}