{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,4]],"date-time":"2023-10-04T15:10:30Z","timestamp":1696432230713},"reference-count":19,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2013,1,12]],"date-time":"2013-01-12T00:00:00Z","timestamp":1357948800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J AUDIO SPEECH MUSIC PROC."],"published-print":{"date-parts":[[2013,12]]},"abstract":"A lot of effort has been made in Computational Auditory Scene Analysis (CASA) to segregate target speech from monaural mixtures. Based on the principle of CASA, this article proposes an improved algorithm for monaural speech segregation. To extract the energy feature more accurately, the proposed algorithm improves the threshold selection for response energy in initial segmentation stage. Since the resulting mask map often contains broken auditory element groups after grouping stage, a smoothing stage is proposed based on morphological image processing. Through the combination of erosion and dilation operations, we suppress the intrusions by removing the unwanted particles and enhance the segregated speech by complementing the broken auditory elements. 
Systematic evaluation shows that the proposed segregation algorithm improves the output signal-to-noise ratio by an average of 8.55 dB and cuts the percentage of noise residue by an average of 25.36% compared with the mixture, yielding a significant improvement for speech segregation.","DOI":"10.1186\/1687-4722-2013-2","type":"journal-article","created":{"date-parts":[[2013,1,12]],"date-time":"2013-01-12T23:14:05Z","timestamp":1358032445000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Improved monaural speech segregation based on computational auditory scene analysis"],"prefix":"10.1186","volume":"2013","author":[{"given":"Wang","family":"Yu","sequence":"first","affiliation":[]},{"given":"Lin","family":"Jiajun","sequence":"additional","affiliation":[]},{"given":"Chen","family":"Ning","sequence":"additional","affiliation":[]},{"given":"Yuan","family":"Wenhao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2013,1,12]]},"reference":[{"key":"68_CR1","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/1486.001.0001","volume-title":"Auditory Scene Analysis","author":"A Bregman","year":"1990","unstructured":"Bregman A: Auditory Scene Analysis. MIT Press, Cambridge, MA; 1990."},{"key":"68_CR2","doi-asserted-by":"publisher","DOI":"10.1109\/9780470043387","volume-title":"Computational Auditory Scene Analysis: Principles, Algorithms and Applications","author":"D Wang","year":"2006","unstructured":"Wang D, Brown G: Computational Auditory Scene Analysis: Principles, Algorithms and Applications. IEEE Press, New Jersey; 2006."},{"key":"68_CR3","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1006\/csla.1994.1016","volume":"8","author":"G Brown","year":"1994","unstructured":"Brown G, Cooke M: Computational auditory scene analysis. Comput Speech Lang 1994, 8: 297-336. 
10.1006\/csla.1994.1016","journal-title":"Comput Speech Lang"},{"issue":"3","key":"68_CR4","doi-asserted-by":"publisher","first-page":"684","DOI":"10.1109\/72.761727","volume":"10","author":"D Wang","year":"1999","unstructured":"Wang D, Brown G: Separation of speech from interfering sounds based on oscillatory correlation. IEEE Trans. Neural Netw 1999, 10(3):684-697. 10.1109\/72.761727","journal-title":"IEEE Trans. Neural Netw"},{"issue":"5","key":"68_CR5","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1109\/TNN.2004.832812","volume":"15","author":"G Hu","year":"2004","unstructured":"Hu G, Wang D: Monaural speech segregation based on pitch tracking and amplitude modulation. IEEE Trans. Neural Netw 2004, 15(5):1135-1150. 10.1109\/TNN.2004.832812","journal-title":"IEEE Trans. Neural Netw"},{"key":"68_CR6","volume-title":"Topics in Acoustic Echo and Noise Control","year":"2006","unstructured":"Hansler E, Schmidt G (Eds): Topics in Acoustic Echo and Noise Control. Springer, New York; 2006."},{"issue":"8","key":"68_CR7","doi-asserted-by":"publisher","first-page":"2067","DOI":"10.1109\/TASL.2010.2041110","volume":"18","author":"G Hu","year":"2010","unstructured":"Hu G, Wang D: A tandem algorithm for pitch estimation and voiced speech segregation. IEEE Trans. Audio Speech Lang. Process 2010, 18(8):2067-2079.","journal-title":"IEEE Trans. Audio Speech Lang. Process"},{"issue":"2","key":"68_CR8","doi-asserted-by":"publisher","first-page":"396","DOI":"10.1109\/TASL.2006.881700","volume":"15","author":"G Hu","year":"2007","unstructured":"Hu G, Wang D: Auditory segmentation based on onset and offset analysis. IEEE Trans. Audio Speech Lang. Process 2007, 15(2):396-405.","journal-title":"IEEE Trans. Audio Speech Lang. Process"},{"key":"68_CR9","volume-title":"Monaural speech organization and segregation","author":"G Hu","year":"2006","unstructured":"Hu G: Monaural speech organization and segregation. 
The Ohio State University, PhD thesis; 2006."},{"key":"68_CR10","doi-asserted-by":"publisher","first-page":"1306","DOI":"10.1121\/1.2939132","volume":"124","author":"G Hu","year":"2008","unstructured":"Hu G, Wang D: Segregation of unvoiced speech from non-speech interference. J. Acoust. Soc. Am 2008, 124: 1306-1319. 10.1121\/1.2939132","journal-title":"J. Acoust. Soc. Am"},{"issue":"6","key":"68_CR11","doi-asserted-by":"publisher","first-page":"1600","DOI":"10.1109\/TASL.2010.2093893","volume":"19","author":"K Hu","year":"2011","unstructured":"Hu K, Wang D: Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction. IEEE Trans. Audio Speech Lang. Process 2011, 19(6):1600-1609.","journal-title":"IEEE Trans. Audio Speech Lang. Process"},{"key":"68_CR12","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.csl.2008.03.004","volume":"24","author":"Y Shao","year":"2010","unstructured":"Shao Y, Srinivasan S, Jin Z, Wang D: A computational auditory scene analysis system for speech segregation and robust speech recognition. Comput. Speech Lang 2010, 24: 77-93. 10.1016\/j.csl.2008.03.004","journal-title":"Comput. Speech Lang"},{"issue":"3","key":"68_CR13","doi-asserted-by":"publisher","first-page":"1056","DOI":"10.1121\/1.396050","volume":"83","author":"R Meddis","year":"1988","unstructured":"Meddis R, et al.: Simulation of auditory-neural transduction: further studies. J. Acoust. Soc. Am 1988, 83(3):1056-1063. 10.1121\/1.396050","journal-title":"J. Acoust. Soc. Am"},{"key":"68_CR14","volume-title":"Tandem algorithm for pitch estimation and voiced speech segregation","author":"D Wang","year":"2010","unstructured":"Wang D: Tandem algorithm for pitch estimation and voiced speech segregation. 2010. http:\/\/www.cse.ohio-state.edu\/pnl\/software.html, Accessed 23 September 2012"},{"key":"68_CR15","volume-title":"Unvoiced speech segregation","author":"D Wang","year":"2006","unstructured":"Wang D, Hu G: Unvoiced speech segregation. 
IEEE, Toulouse; 2006."},{"issue":"1","key":"68_CR16","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1109\/TSA.2005.854106","volume":"14","author":"Y Shao","year":"2006","unstructured":"Shao Y, Wang D: Model-based sequential organization in cochannel speech. IEEE Trans. Audio Speech Lang. Process 2006, 14(1):289-298.","journal-title":"IEEE Trans. Audio Speech Lang. Process"},{"key":"68_CR17","volume-title":"Digital image processing using MATLAB (Publishing House of Electronics Industry","author":"C Rafael","year":"2009","unstructured":"Rafael C, Richard E, Steven L: Digital image processing using MATLAB (Publishing House of Electronics Industry. Beijing; 2009."},{"issue":"1","key":"68_CR18","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1109\/TCE.2009.4814427","volume":"55","author":"Y Lee","year":"2009","unstructured":"Lee Y, Kwon O: Application of shape analysis techniques for improved CASA-based speech separation. IEEE Trans. Consum. Electron 2009, 55(1):146-149.","journal-title":"IEEE Trans. Consum. Electron"},{"key":"68_CR19","volume-title":"Lecture Notes in Computer Science","author":"R Pichevar","year":"2004","unstructured":"Pichevar R, Rouat J, A quantitative evaluation of a bio-inspired sound segregation technique for two-and three-source mixtures sounds: Lecture Notes in Computer Science. 
Springer, Berlin; 2004."}],"container-title":["EURASIP Journal on Audio, Speech, and Music Processing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1687-4722-2013-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1687-4722-2013-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1687-4722-2013-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T22:41:24Z","timestamp":1630536084000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmp-eurasipjournals.springeropen.com\/articles\/10.1186\/1687-4722-2013-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,1,12]]},"references-count":19,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,12]]}},"alternative-id":["68"],"URL":"https:\/\/doi.org\/10.1186\/1687-4722-2013-2","relation":{},"ISSN":["1687-4722"],"issn-type":[{"value":"1687-4722","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,1,12]]},"assertion":[{"value":"20 April 2012","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 November 2012","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 January 2013","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"2"}}