Abstract
Noise arises from a variety of sources in the real world, producing many non-stationary noises and making it difficult to extract the target speech from noisy auditory signals. Recently, adversarial learning models have attracted attention for their high performance in noise control, but they are limited by their dependence on a one-to-one mapping between the noisy and target signals and by an unstable training process caused by the diverse distributions of noise. In this paper, we propose a novel deep learning model that learns the noise and target speech distributions simultaneously to improve noise cancellation performance. It is composed of two generators, which stabilize the training process, and two discriminators, which optimize the distributions of noise and target speech, respectively. Because two distributions from the same source are used simultaneously during adversarial learning, the model compresses the distribution over the latent space. For stable learning, one generator is pre-trained with a minimal set of samples and guides the other generator, preventing the mode collapse problem by exploiting prior knowledge. Experiments on a noisy speech dataset composed of 30 speakers and 90 types of noise are conducted with the scale-invariant source-to-noise ratio (SI-SNR) metric. The proposed model achieves an SI-SNR of 7.36, which is 2.13 times better than the state-of-the-art model. An additional experiment at noise levels of −10, −5, 0, 5, and 10 dB confirms the robustness of the proposed model.
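The abstract reports results in terms of SI-SNR, which measures how much of the enhanced signal lies along the clean target after discounting scale. As a point of reference only, the sketch below shows how SI-SNR is commonly computed; the function name `si_snr` and the toy signals are illustrative and not taken from the paper.

```python
import numpy as np

def si_snr(estimate: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Scale-invariant source-to-noise ratio (SI-SNR) in dB.

    Both signals are zero-mean normalized, the estimate is projected onto
    the target, and the ratio of target energy to residual energy is returned.
    """
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Scaled projection of the estimate onto the clean target.
    s_target = np.dot(estimate, target) * target / (np.dot(target, target) + eps)
    e_noise = estimate - s_target
    return 10 * np.log10((np.dot(s_target, s_target) + eps) /
                         (np.dot(e_noise, e_noise) + eps))

# Toy usage: a clean tone versus the same tone with additive noise.
t = np.linspace(0, 1, 16000)
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * np.random.randn(len(t))
print(f"SI-SNR of noisy input: {si_snr(noisy, clean):.2f} dB")
```

Higher values indicate better enhancement; under this metric the proposed model's score of 7.36 is the figure compared against the state-of-the-art baseline.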
Acknowledgement
This work was supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and grant funded by 2019 IT promotion fund (Development of AI based Precision Medicine Emergency System) of the Korean government (MSIT).
Cite this paper
Lim, K.-H., Kim, J.-Y., Cho, S.-B. (2020). Generative Adversarial Network with Guided Generator for Non-stationary Noise Cancelation. In: de la Cal, E.A., Villar Flecha, J.R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2020. Lecture Notes in Computer Science, vol 12344. Springer, Cham. https://doi.org/10.1007/978-3-030-61705-9_1