Abstract
We propose a stable, parallel approach to train Wasserstein conditional generative adversarial neural networks (W-CGANs) under the constraint of a fixed computational budget. Unlike previous distributed GAN training techniques, our approach avoids inter-process communication, reduces the risk of mode collapse, and enhances scalability by using multiple generators, each concurrently trained on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on CIFAR10, CIFAR100, and ImageNet1k, three standard benchmark image datasets, maintaining the original resolution of the images for each dataset. Performance is assessed in terms of scalability and final accuracy under a fixed computational time and fixed computational resources. To measure accuracy, we use the inception score, the Fréchet inception distance, and image quality. Compared with previous results obtained by applying the parallel approach to deep convolutional conditional generative adversarial neural networks, we show an improvement in inception score and Fréchet inception distance, as well as in the quality of the generated images. Weak scaling is attained on all three datasets using up to 2000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
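To make the abstract's central idea concrete, the sketch below shows one way the communication-free scheme could look in PyTorch: each worker process owns an independent Wasserstein GAN and trains it only on the images of a single class label, so no gradients or parameters are ever exchanged between workers. This is a minimal illustration under assumptions of our own, not the authors' implementation: the tiny MLP models, the weight-clipped critic with RMSprop (the original WGAN recipe), the RANK environment variable used to pick the label, and the synthetic data loader are all illustrative placeholders.

```python
# Minimal sketch (NOT the authors' code): one process per class label, each
# training an independent Wasserstein GAN on that label's images only.
import itertools
import os

import torch
import torch.nn as nn

LATENT, IMG = 100, 3 * 32 * 32  # latent size and flattened CIFAR-sized image


class Generator(nn.Module):  # stand-in for a DCGAN-style generator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT, 512), nn.ReLU(),
                                 nn.Linear(512, IMG), nn.Tanh())

    def forward(self, z):
        return self.net(z)


class Critic(nn.Module):  # Wasserstein critic: unbounded scalar output
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(IMG, 512), nn.LeakyReLU(0.2),
                                 nn.Linear(512, 1))

    def forward(self, x):
        return self.net(x)


def train_single_label(loader, steps=50, n_critic=5, clip=0.01):
    """Train an unconditional WGAN on one label's data shard.

    No inter-process communication is needed, so each worker runs
    independently and the scheme scales out trivially.
    """
    G, D = Generator(), Critic()
    opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
    opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)
    batches = iter(loader)
    for _ in range(steps):
        for _ in range(n_critic):  # several critic updates per generator step
            real = next(batches)
            z = torch.randn(real.size(0), LATENT)
            # Minimizing D(fake) - D(real) maximizes the Wasserstein estimate.
            loss_d = D(G(z).detach()).mean() - D(real).mean()
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            for p in D.parameters():  # weight clipping enforces Lipschitz bound
                p.data.clamp_(-clip, clip)
        z = torch.randn(64, LATENT)
        loss_g = -D(G(z)).mean()  # generator pushes critic scores up
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G


if __name__ == "__main__":
    # Hypothetical launch: one process per label, label taken from the rank.
    label = int(os.environ.get("RANK", "0"))
    # In a real run, the loader would yield only images of class `label`;
    # here we cycle random tensors so the sketch runs end to end.
    data = [torch.randn(64, IMG) for _ in range(8)]
    train_single_label(itertools.cycle(data), steps=50)
```

Because every worker holds its own generator-critic pair and its own label's data shard, the workload is embarrassingly parallel; the conditional model is recovered at sampling time simply by querying the generator that was trained on the requested label.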
Acknowledgements
Massimiliano Lupo Pasini thanks Dr. Vladimir Protopopescu for his valuable feedback in the preparation of this manuscript. This work was supported in part by the Office of Science of the Department of Energy and by the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory. This research is sponsored by the Artificial Intelligence Initiative as part of the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under Contract no. DE-AC05-00OR22725.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lupo Pasini, M., Yin, J. Stable parallel training of Wasserstein conditional generative adversarial neural networks. J Supercomput 79, 1856–1876 (2023). https://doi.org/10.1007/s11227-022-04721-y