Stable parallel training of Wasserstein conditional generative adversarial neural networks

The Journal of Supercomputing

Abstract

We propose a stable, parallel approach to train Wasserstein conditional generative adversarial neural networks (W-CGANs) under the constraint of a fixed computational budget. Unlike previous distributed GAN training techniques, our approach avoids inter-process communication, reduces the risk of mode collapse, and enhances scalability by using multiple generators, each concurrently trained on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on CIFAR10, CIFAR100, and ImageNet1k, three standard benchmark image datasets, maintaining the original resolution of the images in each dataset. Performance is assessed in terms of scalability and final accuracy within a fixed computational time and a fixed set of computational resources. Accuracy is measured with the inception score, the Fréchet inception distance, and image quality. Compared with previous results obtained by applying the parallel approach to deep convolutional conditional generative adversarial neural networks, we show an improvement in both inception score and Fréchet inception distance, as well as in the quality of the generated images. Weak scaling is attained on these datasets using up to 2000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
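To make the training scheme described above concrete, the sketch below shows how a single process could train its own Wasserstein GAN on images of one class only, so that conditioning is realized by the data partition itself and no inter-process communication is needed. This is a minimal, illustrative sketch, not the authors' implementation: the network architectures, the RMSprop settings, the use of weight clipping to enforce the Lipschitz constraint (rather than, say, a gradient penalty), and the loading of CIFAR10 through torchvision are all assumptions made for the example.

```python
# Minimal sketch of per-label W-GAN training (one process, one class).
# All hyperparameters and architectures below are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

Z_DIM, IMG_DIM = 100, 3 * 32 * 32          # CIFAR-sized images, flattened
N_CRITIC, CLIP = 5, 0.01                   # standard WGAN settings (assumed)

def make_generator():
    return nn.Sequential(nn.Linear(Z_DIM, 512), nn.ReLU(),
                         nn.Linear(512, IMG_DIM), nn.Tanh())

def make_critic():
    # No sigmoid: the critic outputs an unbounded score, as in WGAN.
    return nn.Sequential(nn.Flatten(), nn.Linear(IMG_DIM, 512),
                         nn.LeakyReLU(0.2), nn.Linear(512, 1))

def single_label_loader(label, batch_size=64):
    # Restrict the dataset to one class: this partition plays the role of conditioning.
    tfm = transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5,) * 3, (0.5,) * 3)])
    full = datasets.CIFAR10("./data", train=True, download=True, transform=tfm)
    idx = [i for i, y in enumerate(full.targets) if y == label]
    return DataLoader(Subset(full, idx), batch_size=batch_size, shuffle=True)

def train_one_label(label, epochs=1, device="cpu"):
    G, D = make_generator().to(device), make_critic().to(device)
    opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
    opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)
    for _ in range(epochs):
        for step, (real, _) in enumerate(single_label_loader(label)):
            real = real.to(device).view(real.size(0), -1)
            # Critic update: minimize E[D(fake)] - E[D(real)],
            # i.e., maximize the Wasserstein surrogate E[D(real)] - E[D(fake)].
            z = torch.randn(real.size(0), Z_DIM, device=device)
            fake = G(z).detach()
            loss_d = D(fake).mean() - D(real).mean()
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            for p in D.parameters():           # enforce the Lipschitz constraint
                p.data.clamp_(-CLIP, CLIP)     # via weight clipping (assumption)
            # Generator update every N_CRITIC critic steps.
            if step % N_CRITIC == 0:
                z = torch.randn(real.size(0), Z_DIM, device=device)
                loss_g = -D(G(z)).mean()
                opt_g.zero_grad(); loss_g.backward(); opt_g.step()

if __name__ == "__main__":
    # In a distributed run, each worker would call this with its own label,
    # e.g. derived from its MPI rank; here we simply train label 0.
    train_one_label(label=0, device="cuda" if torch.cuda.is_available() else "cpu")
```

Because each generator sees only a single label, no label embedding or conditional input is needed inside the networks; the per-label partition itself provides the conditioning, which is what removes the need for communication between the concurrently trained generators.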


Figures 1–9 (Fig. 1: image from https://sthalles.github.io/intro-to-gans/)

Data Availability Statement

The CIFAR10 [15], CIFAR100 [16], and ImageNet1k [17] datasets used in this study are open source and can be accessed through the details provided in the corresponding entries of the reference list.

References

  1. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ (eds) Advances in Neural Information Processing Systems, vol 27. Curran Associates, Inc., Palais des Congrés de Montréal, Montréal. https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf

  2. Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: Bengio Y, LeCun Y (eds) 4th International Conference on Learning Representations, ICLR 2016, Conference Track Proceedings, San Juan, Puerto Rico, May 2–4, 2016. http://arxiv.org/abs/1511.06434

  3. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training GANs. In: Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R (eds) Advances in Neural Information Processing Systems (NIPS'16), vol 29, Centre Convencions Internacional Barcelona, Barcelona, Spain. December 5-10, 2016. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2016/file/8a3363abe792db2d8761d6403605aeb7-Paper.pdf

  4. Bertsekas D (2021) Multiagent rollout algorithms and reinforcement learning. IEEE/CAA J Autom Sinica 8(2):249–272. https://doi.org/10.1109/JAS.2021.1003814. https://www.ieee-jas.net/en/article/doi/10.1109/JAS.2021.1003814

  5. Mertikopoulos P, Papadimitriou C, Piliouras G (2018) Cycles in adversarial regularized learning. In: Proceedings of the 2018 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp 2703–2717. https://doi.org/10.1137/1.9781611975031.172. https://epubs.siam.org/doi/abs/10.1137/1.9781611975031.172

  6. Hazan E, Singh K, Zhang C (2017) Learning linear dynamical systems via spectral filtering. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates, Inc., Long Beach Convention & Entertainment Center, Long Beach. https://proceedings.neurips.cc/paper/2017/file/165a59f7cf3b5c4396ba65953d679f17-Paper.pdf

  7. Lupo Pasini M, Gabbi V, Yin J, Perotto S, Laanait N (2021) Scalable balanced training of conditional generative adversarial neural networks on image data. J Supercomput. https://doi.org/10.1007/s11227-021-03808-2

  8. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv. https://doi.org/10.48550/ARXIV.1411.1784. arXiv:1411.1784

  9. Yang D, Hong S, Jang Y, Zhao T, Lee H (2019) Diversity-sensitive conditional generative adversarial networks. In: International Conference on Learning Representations. OpenReview.net, New Orleans, LA, USA

  10. Zhou P, Xie L, Zhang X, Ni B, Tian Q (2020) Searching towards class-aware generators for conditional generative adversarial networks. arXiv. https://doi.org/10.48550/ARXIV.2006.14208. arxiv:2006.14208

  11. Zhang H, Sindagi V, Patel VM (2020) Image de-raining using a conditional generative adversarial network. IEEE Trans Circuits Syst Video Technol. https://doi.org/10.1109/TCSVT.2019.2920407

  12. Miyato T, Koyama M (2018) cGANs with projection discriminator. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30–May 3, 2018, Conference Track Proceedings. OpenReview.net, Vancouver Convention Center, Vancouver, Canada. https://openreview.net/forum?id=ByS1VpgRZ

  13. Kavalerov I, Czaja W, Chellappa R (2019) cGANs with multi-hinge loss. arXiv. https://doi.org/10.48550/ARXIV.1912.04216. arXiv:1912.04216

  14. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17. Curran Associates Inc., Red Hook, pp 6629–6640

  15. Krizhevsky A, Nair V, Hinton G (2009) CIFAR-10 (Canadian Institute for Advanced Research). https://www.cs.toronto.edu/~kriz/cifar.html. Accessed 21 Aug 2021

  16. Krizhevsky A, Nair V, Hinton G (2009) CIFAR-100 (Canadian Institute for Advanced Research). https://www.cs.toronto.edu/~kriz/cifar.html. Accessed 21 Aug 2021

  17. ImageNet Large Scale Visual Recognition Challenge (ILSVRC). https://www.image-net.org/challenges/LSVRC/. Accessed 21 Aug 2021

  18. Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86

  19. Lin J (1991) Divergence measures based on the Shannon entropy. IEEE Trans Inf Theory 37(1):145–151. https://doi.org/10.1109/18.61115

  20. Belavkin RV (2018) Relation between the Kantorovich–Wasserstein metric and the Kullback–Leibler divergence. In: Ay N, Gibilisco P, Matúš F (eds) Information geometry and its applications. Springer, Cham, pp 363–373

  21. Scaman K, Virmaux A (2018) Lipschitz regularity of deep neural networks: analysis and efficient estimation. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18. Curran Associates Inc., Red Hook, pp 3839–3848

  22. Brock A, Donahue J, Simonyan K (2019) Large scale GAN training for high fidelity natural image synthesis. In: International Conference on Learning Representations. OpenReview.net, New Orleans, LA, USA. https://openreview.net/forum?id=B1xsqj09Fm. Accessed 27 July 2022

  23. Vlimant J-R, Pantaleo F, Pierini M, Loncar V, Vallecorsa S, Anderson D, Nguyen T, Zlokapa A (2019) Large-scale distributed training applied to generative adversarial networks for calorimeter simulation. In: EPJ Web of Conferences, vol 214, p 06025. https://doi.org/10.1051/epjconf/201921406025

  24. Dean J, Corrado G, Monga R, Chen K, Devin M, Mao M, Ranzato MA, Senior A, Tucker P, Yang K, Le Q, Ng A (2012) Large scale distributed deep networks. In: Pereira F, Burges CJ, Bottou L, Weinberger KQ (eds) Advances in Neural Information Processing Systems, vol 25. Curran Associates, Inc., Lake Tahoe. https://proceedings.neurips.cc/paper/2012/file/6aca97005c68f1206823815f66102863-Paper.pdf. Accessed 27 July 2022

  25. Liu M, Zhang W, Mroueh Y, Cui X, Ross J, Yang T, Das P (2020) A decentralized parallel algorithm for training generative adversarial nets. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6–12, 2020, Virtual. https://proceedings.neurips.cc/paper/2020/hash/7e0a0209b929d097bd3e8ef30567a5c1-Abstract.html. Accessed 27 July 2022

  26. Barratt S, Sharma R (2018) A note on the inception score. arXiv. https://doi.org/10.48550/ARXIV.1801.01973. arXiv:1801.01973

  27. Oak Ridge Leadership Facility (2018) Summit—Oak Ridge National Laboratory’s 200 petaflop supercomputer. https://www.olcf.ornl.gov/olcf-resources/compute-systems/summit/. Accessed 21 Aug 2021

  28. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. In: Wallach H, Larochelle H, Beygelzimer A, Alché-Buc F, Fox E, Garnett R (eds) Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates, Inc., Red Hook, pp 8026–8037

  29. Odena A, Olah C, Shlens J (2017) Conditional image synthesis with auxiliary classifier GANs. In: Precup D, Teh YW (eds) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol 70. PMLR, International Convention Centre, Sydney, pp 2642–2651. http://proceedings.mlr.press/v70/odena17a.html. Accessed 27 July 2022

  30. Wang M, Li H, Li F (2017) Generative adversarial network based on Resnet for conditional image restoration. arXiv. https://doi.org/10.48550/ARXIV.1707.04881. arXiv:1707.04881

  31. Zhang H, Goodfellow I, Metaxas D, Odena A (2019) Self-attention generative adversarial networks. In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol 97. PMLR, Long Beach Convention & Entertainment Center, Long Beach, pp 7354–7363

Acknowledgements

Massimiliano Lupo Pasini thanks Dr. Vladimir Protopopescu for his valuable feedback in the preparation of this manuscript. This work was supported in part by the Office of Science of the Department of Energy and by the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory. This research is sponsored by the Artificial Intelligence Initiative as part of the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under Contract no. DE-AC05-00OR22725.

Author information

Corresponding author

Correspondence to Massimiliano Lupo Pasini.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lupo Pasini, M., Yin, J. Stable parallel training of Wasserstein conditional generative adversarial neural networks. J Supercomput 79, 1856–1876 (2023). https://doi.org/10.1007/s11227-022-04721-y
