Abstract
We propose a stable, parallel approach to train Wasserstein conditional generative adversarial neural networks (W-CGANs) under the constraint of a fixed computational budget. Unlike previous distributed GAN training techniques, our approach avoids inter-process communication, reduces the risk of mode collapse, and enhances scalability by using multiple generators, each concurrently trained on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on CIFAR10, CIFAR100, and ImageNet1k, three standard benchmark image datasets, maintaining the original resolution of the images for each dataset. Performance is assessed in terms of scalability and final accuracy under a fixed computational time and fixed computational resources. To measure accuracy, we use the inception score, the Fréchet inception distance, and image quality. Compared with previous results obtained by applying the parallel approach to deep convolutional conditional generative adversarial neural networks, we show an improvement in inception score and Fréchet inception distance, as well as in the quality of the generated images. Weak scaling is attained on all three datasets using up to 2000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
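To make the abstract's central idea concrete, the sketch below shows one way the communication-free scheme could look in PyTorch: each worker process owns an independent Wasserstein GAN and trains it only on the images of a single class label, so no gradients or parameters are ever exchanged between workers. This is a minimal illustration under assumptions of our own, not the authors' implementation: the tiny MLP models, the weight-clipped critic with RMSprop (the original WGAN recipe), the RANK environment variable used to pick the label, and the synthetic data loader are all illustrative placeholders.

```python
# Minimal sketch (NOT the authors' code): one process per class label, each
# training an independent Wasserstein GAN on that label's images only.
import itertools
import os

import torch
import torch.nn as nn

LATENT, IMG = 100, 3 * 32 * 32  # latent size and flattened CIFAR-sized image


class Generator(nn.Module):  # stand-in for a DCGAN-style generator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT, 512), nn.ReLU(),
                                 nn.Linear(512, IMG), nn.Tanh())

    def forward(self, z):
        return self.net(z)


class Critic(nn.Module):  # Wasserstein critic: unbounded scalar output
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(IMG, 512), nn.LeakyReLU(0.2),
                                 nn.Linear(512, 1))

    def forward(self, x):
        return self.net(x)


def train_single_label(loader, steps=50, n_critic=5, clip=0.01):
    """Train an unconditional WGAN on one label's data shard.

    No inter-process communication is needed, so each worker runs
    independently and the scheme scales out trivially.
    """
    G, D = Generator(), Critic()
    opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
    opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)
    batches = iter(loader)
    for _ in range(steps):
        for _ in range(n_critic):  # several critic updates per generator step
            real = next(batches)
            z = torch.randn(real.size(0), LATENT)
            # Minimizing D(fake) - D(real) maximizes the Wasserstein estimate.
            loss_d = D(G(z).detach()).mean() - D(real).mean()
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            for p in D.parameters():  # weight clipping enforces Lipschitz bound
                p.data.clamp_(-clip, clip)
        z = torch.randn(64, LATENT)
        loss_g = -D(G(z)).mean()  # generator pushes critic scores up
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return G


if __name__ == "__main__":
    # Hypothetical launch: one process per label, label taken from the rank.
    label = int(os.environ.get("RANK", "0"))
    # In a real run, the loader would yield only images of class `label`;
    # here we cycle random tensors so the sketch runs end to end.
    data = [torch.randn(64, IMG) for _ in range(8)]
    train_single_label(itertools.cycle(data), steps=50)
```

Because every worker holds its own generator-critic pair and its own label's data shard, the workload is embarrassingly parallel; the conditional model is recovered at sampling time simply by querying the generator that was trained on the requested label.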
Acknowledgements
Massimiliano Lupo Pasini thanks Dr. Vladimir Protopopescu for his valuable feedback in the preparation of this manuscript. This work was supported in part by the Office of Science of the Department of Energy and by the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory. This research is sponsored by the Artificial Intelligence Initiative as part of the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under Contract no. DE-AC05-00OR22725.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lupo Pasini, M., Yin, J. Stable parallel training of Wasserstein conditional generative adversarial neural networks. J Supercomput 79, 1856–1876 (2023). https://doi.org/10.1007/s11227-022-04721-y