Translation of Real-World Photographs into Artistic Images via Conditional CycleGAN and StarGAN

  • Original Research
  • Published in: SN Computer Science

Abstract

To translate a real-world photograph into an artistic image in the style of a famous artist, the selection of colors and brushstrokes should reflect those of the artist. CycleGAN, a one-to-one domain translation architecture trained on an unpaired dataset, can translate a real-world photograph into an artistic image. However, translating images into N different artistic styles with this approach requires training N separate CycleGANs, one per style. Here, we develop a single deep learning architecture that can be steered toward multiple artistic styles by adding a conditional vector. The overall architecture comprises a one-to-N domain translation architecture, namely a conditional CycleGAN, and an N-to-N domain translation architecture, namely StarGAN, for translation into five different artistic styles. An evaluation of the trained models reveals that multiple artistic styles can be produced from a single real-world photograph simply by adjusting the conditional input.
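
As a minimal sketch of the conditioning mechanism described in the abstract, the PyTorch-style snippet below shows one common way to steer a single generator with a one-hot style vector: the vector is broadcast to a per-pixel map and concatenated with the input image along the channel axis, as in StarGAN-style conditioning. All class names, layer sizes, and the style indexing are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

NUM_STYLES = 5  # five artistic styles, as in the paper

class ConditionalGenerator(nn.Module):
    """Toy generator that consumes an RGB image plus a style condition."""

    def __init__(self, num_styles: int = NUM_STYLES):
        super().__init__()
        # 3 RGB channels + num_styles one-hot condition channels.
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_styles, 64, kernel_size=7, padding=3),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True),
            # A real generator would add downsampling, residual blocks,
            # and upsampling here; omitted to keep the sketch short.
            nn.Conv2d(64, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # outputs in [-1, 1], the usual GAN image range
        )

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # Broadcast the (B, num_styles) one-hot vector to a per-pixel
        # condition map and concatenate it along the channel axis.
        b, _, h, w = x.shape
        cond = style.view(b, -1, 1, 1).expand(b, style.size(1), h, w)
        return self.net(torch.cat([x, cond], dim=1))

# One set of weights, many styles: only the condition changes.
gen = ConditionalGenerator()
photo = torch.randn(1, 3, 256, 256)  # stand-in for a photograph
style = torch.zeros(1, NUM_STYLES)
style[0, 2] = 1.0                    # hypothetical style index
stylized = gen(photo, style)         # -> (1, 3, 256, 256)
```

Concatenating a broadcast label map is the conditioning scheme used by the StarGAN generator; a conditional CycleGAN may inject the vector differently (for example, into intermediate features), so this should be read as a sketch of the general idea rather than a reproduction of either network.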

Funding

This study has received no funding.

Author information

Corresponding author

Correspondence to Tad Gonsalves.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Visualized Results with Testing Images

[Figures c–f]

Visualized Results with Author’s Photos

[Figures g–j]

About this article

Cite this article

Komatsu, R., Gonsalves, T. Translation of Real-World Photographs into Artistic Images via Conditional CycleGAN and StarGAN. SN COMPUT. SCI. 2, 489 (2021). https://doi.org/10.1007/s42979-021-00884-2
