Abstract
To translate a real-world photograph into an artistic image in the style of a famous artist, the colors and brushstrokes of the output should reflect those of that artist. CycleGAN, a one-to-one domain translation architecture trained on an unpaired dataset, can perform such a translation. Its disadvantage, however, is that translating images into N artistic styles requires training N separate CycleGANs, one per style. Here, we develop a single deep learning architecture that can be steered toward multiple artistic styles by adding a conditional vector. The overall work comprises a one-to-N domain translation architecture, a conditional CycleGAN, and an N-to-N domain translation architecture, StarGAN, each trained to translate photographs into five different artistic styles. An evaluation of the trained models reveals that multiple artistic styles can be produced from a single real-world photograph simply by adjusting the conditional input.
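Both architectures hinge on conditioning a single generator on a style vector. The PyTorch sketch below is only a minimal illustration of that conditioning idea under a common assumption (a one-hot style code broadcast and concatenated as extra input channels, as in typical conditional-CycleGAN and StarGAN variants); the class and variable names are hypothetical, and the network is far smaller than the ones trained in the paper.

```python
import torch
import torch.nn as nn

NUM_STYLES = 5  # the paper translates photographs into five artistic styles


class ConditionalGenerator(nn.Module):
    """Toy generator conditioned on a style vector.

    The one-hot style vector is broadcast to spatial maps and
    concatenated with the RGB input channels. The two-layer conv
    stack is a stand-in for the encoder/residual/decoder networks
    used in practice.
    """

    def __init__(self, num_styles: int = NUM_STYLES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_styles, 64, kernel_size=7, padding=3),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # image outputs scaled to [-1, 1]
        )

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # x: (B, 3, H, W); style: (B, num_styles) one-hot
        b, _, h, w = x.shape
        style_map = style.view(b, -1, 1, 1).expand(-1, -1, h, w)
        return self.net(torch.cat([x, style_map], dim=1))


# One generator, N styles: changing the one-hot vector changes the style.
photo = torch.randn(1, 3, 256, 256)        # stand-in for a real photograph
generator = ConditionalGenerator()
for k in range(NUM_STYLES):
    cond = torch.eye(NUM_STYLES)[k:k + 1]  # (1, NUM_STYLES) one-hot for style k
    stylized = generator(photo, cond)      # (1, 3, 256, 256)
```

Because the style code enters only as extra input channels, a single set of generator weights serves all N styles, and the conditional input alone selects the target style at inference time.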
Funding
This study has received no funding.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Appendix
Visualized Results with Testing Images

[Figures omitted.]
Visualized Results with Author’s Photos

[Figures omitted.]
About this article
Cite this article
Komatsu, R., Gonsalves, T. Translation of Real-World Photographs into Artistic Images via Conditional CycleGAN and StarGAN. SN COMPUT. SCI. 2, 489 (2021). https://doi.org/10.1007/s42979-021-00884-2