Abstract
Despite many encouraging reports, existing image colorization algorithms are still prone to unnatural visual distortions. We observe that these distortions are mainly introduced in the deconvolutional modules of existing generative models. Furthermore, existing algorithms have heavy structures, which hinders their deployment on edge devices. In this paper, we propose ISP-GAN, a novel lightweight generative adversarial network with inception sub-pixel deconvolution, aimed at improving the performance of image colorization. In the ISP-GAN generator, we introduce a novel inception sub-pixel deconvolutional block (ISP), along with a modified residual convolutional block (MRC), to avoid representational bottlenecks and consequently expand perceptual fields. For the ISP-GAN discriminator, we apply deep-learning-based steganalytic networks to improve the training efficiency of the whole framework and consequently enhance the performance of the corresponding generator. ISP-GAN has a lightweight structure, and experimental results on benchmark datasets show that it achieves state-of-the-art performance in the image colorization task.
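To make the term "sub-pixel deconvolution" concrete, the following is a minimal sketch of the generic depth-to-space rearrangement that sub-pixel upsampling layers rely on. It is not the ISP block itself, whose structure is not specified in this abstract. The function name `pixel_shuffle`, the nested-list tensor layout, and the channel ordering are assumptions chosen for illustration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Assumed channel ordering (illustrative): input channel k*r*r + i*r + j
    supplies the pixel at row offset i and column offset j inside each
    r-by-r output cell of output channel k.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c)]
    for k in range(c):
        for i in range(r):
            for j in range(r):
                src = x[k * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[k][y * r + i][z * r + j] = src[y][z]
    return out


# Four 2x2 feature maps, where channel n is filled with the value n,
# become one 4x4 map when the upscaling factor is 2.
low_res = [[[n, n], [n, n]] for n in range(4)]
high_res = pixel_shuffle(low_res, 2)
for row in high_res[0]:
    print(row)
```

Because every output pixel is copied from exactly one input value, the rearrangement has no overlapping kernel footprints. The convolution that would normally precede this step is omitted here.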
Notes
The specific network parameters in each generation task can be found in the supplementary materials.
Acknowledgements
This work was supported in part by NSFC (U19B2022, 61772349, 61872244, 62072313, 61806131, 61802262), Guangdong Basic and Applied Basic Research Foundation (2019B151502001), and Shenzhen R&D Program (JCYJ20200109105008228, 20200813110043002). This work was also supported in part by Alibaba Group through Alibaba Innovative Research (AIR) Program.
Cite this article
Zhuo, L., Tan, S., Li, B. et al. ISP-GAN: inception sub-pixel deconvolution-based lightweight GANs for colorization. Multimed Tools Appl 81, 24977–24994 (2022). https://doi.org/10.1007/s11042-022-12587-8
DOI: https://doi.org/10.1007/s11042-022-12587-8