Abstract
Face super-resolution aims to recover high-resolution face images with accurate geometric structures. Most conventional super-resolution methods are trained on paired data, which is difficult to obtain in real-world settings. Moreover, these methods do not fully exploit facial prior knowledge. To tackle these problems, we propose an end-to-end unsupervised face super-resolution network for low-resolution face images, built around a gradient enhancement branch and a semantic guidance mechanism. Specifically, the gradient enhancement branch reconstructs high-resolution gradient maps, constrained by two proposed gradient losses. The super-resolution network then integrates features from both image space and gradient space to super-resolve face images while preserving their geometric structure. The semantic guidance mechanism, which comprises a semantic-adaptive sharpen module and a semantic-guided discriminator, uses semantic parsing maps to reconstruct sharp edges and to improve local details adaptively in different facial regions. Qualitative and quantitative experiments demonstrate that the proposed method reconstructs high-resolution face images with sharp edges and photo-realistic details, outperforming state-of-the-art methods.
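As a rough illustration of what a gradient map and a gradient-space loss can look like, the following NumPy sketch computes a gradient-magnitude map with central differences and compares two images in gradient space with an L1 distance. This is only a sketch under stated assumptions: the abstract does not define the two proposed gradient losses, so the function names `gradient_map` and `gradient_l1_loss`, the central-difference operator, and the L1 form are all illustrative choices rather than the authors' formulation.

```python
import numpy as np


def gradient_map(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a 2-D grayscale image (hypothetical helper).

    Uses central differences with edge replication at the borders.
    The operator choice is an assumption, not taken from the paper.
    """
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    # Horizontal and vertical central differences.
    gx = (padded[1:-1, 2:] - padded[1:-1, :-2]) / 2.0
    gy = (padded[2:, 1:-1] - padded[:-2, 1:-1]) / 2.0
    return np.sqrt(gx ** 2 + gy ** 2)


def gradient_l1_loss(output: np.ndarray, reference: np.ndarray) -> float:
    """Mean absolute difference between two gradient maps (illustrative only)."""
    return float(np.mean(np.abs(gradient_map(output) - gradient_map(reference))))


if __name__ == "__main__":
    # A vertical step edge versus a flat image of the same size.
    step = np.zeros((4, 6))
    step[:, 3:] = 1.0
    flat = np.zeros((4, 6))
    print(gradient_map(step))
    print(gradient_l1_loss(step, flat))
```

A loss of this kind penalizes differences in edge structure rather than in raw intensity, which is the general motivation for supervising a network in gradient space. How the reference gradient map is obtained in an unsupervised setting is specific to the paper and is not modeled here.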
Acknowledgements
This work is supported by the National Key Research and Development Program of China (No. 2019YFC1521104), National Natural Science Foundation of China (No. 61972157), the Economy and Informatization Commission of Shanghai Municipality (No. XX-RGZN-01-19-6348), and Fundamental Research Funds for the Central Universities (No. 2021QN1072).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Li, L., Tang, J., Ye, Z. et al. Unsupervised face super-resolution via gradient enhancement and semantic guidance. Vis Comput 37, 2855–2867 (2021). https://doi.org/10.1007/s00371-021-02236-w
DOI: https://doi.org/10.1007/s00371-021-02236-w