Abstract
Recent generative adversarial networks (GANs) can synthesize high-fidelity faces, and follow-up work has shown that a facial semantic field exists in their latent spaces. This has motivated recent methods that edit faces by finding semantic directions in the universal facial semantic field of a GAN and walking along them. However, several challenges remain during editing: identity loss, attribute entanglement, and background variation. In this work, we first propose a personalized facial semantic field (PFSF) for each instance instead of a universal facial semantic field shared by all instances. The PFSF is built via portrait-masked retraining of the StyleGAN generator together with the inversion model, which preserves identity details for real faces. Furthermore, we propose an individual walk in the learned PFSF to perform disentangled face editing. Finally, the edited portrait is fused back into the original image under the constraint of the portrait mask, which preserves the background. Extensive experimental results validate that our method performs well in identity preservation, background maintenance, and disentangled editing, significantly surpassing related state-of-the-art methods.
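As a rough illustration of two operations named in the abstract, the sketch below shows a walk along a fixed semantic direction in latent space (the universal-direction style of editing used by prior work) and a mask-constrained fusion of an edited portrait back into the original image. It is not the PFSF or the individual walk, whose details the abstract does not give. The function names `linear_walk` and `fuse_with_mask`, the array shapes, and the alpha-blend formula are assumptions made for illustration only.

```python
import numpy as np


def linear_walk(w, direction, step):
    """Move a latent code along a unit-normalized semantic direction.

    Assumption: this is the generic linear walk of universal-direction
    editing, not the paper's individual walk.
    """
    w = np.asarray(w, dtype=np.float64)
    direction = np.asarray(direction, dtype=np.float64)
    unit = direction / np.linalg.norm(direction)
    return w + step * unit


def fuse_with_mask(original, edited, mask):
    """Blend an edited portrait into the original image.

    Assumption: a simple per-pixel blend,
        out = mask * edited + (1 - mask) * original,
    where mask is 1 inside the portrait and 0 in the background.
    """
    original = np.asarray(original, dtype=np.float64)
    edited = np.asarray(edited, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    if mask.ndim == original.ndim - 1:
        # Add a channel axis so an (H, W) mask broadcasts over (H, W, C).
        mask = mask[..., None]
    return mask * edited + (1.0 - mask) * original


# Toy usage: a 2x2 RGB image whose top-left pixel is fully portrait,
# bottom-left pixel is a soft boundary, and right column is background.
original = np.zeros((2, 2, 3))
edited = np.ones((2, 2, 3))
mask = np.array([[1.0, 0.0], [0.5, 0.0]])
fused = fuse_with_mask(original, edited, mask)

# Toy usage: step a 4-dimensional latent code along a direction.
w_edited = linear_walk(np.zeros(4), np.array([3.0, 0.0, 4.0, 0.0]), 2.0)
```

With this blend, background pixels (mask value 0) are copied unchanged from the original image, which is the sense in which a portrait mask can keep the background fixed regardless of what the generator produces there.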
Acknowledgements
This work was supported in part by NSFC (Grant No. 62176194, Grant No. 62101393), the Major Project of IoV (Grant No. 2020AAA001), Sanya Science and Education Innovation Park of Wuhan University of Technology (Grant No. 2021KF0031), CSTC (Grant No. cstc2021jcyj-msxmX1148), and the Open Project of Wuhan University of Technology Chongqing Research Institute (ZL2021-6).
About this article
Cite this article
Lin, C., Xiong, S. & Lu, X. Disentangled face editing via individual walk in personalized facial semantic field. Vis Comput 39, 6005–6014 (2023). https://doi.org/10.1007/s00371-022-02708-7
DOI: https://doi.org/10.1007/s00371-022-02708-7