Abstract
Image editing is a widely studied topic in computer vision that enables the modification of specific attributes in images without altering other crucial information. One popular unsupervised technique is feature decomposition in the latent space of Generative Adversarial Networks (GANs), which yields editing directions that control attribute changes to achieve the desired editing results. However, this approach typically does not allow the user to obtain the desired editing direction directly by specifying the target attribute in advance. In this work, we propose a method for finding editing directions in the attribute space by analyzing image differences, which lets users obtain target directions by actively defining the attribute they want to change. Specifically, the method discovers semantic directions suitable for editing a target attribute by applying Principal Component Analysis (PCA) to the differences of image latent codes embedded in the latent space. Experiments show that our method can effectively find the target editing direction according to user needs while achieving satisfactory editing results.
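The core idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes two sets of latent codes for paired images that differ mainly in the target attribute (e.g., obtained from a StyleGAN encoder), and applies PCA (via SVD) to their differences to recover the dominant semantic direction. All function and variable names are hypothetical.

```python
import numpy as np

def find_edit_direction(codes_with_attr, codes_without_attr, n_components=1):
    """Sketch of the abstract's idea: PCA on latent-code differences.

    codes_with_attr / codes_without_attr: (N, d) arrays of latent codes for
    paired images that differ mainly in the target attribute.
    Returns the top n_components principal directions of the differences,
    shape (n_components, d).
    """
    diffs = codes_with_attr - codes_without_attr        # (N, d)
    diffs = diffs - diffs.mean(axis=0, keepdims=True)   # center before PCA
    # PCA via SVD: rows of vt are the principal axes, ordered by variance.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:n_components]

def edit_latent(w, direction, alpha):
    """Move a latent code along the discovered direction by strength alpha;
    the edited code would then be decoded by the generator."""
    return w + alpha * direction
```

In practice, the strength `alpha` controls how far the attribute is pushed, and the edited latent code is fed back through the generator to produce the edited image.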
Acknowledgements
This work was supported by the Qingdao Natural Science Foundation (No. 23-2-1-161-zyyd-jch), the Shandong Natural Science Foundation (No. ZR2023MF008, No. ZR2023QF046), the Major Scientific and Technological Projects of CNPC (No. ZD2019-183-008) and the National Natural Science Foundation of China (No. 61671480).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Y., Yang, S., Liu, B., Liu, W. (2023). Attribute Space Analysis for Image Editing. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14356. Springer, Cham. https://doi.org/10.1007/978-3-031-46308-2_9
DOI: https://doi.org/10.1007/978-3-031-46308-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46307-5
Online ISBN: 978-3-031-46308-2
eBook Packages: Computer Science (R0)