{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,14]],"date-time":"2024-01-14T00:14:39Z","timestamp":1705191279846},"reference-count":52,"publisher":"Wiley","issue":"7","license":[{"start":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T00:00:00Z","timestamp":1698624000000},"content-version":"vor","delay-in-days":29,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100013314","name":"Higher Education Discipline Innovation Project","doi-asserted-by":"publisher","award":["D23006"],"id":[{"id":"10.13039\/501100013314","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computer Graphics Forum"],"published-print":{"date-parts":[[2023,10]]},"abstract":"Face swapping is a technique that replaces a face in a target media with another face of a different identity from a source face image. Currently, research on the effective utilisation of prior knowledge and semantic guidance for photo\u2010realistic face swapping remains limited, despite the impressive synthesis quality achieved by recent generative models. In this paper, we propose a novel conditional Denoising Diffusion Probabilistic Model (DDPM) enforced by a two\u2010level face prior guidance. Specifically, it includes (i) an image\u2010level condition generated by a 3D Morphable Model (3DMM), and (ii) a high\u2010semantic level guidance driven by information extracted from several pre\u2010trained attribute classifiers, for high\u2010quality face image synthesis. Although swapped face image from 3DMM does not achieve photo\u2010realistic quality on its own, it provides a strong image\u2010level prior, in parallel with high\u2010level face semantics, to guide the DDPM for high fidelity image generation. 
The experimental results demonstrate that our method outperforms state\u2010of\u2010the\u2010art face swapping methods on benchmark datasets in terms of its synthesis quality, and capability to preserve the target face attributes and swap the source face identity.","DOI":"10.1111\/cgf.14949","type":"journal-article","created":{"date-parts":[[2023,10,31]],"date-time":"2023-10-31T06:16:09Z","timestamp":1698732969000},"update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Semantics\u2010guided generative diffusion model with a 3DMM model condition for face swapping"],"prefix":"10.1111","volume":"42","author":[{"ORCID":"http:\/\/orcid.org\/0000-0003-2718-659X","authenticated-orcid":false,"given":"Xiyao","family":"Liu","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering Central South University Changsha 410083 China"}]},{"given":"Yang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering Central South University Changsha 410083 China"}]},{"given":"Yuhao","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering Central South University Changsha 410083 China"}]},{"given":"Ting","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering Central South University Changsha 410083 China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-5418-0455","authenticated-orcid":false,"given":"Jian","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering Central South University Changsha 410083 China"}]},{"given":"Victoria","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Criminology and Criminal Justice, Faculty of Humanities and Social Sciences University of Portsmouth PO12HY 
U.K."}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-9365-7420","authenticated-orcid":false,"given":"Hui","family":"Fang","sequence":"additional","affiliation":[{"name":"Department of Computer Science Loughborough University Loughborough LE113TU U.K."}]}],"member":"311","published-online":{"date-parts":[[2023,10,30]]},"reference":[{"key":"e_1_2_8_2_2","doi-asserted-by":"crossref","unstructured":"AvrahamiO. LischinskiD. FriedO.: Blended diffusion for text-driven editing of natural images. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition(2022) pp.18208\u201318218. 3","DOI":"10.1109\/CVPR52688.2022.01767"},{"key":"e_1_2_8_3_2","doi-asserted-by":"crossref","unstructured":"BaoJ. ChenD. WenF. LiH. HuaG.: Towards open-set identity preserving face synthesis. InProceedings of the IEEE conference on computer vision and pattern recognition(2018) pp.6713\u20136722. 2","DOI":"10.1109\/CVPR.2018.00702"},{"key":"e_1_2_8_4_2","doi-asserted-by":"crossref","unstructured":"BitoukD. KumarN. DhillonS. BelhumeurP. NayarS. K.: Face swapping: automatically replacing faces in photographs. InACM SIGGRAPH 2008 papers.2008 pp.1\u20138. 2","DOI":"10.1145\/1399504.1360638"},{"key":"e_1_2_8_5_2","unstructured":"BatzolisG. StanczukJ. Sch\u00f6nliebC.-B. EtmannC.: Conditional image generation with score-based diffusion models.arXiv preprint arXiv:2111.13606(2021). 2"},{"key":"e_1_2_8_6_2","first-page":"669","volume-title":"Computer Graphics Forum","author":"Blanz V.","year":"2004"},{"key":"e_1_2_8_7_2","doi-asserted-by":"crossref","unstructured":"BlanzV. VetterT.: A morphable model for the synthesis of 3d faces. InProceedings of the 26th annual conference on Computer graphics and interactive techniques(1999) pp.187\u2013194. 2","DOI":"10.1145\/311535.311556"},{"key":"e_1_2_8_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2003.1227983"},{"key":"e_1_2_8_9_2","doi-asserted-by":"crossref","unstructured":"ChenR. ChenX. NiB. 
GeY.: Simswap: An efficient framework for high fidelity face swapping. InProceedings of the 28th ACM International Conference on Multimedia(2020) pp.2003\u20132011. 2 7 8","DOI":"10.1145\/3394171.3413630"},{"key":"e_1_2_8_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2018.00020"},{"key":"e_1_2_8_11_2","doi-asserted-by":"crossref","unstructured":"ChengY.-T. TzengV. LiangY. WangC.-C. ChenB.-Y. ChuangY.-Y. OuhyoungM.: 3d-model-based face replacement in video. InSIGGRAPH'09: Posters.2009 pp.1\u20131. 2","DOI":"10.1145\/1599301.1599330"},{"key":"e_1_2_8_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2017.2765202"},{"key":"e_1_2_8_13_2","unstructured":"Deepfakes.Deepfakes.https:\/\/github.com\/deepfakes\/faceswap(2018-12). 7 8"},{"key":"e_1_2_8_14_2","doi-asserted-by":"crossref","unstructured":"DengJ. GuoJ. VerverasE. KotsiaI. ZafeiriouS.: Retinaface: Single-shot multi-level face localisation in the wild. InProceedings of the IEEE\/CVF conference on computer vision and pattern recognition(2020) pp.5203\u20135212. 5 6","DOI":"10.1109\/CVPR42600.2020.00525"},{"key":"e_1_2_8_15_2","doi-asserted-by":"crossref","unstructured":"DengJ. GuoJ. XueN. ZafeiriouS.: Arcface: Additive angular margin loss for deep face recognition. InProceedings of the IEEE\/CVF conference on computer vision and pattern recognition(2019) pp.4690\u20134699. 5","DOI":"10.1109\/CVPR.2019.00482"},{"key":"e_1_2_8_16_2","first-page":"8780","article-title":"Diffusion models beat gans on image synthesis","volume":"34","author":"Dhariwal P.","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_8_17_2","doi-asserted-by":"crossref","unstructured":"DengY. YangJ. XuS. ChenD. JiaY. TongX.: Accurate 3d face reconstruction with weakly-supervised learning: From single image to image set. InProceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops(2019) pp.0\u20130. 
4","DOI":"10.1109\/CVPRW.2019.00038"},{"key":"e_1_2_8_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3395208"},{"key":"e_1_2_8_19_2","unstructured":"Faceswap.FaceSwap.https:\/\/github.com\/MarekKowalski\/FaceSwap. (2016-12). 7 8"},{"key":"e_1_2_8_20_2","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho J.","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_8_21_2","doi-asserted-by":"crossref","unstructured":"JiangL. LiR. WuW. QianC. LoyC. C.: Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection. InProceedings of the IEEE\/CVF conference on computer vision and pattern recognition(2020) pp.2889\u20132898. 2","DOI":"10.1109\/CVPR42600.2020.00296"},{"key":"e_1_2_8_22_2","unstructured":"KarrasT. AilaT. LaineS. LehtinenJ.: Progressive growing of gans for improved quality stability and variation.arXiv preprint arXiv:1710.10196(2017). 7"},{"key":"e_1_2_8_23_2","unstructured":"KingmaD. P. BaJ.: Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980(2014). 7"},{"key":"e_1_2_8_24_2","unstructured":"LiL. BaoJ. YangH. ChenD. WenF.: Faceshifter: Towards high fidelity and occlusion aware face swapping.arXiv preprint arXiv:1912.13457(2019). 2 8"},{"key":"e_1_2_8_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2023.109628"},{"key":"e_1_2_8_26_2","unstructured":"MengC. HeY. SongY. SongJ. WuJ. ZhuJ.-Y. ErmonS.: Sdedit: Guided image synthesis and editing with stochastic differential equations. InInternational Conference on Learning Representations(2021). 3"},{"key":"e_1_2_8_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3425780"},{"key":"e_1_2_8_28_2","first-page":"8162","volume-title":"International Conference on Machine Learning","author":"Nichol A. Q.","year":"2021"},{"key":"e_1_2_8_29_2","unstructured":"NicholA. DhariwalP. RameshA. ShyamP. MishkinP. McGrewB. SutskeverI. 
ChenM.: Glide: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741(2021). 3"},{"key":"e_1_2_8_30_2","article-title":"Multi-label co-regularization for semi-supervised facial action unit recognition","volume":"32","author":"Niu X.","year":"2019","journal-title":"Advances in neural information processing systems"},{"key":"e_1_2_8_31_2","doi-asserted-by":"crossref","unstructured":"NirkinY. KellerY. HassnerT.: Fsgan: Subject agnostic face swapping and reenactment. InProceedings of the IEEE\/CVF international conference on computer vision(2019) pp.7184\u20137193. 2","DOI":"10.1109\/ICCV.2019.00728"},{"key":"e_1_2_8_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3155571"},{"key":"e_1_2_8_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2018.00024"},{"key":"e_1_2_8_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2022.103525"},{"key":"e_1_2_8_35_2","doi-asserted-by":"crossref","unstructured":"PreechakulK. ChattheeN. WizadwongsaS. SuwajanakornS.: Diffusion autoencoders: Toward a meaningful and decodable representation. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition(2022) pp.10619\u201310629. 2","DOI":"10.1109\/CVPR52688.2022.01036"},{"key":"e_1_2_8_36_2","article-title":"Pytorch: An imperative style, high-performance deep learning library","volume":"32","author":"Paszke A.","year":"2019","journal-title":"Advances in neural information processing systems"},{"key":"e_1_2_8_37_2","doi-asserted-by":"crossref","unstructured":"P\u00e9rez-PelliteroE. SalvadorJ. Ruiz-HidalgoJ. RosenhahnB.: Psyco: Manifold span reduction for super resolution. InProceedings of the IEEE conference on computer vision and pattern recognition(2016) pp.1837\u20131845. 4","DOI":"10.1109\/CVPR.2016.203"},{"key":"e_1_2_8_38_2","doi-asserted-by":"crossref","unstructured":"RombachR. BlattmannA. LorenzD. EsserP. 
OmmerB.: High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition(2022) pp.10684\u201310695. 2 3","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_8_39_2","doi-asserted-by":"crossref","unstructured":"RuizN. ChongE. RehgJ. M.: Fine-grained head pose estimation without keypoints. InProceedings of the IEEE conference on computer vision and pattern recognition workshops(2018) pp.2074\u20132083. 6 8","DOI":"10.1109\/CVPRW.2018.00281"},{"key":"e_1_2_8_40_2","doi-asserted-by":"crossref","unstructured":"RosslerA. CozzolinoD. VerdolivaL. RiessC. ThiesJ. NiessnerM.: Faceforensics++: Learning to detect manipulated facial images. InProceedings of the IEEE\/CVF international conference on computer vision(2019) pp.1\u201311. 8","DOI":"10.1109\/ICCV.2019.00009"},{"key":"e_1_2_8_41_2","first-page":"36479","article-title":"Photorealistic text-to-image diffusion models with deep language understanding","volume":"35","author":"Saharia C.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_8_42_2","first-page":"2256","volume-title":"International Conference on Machine Learning","author":"Sohl-Dickstein J.","year":"2015"},{"key":"e_1_2_8_43_2","doi-asserted-by":"crossref","unstructured":"SahariaC. HoJ. ChanW. SalimansT. FleetD. J. NorouziM.: Image super-resolution via iterative refinement.IEEE Transactions on Pattern Analysis and Machine Intelligence(2022). 3 5","DOI":"10.1109\/TPAMI.2022.3204461"},{"key":"e_1_2_8_44_2","doi-asserted-by":"crossref","unstructured":"ThiesJ. ZollhoferM. StammingerM. TheobaltC. NiessnerM.: Face2face: Real-time face capture and reenactment of rgb videos. InProceedings of the IEEE conference on computer vision and pattern recognition(2016) pp.2387\u20132395. 2","DOI":"10.1109\/CVPR.2016.262"},{"key":"e_1_2_8_45_2","doi-asserted-by":"crossref","unstructured":"VemulapalliR. 
AgarwalaA.: A compact embedding for facial expression similarity. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition(2019) pp.5683\u20135692. 6 8","DOI":"10.1109\/CVPR.2019.00583"},{"key":"e_1_2_8_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2020.3002101"},{"key":"e_1_2_8_47_2","doi-asserted-by":"crossref","unstructured":"WangY. ChenX. ZhuJ. ChuW. TaiY. WangC. LiJ. WuY. HuangF. JiR.: Hififace: 3d shape and semantic prior guided high fidelity face swapping.arXiv preprint arXiv:2106.09965(2021). 2 7 8","DOI":"10.24963\/ijcai.2021\/157"},{"key":"e_1_2_8_48_2","doi-asserted-by":"crossref","unstructured":"WangH. WangY. ZhouZ. JiX. GongD. ZhouJ. LiZ. LiuW.: Cosface: Large margin cosine loss for deep face recognition. InProceedings of the IEEE conference on computer vision and pattern recognition(2018) pp.5265\u20135274. 8","DOI":"10.1109\/CVPR.2018.00552"},{"key":"e_1_2_8_49_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i4.16417"},{"key":"e_1_2_8_50_2","doi-asserted-by":"crossref","first-page":"8261","DOI":"10.1109\/ICASSP.2019.8683164","volume-title":"ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Yang X.","year":"2019"},{"key":"e_1_2_8_51_2","first-page":"21699","article-title":"Aot: Appearance optimal transport based identity swapping for forgery detection","volume":"33","author":"Zhu H.","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_8_52_2","doi-asserted-by":"crossref","unstructured":"ZhangZ. GeY. ChenR. TaiY. YanY. YangJ. WangC. LiJ. HuangF.: Learning to aggregate and personalize 3d face from in-the-wild photo collection. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition(2021) pp.14214\u201314224. 8","DOI":"10.1109\/CVPR46437.2021.01399"},{"key":"e_1_2_8_53_2","doi-asserted-by":"crossref","unstructured":"ZhuY. LiQ. WangJ. XuC.-Z. 
SunZ.: One shot face swapping on megapixels. InProceedings of the IEEE\/CVF conference on computer vision and pattern recognition(2021) pp.4834\u20134844. 2","DOI":"10.1109\/CVPR46437.2021.00480"}],"container-title":["Computer Graphics Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1111\/cgf.14949","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,13]],"date-time":"2024-01-13T08:14:54Z","timestamp":1705133694000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1111\/cgf.14949"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10]]},"references-count":52,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["10.1111\/cgf.14949"],"URL":"https:\/\/doi.org\/10.1111\/cgf.14949","archive":["Portico"],"relation":{},"ISSN":["0167-7055","1467-8659"],"issn-type":[{"value":"0167-7055","type":"print"},{"value":"1467-8659","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10]]},"assertion":[{"value":"2023-10-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}