Image Manipulation Detection with Implicit Neural Representation and Limited Supervision

Zhang, Zhenfei; Li, Mingyang; Li, Xin; Chang, Ming-Ching; Hsieh, Jun-Wei

doi:10.1007/978-3-031-73223-2_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15146)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Image Manipulation Detection (IMD) is becoming increasingly important as tampering technologies advance. However, most state-of-the-art (SoTA) methods require high-quality training datasets featuring image- and pixel-level annotations. The effectiveness of these methods suffers when applied to manipulated or noisy samples that differ from the training data. To address these challenges, we present a unified framework that combines unsupervised and weakly supervised approaches for IMD. Our approach introduces a novel pre-processing stage based on a controllable fitting function from Implicit Neural Representation (INR). Additionally, we introduce a new selective pixel-level contrastive learning approach, which concentrates exclusively on high-confidence regions, thereby mitigating uncertainty associated with the absence of pixel-level labels. In weakly supervised mode, we utilize ground-truth image-level labels to guide predictions from an adaptive pooling method, facilitating comprehensive exploration of manipulation regions for image-level detection. The unsupervised model is trained using a self-distillation training method with selected high-confidence pseudo-labels obtained from the deepest layers via different sources. Extensive experiments demonstrate that our proposed method outperforms existing unsupervised and weakly supervised methods. Moreover, it competes effectively against fully supervised methods on novel manipulation detection tasks.
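
The idea of concentrating only on high-confidence regions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the `low`/`high` thresholds are hypothetical, and the abstract does not say how confidence is actually measured. The sketch assumes a per-pixel manipulation probability map and keeps only pixels whose score is clearly high or clearly low, marking the rest as ignored so that they would not contribute to a pixel-level loss.

```python
from typing import List

# Hypothetical thresholds; the source does not specify how confidence is defined.
LOW_CONFIDENT = 0.2
HIGH_CONFIDENT = 0.8


def select_confident_pixels(
    prob_map: List[List[float]],
    low: float = LOW_CONFIDENT,
    high: float = HIGH_CONFIDENT,
) -> List[List[int]]:
    """Turn a probability map into pseudo-labels.

    1  = confidently manipulated (probability >= high)
    0  = confidently authentic   (probability <= low)
    -1 = uncertain, to be ignored by any pixel-level loss
    """
    labels = []
    for row in prob_map:
        out_row = []
        for p in row:
            if p >= high:
                out_row.append(1)
            elif p <= low:
                out_row.append(0)
            else:
                out_row.append(-1)
        labels.append(out_row)
    return labels


def confident_fraction(labels: List[List[int]]) -> float:
    """Fraction of pixels that received a pseudo-label (not -1)."""
    total = sum(len(row) for row in labels)
    kept = sum(1 for row in labels for v in row if v != -1)
    return kept / total if total else 0.0


example_map = [[0.05, 0.5], [0.9, 0.81]]
example_labels = select_confident_pixels(example_map)
```

In this toy map, three of the four pixels fall outside the uncertain band, so only the pixel scored 0.5 would be excluded from a contrastive or distillation objective.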

Acknowledgements

This work is supported by the DARPA Semantic Forensics (SemaFor) Program under contract HR001120C0123 and NSF CCSS-2348046.

Author information

Authors and Affiliations

University at Albany, State University of New York, Albany, USA
Zhenfei Zhang, Xin Li & Ming-Ching Chang
Stanford University, Stanford, USA
Mingyang Li
National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Jun-Wei Hsieh

Corresponding author

Correspondence to Zhenfei Zhang.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

About this paper

Cite this paper

Zhang, Z., Li, M., Li, X., Chang, MC., Hsieh, JW. (2025). Image Manipulation Detection with Implicit Neural Representation and Limited Supervision. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15146. Springer, Cham. https://doi.org/10.1007/978-3-031-73223-2_15

DOI: https://doi.org/10.1007/978-3-031-73223-2_15
Published: 08 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73222-5
Online ISBN: 978-3-031-73223-2
eBook Packages: Computer Science, Computer Science (R0)
