Abstract
Public datasets play a crucial role in advancing data-centric AI, yet they remain vulnerable to illicit uses. This paper presents ‘undercover bias,’ a novel dataset watermarking method that can reliably identify and verify unauthorized data usage. Our approach is inspired by the observation that trained models often inadvertently learn biased knowledge and can function on bias-only data, even without any information directly related to the target task. Leveraging this, we deliberately embed class-wise hidden bias via unnoticeable watermarks, which are unrelated to the target dataset but share the same labels. Consequently, a model trained on this watermarked data covertly learns to classify these watermarks. The model’s performance in classifying the watermarks serves as irrefutable evidence of unauthorized usage, which cannot be achieved by chance. Our approach presents multiple benefits: 1) stealthy and model-agnostic watermarks; 2) minimal impact on the target task; 3) irrefutable evidence of misuse; and 4) improved applicability in practical scenarios. We validate these benefits through extensive experiments and extend our method to fine-grained classification and image segmentation tasks. Our implementation is available at https://github.com/jjh6297/UndercoverBias.
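To make the verification idea concrete, the minimal Python sketch below illustrates one way class-wise watermark embedding and ownership verification could be wired up, assuming a Keras-style classifier exposing a `predict` method. The uniform-noise patterns, the blending strength `alpha`, and the detection threshold are illustrative assumptions, not the paper's actual watermark construction or statistical test.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_class_watermarks(num_classes, shape=(32, 32, 3)):
    """One fixed pattern per class (hypothetical stand-in for the
    paper's unnoticeable, class-wise watermarks)."""
    return rng.uniform(0.0, 1.0, size=(num_classes, *shape)).astype(np.float32)


def embed_watermark(image, label, watermarks, alpha=0.03):
    """Blend the label's watermark into a training image at low opacity
    so the change stays visually unnoticeable (alpha is an assumed strength)."""
    return np.clip((1.0 - alpha) * image + alpha * watermarks[label], 0.0, 1.0)


def verify_usage(model, watermarks, threshold=None):
    """Ownership check: a model trained on the watermarked data should
    classify the bias-only patterns far above chance (1 / num_classes)."""
    num_classes = watermarks.shape[0]
    preds = model.predict(watermarks).argmax(axis=1)
    accuracy = float((preds == np.arange(num_classes)).mean())
    chance = 1.0 / num_classes
    threshold = 5.0 * chance if threshold is None else threshold
    return accuracy, accuracy >= threshold
```

In this sketch, a dataset owner would release only the watermarked images, keep the per-class patterns secret, and later call `verify_usage` on a suspect model; accuracy on the bias-only inputs well above chance suggests the model was trained on the protected data.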
Acknowledgements
This work was partly supported by the Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (24ZB1200, Research of Human-centered Autonomous Intelligence System Original Technology, 40%), the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2023-00215760, Guide Dog: Development of Navigation AI Technology of a Guidance Robot for the Visually Impaired Person, 30%), and the Korea Institute of Marine Science & Technology Promotion (KIMST) grant funded by the Korea Coast Guard (RS-2023-00238652, Integrated Satellite-based Applications Development for Korea Coast Guard, 30%).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jang, J., Han, B., Kim, J., Youn, CH. (2025). Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-Wise Hidden Bias. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15079. Springer, Cham. https://doi.org/10.1007/978-3-031-72664-4_1
DOI: https://doi.org/10.1007/978-3-031-72664-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72663-7
Online ISBN: 978-3-031-72664-4
eBook Packages: Computer Science (R0)