Abstract
Transfer-based hard-label black-box adversarial attacks face two challenges: obtaining a suitable proxy dataset and issuing a substantial number of queries to the target model, with no guarantee of a high attack success rate. To address these challenges, we combine dual substitute model extraction with embedding-space adversarial example search and propose a novel hard-label black-box attack, the Data-Free Dual Substitutes Hard-Label Black-Box Adversarial Attack (DFDS). DFDS first trains a generative adversarial network through adversarial training, relying on no proxy dataset and depending only on the hard-label outputs of the target model. It then applies a natural evolution strategy (NES) to search the embedding space and construct the final adversarial examples. Comprehensive experiments show that, under the same query budget, DFDS achieves higher attack success rates than baseline methods. Compared with DFMS-HL, the state-of-the-art mixed-mechanism hard-label black-box attack, DFDS yields significant improvements on the SVHN, CIFAR-10, and CIFAR-100 datasets. Notably, for targeted attacks on CIFAR-10 the success rate reaches 76.59%, an improvement of up to 21.99%.
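The embedding-space search stage builds on the NES gradient estimator of Ilyas et al. (2018), cited in the references below. The following is a minimal NumPy sketch, not the authors' implementation: the black-box objective loss_fn, the starting latent z0, and all hyperparameters are illustrative placeholders, and in DFDS the latent would be decoded by the trained generator so that the resulting image can be queried against the target model.

import numpy as np

def nes_gradient(loss_fn, z, sigma=1e-3, n_samples=50):
    # Antithetic NES estimate of the gradient of a black-box loss at z,
    # using n_samples evaluations of loss_fn (Ilyas et al., 2018).
    grad = np.zeros_like(z)
    for _ in range(n_samples // 2):
        u = np.random.randn(*z.shape)
        grad += (loss_fn(z + sigma * u) - loss_fn(z - sigma * u)) * u
    return grad / (n_samples * sigma)

def embedding_space_search(loss_fn, z0, lr=0.01, steps=200):
    # Descend the estimated gradient with signed steps; in DFDS the
    # decoded image G(z) would be the candidate adversarial example.
    z = z0.copy()
    for _ in range(steps):
        z -= lr * np.sign(nes_gradient(loss_fn, z))
    return z

# Toy usage: a quadratic stands in for the attack objective; the search
# drives the latent towards the minimiser at the origin.
z_star = embedding_space_search(lambda z: float(np.sum(z ** 2)), np.ones(64))
print(np.abs(z_star).max())  # small after 200 signed steps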
S. Jiang and Y. He contributed equally to this work.
References
Addepalli, S., Nayak, G.K., Chakraborty, A., Radhakrishnan, V.B.: DeGAN: data-enriching GAN for retrieving representative samples from a trained classifier. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 3130–3137 (2020)
Beetham, J., Kardan, N., Mian, A., Shah, M.: Dual student networks for data-free model stealing. arXiv preprint arXiv:2309.10058 (2023)
Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57 (2017)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, Z., Zhang, T.: Black-box adversarial attack with transferable model-based embedding. arXiv preprint arXiv:1911.07140 (2019)
Ilyas, A., Engstrom, L., Athalye, A., Lin, J.: Black-box adversarial attacks with limited queries and information. In: International Conference on Machine Learning, pp. 2137–2146 (2018)
Kariyappa, S., Prakash, A., Qureshi, M.K.: MAZE: data-free model stealing attack using zeroth-order gradient estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13814–13823 (2021)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)
Lu, J., Issaranon, T., Forsyth, D.: SafetyNet: detecting and rejecting adversarial examples robustly. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 446–454 (2017)
Ma, C., Chen, L., Yong, J.H.: Simulating unknown target models for query-efficient black-box attacks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11835–11844 (2021)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017)
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)
Orekondy, T., Schiele, B., Fritz, M.: Knockoff nets: stealing functionality of black-box models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4954–4963 (2019)
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: Practical black-box attacks against machine learning. In: Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519 (2017)
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372–387 (2016)
Pham, H.V., et al.: Problems and opportunities in training deep learning software systems: an analysis of variance. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, pp. 771–783 (2020)
Qian, S., et al.: Are my deep learning systems fair? an empirical study of fixed-seed training. Adv. Neural. Inf. Process. Syst. 34, 30211–30227 (2021)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)
Rosenthal, J., Enouen, E., Pham, H.V., Tan, L.: DisGUIDE: disagreement-guided data-free model extraction. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 9614–9622 (2023)
Sanyal, S., Addepalli, S., Babu, R.V.: Towards data-free model stealing in a hard label setting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 15284–15293 (2022)
Truong, J.B., Maini, P., Walls, R.J., Papernot, N.: Data-free model extraction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4771–4780 (2021)
Zhou, M., Wu, J., Liu, Y., Liu, S., Zhu, C.: DaST: data-free substitute training for adversarial attacks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 234–243 (2020)
Acknowledgments
This research is supported by the National Natural Science Foundation of China (NSFC) under grant number 62172377, the Taishan Scholars Program of Shandong province under grant number tsqn202312102, and the Startup Research Foundation for Distinguished Scholars under grant number 202112016.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, S., He, Y., Zhang, R., Kang, Z., Xia, H. (2024). DFDS: Data-Free Dual Substitutes Hard-Label Black-Box Adversarial Attack. In: Cao, C., Chen, H., Zhao, L., Arshad, J., Asyhari, T., Wang, Y. (eds) Knowledge Science, Engineering and Management. KSEM 2024. Lecture Notes in Computer Science, vol. 14886. Springer, Singapore. https://doi.org/10.1007/978-981-97-5498-4_21
DOI: https://doi.org/10.1007/978-981-97-5498-4_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5497-7
Online ISBN: 978-981-97-5498-4
eBook Packages: Computer Science (R0)