Abstract
Adversarial training of lightweight models is often ineffective because of their limited capacity and the difficulty of optimizing losses defined on hard labels. Adversarial distillation is a potential solution, in which knowledge from large, adversarially pre-trained teachers guides the learning of lightweight models. However, adversarially pre-training teachers is computationally expensive because it requires iterative gradient steps with respect to the inputs. Additionally, the reliability of the teachers’ guidance diminishes as lightweight models become more robust. In this paper, we propose an adversarial distillation method called Sample-Adaptive Multi-teacher Dynamic Rectification Adversarial Distillation (SA-MDRAD). First, an adversarial distillation framework that distills logits and features from heterogeneous, standardly pre-trained teachers is developed to reduce pre-training cost and increase knowledge diversity. Second, the teachers’ knowledge is dynamically rectified and adaptively fused on a per-sample basis, according to the teachers’ predictions, before being distilled into the lightweight model, which improves the reliability of the guidance. Experiments on the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets demonstrate that SA-MDRAD is more effective than existing adversarial distillation methods at enhancing the robustness of lightweight image classification models against various adversarial attacks.
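The sample-adaptive fusion idea can be illustrated with a minimal sketch (our own illustration under stated assumptions, not the authors' released implementation): each teacher's softened prediction is weighted, per sample, by the confidence that teacher assigns to the true class, and the fused target supervises the student through a standard distillation loss. The function names, the temperature value, and the confidence-based weighting scheme are illustrative assumptions; the paper's dynamic rectification step, which further adjusts targets for samples the teachers misclassify, is not covered here.

```python
import torch
import torch.nn.functional as F

def fuse_teacher_logits(teacher_logits, labels, temperature=4.0):
    """Fuse per-sample teacher predictions, weighting each teacher by the
    probability it assigns to the ground-truth class (a confidence proxy).

    teacher_logits: list of tensors, each of shape (batch, num_classes)
    labels: long tensor of shape (batch,)
    Returns fused soft targets of shape (batch, num_classes).
    """
    stacked = torch.stack(teacher_logits, dim=0)                 # (T, B, C)
    probs = F.softmax(stacked / temperature, dim=-1)             # softened predictions
    # Per-sample confidence of each teacher on the true class.
    index = labels.view(1, -1, 1).expand(stacked.size(0), -1, 1)
    conf = probs.gather(-1, index).squeeze(-1)                   # (T, B)
    weights = F.softmax(conf, dim=0).unsqueeze(-1)               # normalize over teachers
    return (weights * probs).sum(dim=0)                          # (B, C)

def distillation_loss(student_logits, fused_targets, temperature=4.0):
    """KL divergence between the student's softened prediction and the fused target."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, fused_targets, reduction="batchmean") * temperature ** 2
```

In this sketch, teachers that are confident on a given sample dominate its fused target, while samples on which all teachers are uncertain receive a nearly uniform mixture, which is one plausible way to realize sample-adaptive fusion.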
Data availability
The data that support the findings of this study are openly available in the public domain at https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz, https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz, and http://cs231n.stanford.edu/tiny-imagenet-200.zip.
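As a convenience, the following sketch downloads the CIFAR-10 archive linked above and reads one training batch; the helper name and the choice of batch are illustrative, while the archive layout and pickle keys follow the standard CIFAR-10 python release.

```python
import pickle
import tarfile
import urllib.request

# URL taken from the data availability statement above.
URL = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"

def load_cifar10_batch(archive_path, member="cifar-10-batches-py/data_batch_1"):
    """Read one pickled training batch from the CIFAR-10 python archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        with tar.extractfile(member) as f:
            batch = pickle.load(f, encoding="bytes")
    images = batch[b"data"].reshape(-1, 3, 32, 32)  # (10000, 3, 32, 32) uint8
    labels = batch[b"labels"]
    return images, labels

urllib.request.urlretrieve(URL, "cifar-10-python.tar.gz")
images, labels = load_cifar10_batch("cifar-10-python.tar.gz")
```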
Acknowledgements
This research work was supported by the National Key Research and Development Program of China (2021YFB1006201) and the Major Science and Technology Project of Henan Province, China (221100211200-02).
Funding
National Key Research and Development Program of China (2021YFB1006201); Major Science and Technology Project of Henan Province, China (221100211200-02).
Author information
Authors and Affiliations
Contributions
SL: proposed the idea and wrote the main manuscript text. XY: performed the data analysis. GC: performed the validation. WL: acquired the data. HH: developed the methodology. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that there are no competing interests related to the content of this article.
Additional information
Communicated by Haojie Li.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, S., Yang, X., Cheng, G. et al. SA-MDRAD: sample-adaptive multi-teacher dynamic rectification adversarial distillation. Multimedia Systems 30, 225 (2024). https://doi.org/10.1007/s00530-024-01416-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00530-024-01416-7