
SA-MDRAD: sample-adaptive multi-teacher dynamic rectification adversarial distillation

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Adversarial training of lightweight models is often ineffective because of their limited capacity and the difficulty of optimizing losses defined over hard labels. Adversarial distillation is a promising remedy, in which knowledge from large, adversarially pre-trained teachers guides the learning of lightweight models. However, adversarially pre-training teachers is computationally expensive, since it requires iterative gradient steps with respect to the inputs, and the reliability of the teachers' guidance diminishes as the lightweight model itself becomes more robust. In this paper, we propose an adversarial distillation method called Sample-Adaptive Multi-teacher Dynamic Rectification Adversarial Distillation (SA-MDRAD). First, we develop an adversarial distillation framework that distills both logits and features from heterogeneous, standard (non-adversarially) pre-trained teachers, reducing pre-training cost and increasing knowledge diversity. Second, the teachers' knowledge is distilled into the lightweight model after sample-aware dynamic rectification and adaptive fusion based on the teachers' predictions, improving the reliability of that knowledge. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that SA-MDRAD is more effective than existing adversarial distillation methods at enhancing the robustness of lightweight image classification models against various adversarial attacks.
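The core idea in the abstract, fusing multiple teachers' soft labels per sample according to how trustworthy each teacher's prediction is, can be sketched in a few lines. The following is an illustrative interpretation only, not the paper's exact formulation: the temperature `T`, the confidence-based weighting scheme, and the function name `sample_adaptive_kd_loss` are all assumptions made for this sketch.

```python
# Illustrative sketch (not the authors' exact method): a sample-adaptive
# multi-teacher distillation loss. Each teacher's soft labels are weighted
# per sample by the confidence that teacher assigns to the ground-truth
# class, so unreliable teachers contribute less on samples they misjudge.
import torch
import torch.nn.functional as F


def sample_adaptive_kd_loss(student_logits, teacher_logits_list, labels, T=4.0):
    """student_logits: (B, C); teacher_logits_list: list of (B, C); labels: (B,)."""
    # Confidence of each teacher in the true class, shape (num_teachers, B).
    confidences = torch.stack([
        F.softmax(t, dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
        for t in teacher_logits_list
    ], dim=0)
    # Per-sample fusion weights across teachers (sharpened softmax; the
    # sharpening factor 0.5 is an arbitrary choice for this sketch).
    weights = F.softmax(confidences / 0.5, dim=0)

    log_p_student = F.log_softmax(student_logits / T, dim=1)
    loss = 0.0
    for w, t in zip(weights, teacher_logits_list):
        p_teacher = F.softmax(t / T, dim=1)  # soft labels from one teacher
        # Per-sample KL divergence between student and this teacher.
        kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=1)
        loss = loss + (w * kl).mean()
    return (T ** 2) * loss  # standard T^2 scaling from Hinton-style KD
```

In a training loop this term would be combined with a task loss on adversarial examples; the paper's actual rectification and feature-level distillation are richer than this logit-only sketch.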


[Figures 1–8 and Algorithm 1 appear in the full article.]


Data availability

The data that support the findings of this study are openly available and can be derived from the following resources available in the public domain at https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz, https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz, and http://cs231n.stanford.edu/tiny-imagenet-200.zip.


Acknowledgements

This research work was supported by the National Key Research and Development Program of China (2021YFB1006201) and the Major Science and Technology Project of Henan Province, China (221100211200-02).

Funding

National Key Research and Development Program of China (2021YFB1006201); Major Science and Technology Project of Henan Province, China (221100211200-02).

Author information

Authors and Affiliations

Authors

Contributions

SL: proposed the idea and wrote the main manuscript text. XY: performed the data analysis. GC: performed the validation. WL: acquisition of data. HH: methodology. All authors reviewed the manuscript.

Corresponding author

Correspondence to Hongchao Hu.

Ethics declarations

Conflict of interest

The authors declare that there are no competing interests related to the content of this article.

Additional information

Communicated by Haojie Li.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Li, S., Yang, X., Cheng, G. et al. SA-MDRAD: sample-adaptive multi-teacher dynamic rectification adversarial distillation. Multimedia Systems 30, 225 (2024). https://doi.org/10.1007/s00530-024-01416-7

