Adversarial training of lightweight models is often ineffective because of the limited model size and the difficulty of optimizing a loss with hard labels. Adversarial distillation is a potential solution, in which knowledge from large adversarially pre-trained teachers guides the lightweight model’s learning. However, adversarially pre-training teachers is computationally expensive because it requires iterative gradient steps with respect to the inputs. Additionally, the reliability of the teachers’ guidance diminishes as the lightweight model becomes more robust. In this paper, we propose an adversarial distillation method called Sample-Adaptive Multi-teacher Dynamic Rectification Adversarial Distillation (SA-MDRAD). First, we develop an adversarial distillation framework that distills logits and features from heterogeneous standard pre-trained teachers, which reduces pre-training expense and improves knowledge diversity. Second, the teachers’ knowledge is distilled into the lightweight model after sample-aware dynamic rectification and adaptive fusion based on the teachers’ predictions, which improves the reliability of the knowledge. We evaluate the proposed method on the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. The results demonstrate that SA-MDRAD is more effective than existing adversarial distillation methods at enhancing the robustness of lightweight image classification models against various adversarial attacks.
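To illustrate why adversarial pre-training is costly, the following is a minimal sketch of an iterative input-gradient attack in the style of projected gradient descent (PGD). It is not the SA-MDRAD procedure: the toy logistic model, the function names (`loss_and_input_grad`, `pgd_attack`), and the values of `epsilon`, `step_size`, and `steps` are all assumptions chosen for illustration. Each training example needs `steps` extra gradient evaluations with respect to the input, which is the overhead the abstract refers to.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a toy logistic model and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    tiny = 1e-12  # guards against log(0)
    loss = -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))
    grad_x = [(p - y) * wi for wi in w]  # dL/dx = (p - y) * w
    return loss, grad_x


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(w, b, x, y, epsilon=0.1, step_size=0.025, steps=10):
    """Hypothetical L-infinity attack: repeated signed-gradient ascent on the input."""
    x_adv = list(x)
    for _ in range(steps):
        # One gradient evaluation w.r.t. the input per step.
        _, grad = loss_and_input_grad(w, b, x_adv, y)
        # Move each coordinate in the direction that increases the loss.
        x_adv = [xa + step_size * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the epsilon-ball around the clean input.
        x_adv = [min(max(xa, xi - epsilon), xi + epsilon)
                 for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy usage with made-up weights and a single two-feature example.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = pgd_attack(w, b, x, y)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
```

In this sketch the perturbed input stays within `epsilon` of the clean input in every coordinate while the loss increases. Adversarial training repeats such an inner loop for every mini-batch, which is why avoiding it for the teachers can save substantial compute.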
Data availability
The data that support the findings of this study are openly available in the public domain at https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz, https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz, and http://cs231n.stanford.edu/tiny-imagenet-200.zip.
This research was supported by the National Key Research and Development Program of China (2021YFB1006201) and the Major Science and Technology Project of Henan Province in China (221100211200-02).
National Key Research and Development Program of China, 2021YFB1006201, Major Science and Technology Project of Henan Province in China, 221100211200-02.
SL: proposed the idea and wrote the main manuscript text. XY: performed the data analysis. GC: performed the validation. WL: acquisition of data. HH: methodology. All authors reviewed the manuscript.
The authors declare that there are no competing interests related to the content of this article.
Li, S., Yang, X., Cheng, G. et al. SA-MDRAD: sample-adaptive multi-teacher dynamic rectification adversarial distillation. Multimedia Systems 30, 225 (2024). https://doi.org/10.1007/s00530-024-01416-7