Abstract
Successful deployment of deep neural networks in physical applications requires resource constraints and real-world robustness considerations to be satisfied simultaneously. While larger models have been shown to be inherently more robust, they also place heavy demands on computation, energy, and memory, rendering them unsuitable for deployment on resource-constrained embedded devices. Our work focuses on practical real-world robustness properties of neural networks under such limitations, in particular under memory-related sparsity constraints. We overcome both challenges by efficiently incorporating state-of-the-art data augmentation methods within the model compression pipeline to maintain robustness. We empirically evaluate various dense models and their pruned counterparts on a comprehensive set of real-world robustness evaluation metrics, including out-of-distribution generalization and resilience against universal adversarial patch attacks. We show that applying data augmentation strategies only during the pruning and finetuning phases is more critical to the robustness of sparse networks than aiming for robustness while pre-training overparameterized dense models in the first place. Results demonstrate that our sparse models obtained via data-augmentation-driven pruning can even outperform dense models that are trained end-to-end with exhaustive data augmentation.
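The abstract describes the approach only at a high level; the sketch below illustrates one way the core idea, introducing strong data augmentation only during pruning and finetuning rather than during dense pretraining, could look in PyTorch. All concrete choices here (torchvision's ResNet-18 and AugMix, CIFAR-10, global L1 magnitude pruning at 90% sparsity) are illustrative assumptions, not the paper's actual models, datasets, or pruning method.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
import torchvision
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Start from a pretrained dense model; no strong augmentation is assumed
# to have been used during this pretraining stage.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 10)  # adapt head to 10 classes
model = model.to(device)

# Strong augmentation enters the pipeline only now, for pruning/finetuning.
augment = transforms.Compose([
    transforms.AugMix(),   # one possible augmentation policy, for illustration
    transforms.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10(
    root="data", train=True, download=True, transform=augment)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Global unstructured magnitude pruning over all conv/linear weights.
to_prune = [(m, "weight") for m in model.modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))]
prune.global_unstructured(
    to_prune, pruning_method=prune.L1Unstructured, amount=0.9)  # 90% sparsity

# Finetune the pruned model on augmented data; the pruning masks stay
# fixed, so the zeroed effective weights remain zero.
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()
model.train()
for epoch in range(1):  # a single epoch shown for brevity
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Fold the masks into the weights to obtain a permanently sparse model.
for module, name in to_prune:
    prune.remove(module, name)
```

The ordering mirrors the abstract's claim: the dense model is pretrained without the augmentation policy, which is applied only once sparsification and finetuning begin.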
Acknowledgments
This work has been supported by the Graz Center for Machine Learning (GraML) and the “University SAL Labs” initiative of Silicon Austria Labs (SAL) and its Austrian partner universities for applied fundamental research on electronic-based systems.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gritsch, J.V., Legenstein, R., Özdenizci, O. (2024). Preserving Real-World Robustness of Neural Networks Under Sparsity Constraints. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds.) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14945. Springer, Cham. https://doi.org/10.1007/978-3-031-70362-1_20
DOI: https://doi.org/10.1007/978-3-031-70362-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70361-4
Online ISBN: 978-3-031-70362-1
eBook Packages: Computer Science, Computer Science (R0)