Preserving Real-World Robustness of Neural Networks Under Sparsity Constraints

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Abstract

Successful deployment of deep neural networks in physical applications requires various resource constraints and real-world robustness considerations to be satisfied simultaneously. While larger models have been shown to inherently yield robustness, they also come with massive demands on computational power, energy, or memory, which renders them unsuitable for deployment on resource-constrained embedded devices. Our work focuses on practical real-world robustness properties of neural networks under such limitations, particularly memory-related sparsity constraints. We overcome both challenges by efficiently incorporating state-of-the-art data augmentation methods within the model compression pipeline to maintain robustness. We empirically evaluate various dense models and their pruned counterparts on a comprehensive set of real-world robustness evaluation metrics, including out-of-distribution generalization and resilience against universal adversarial patch attacks. We show that implementing data augmentation strategies only during the pruning and finetuning phases is more critical for the robustness of networks under sparsity constraints than aiming for robustness while pre-training overparameterized dense models in the first place. Our results demonstrate that sparse models obtained via data-augmentation-driven pruning can even outperform dense models that are trained end to end with exhaustive data augmentation.
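To make the pipeline concrete, the sketch below applies robustness-oriented data augmentation only while a pretrained dense model is iteratively pruned and finetuned. This is a minimal sketch under stated assumptions, not the authors' exact recipe: it assumes PyTorch's built-in magnitude-pruning utilities and torchvision's AugMix transform, and the prune_and_finetune schedule, sparsity target, and optimizer settings are hypothetical placeholders.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import transforms

# Robustness-oriented augmentation used only in the pruning/finetuning phase;
# AugMix is applied before ToTensor, i.e. on PIL images (torchvision >= 0.13).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.AugMix(),
    transforms.ToTensor(),
])

def prune_and_finetune(model, loader, target_sparsity=0.9, rounds=5, epochs=2):
    """Iteratively prune a pretrained model to target_sparsity, finetuning on
    augmented batches between rounds (illustrative schedule)."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    # Each round prunes this fraction of the *remaining* weights, so the
    # rounds compound to the overall target sparsity.
    per_round = 1.0 - (1.0 - target_sparsity) ** (1.0 / rounds)
    prunable = [(m, "weight") for m in model.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear))]
    for _ in range(rounds):
        # Global magnitude pruning across all conv/linear weights.
        prune.global_unstructured(prunable,
                                  pruning_method=prune.L1Unstructured,
                                  amount=per_round)
        for _ in range(epochs):
            for x, y in loader:  # loader yields batches with train_transform
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
    # Bake the masks in by removing the pruning re-parametrization.
    for m, name in prunable:
        prune.remove(m, name)
    return model

The design choice this mirrors is that augmentation enters only after the dense model exists: the pruning and finetuning steps see augmented data, while dense pre-training does not need to.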


Acknowledgments

This work has been supported by the Graz Center for Machine Learning (GraML) and the “University SAL Labs” initiative of Silicon Austria Labs (SAL) and its Austrian partner universities for applied fundamental research for electronic-based systems.

Author information

Corresponding author

Correspondence to Ozan Özdenizci.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gritsch, J.V., Legenstein, R., Özdenizci, O. (2024). Preserving Real-World Robustness of Neural Networks Under Sparsity Constraints. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14945. Springer, Cham. https://doi.org/10.1007/978-3-031-70362-1_20

  • DOI: https://doi.org/10.1007/978-3-031-70362-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70361-4

  • Online ISBN: 978-3-031-70362-1

  • eBook Packages: Computer Science, Computer Science (R0)
