Abstract
Machine learning (ML) has revolutionized various industries, but concerns about privacy and security have emerged as significant challenges. Membership inference attacks (MIAs) pose a serious threat by attempting to determine whether a specific data record was used to train an ML model. In this study, we evaluate three defense strategies against MIAs: data augmentation (DA), dropout combined with L2 regularization, and differential privacy (DP). Through experiments on three benchmark datasets, we assess how effectively these techniques mitigate MIAs while maintaining acceptable model accuracy. Our findings demonstrate that DA not only improves model accuracy but also strengthens privacy protection. Dropout with L2 regularization effectively reduces the success of MIAs without compromising accuracy. Adopting DP, however, introduces a trade-off: it limits the attack's effectiveness but degrades model accuracy. Our DA defense, for instance, shows promising results, with privacy improvements of 12.97%, 15.82%, and 10.28% on the MNIST, CIFAR-10, and CIFAR-100 datasets, respectively. These insights contribute to the growing field of privacy protection in ML and underscore the importance of safeguarding sensitive data. Further research is needed to advance privacy-preserving techniques and address the evolving landscape of ML security.
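Membership inference exploits the gap between a model's behavior on training data and on unseen data, so regularization that narrows this gap is a natural defense. As a minimal sketch, assuming a TensorFlow/Keras setup (the abstract does not specify a framework), the snippet below shows one way to combine the dropout and L2 regularization defense in a small CNN; the layer sizes, dropout rate, and L2 coefficient are illustrative assumptions, not the authors' exact configuration.

```python
# Illustrative sketch only: a small CNN combining two of the evaluated
# defenses, dropout and L2 weight regularization. All hyperparameters
# here are assumed for demonstration.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_regularized_cnn(input_shape=(32, 32, 3), num_classes=10,
                          l2_coeff=1e-4, dropout_rate=0.5):
    """CNN classifier whose train/test generalization gap (and hence
    MIA exposure) is reduced by L2 penalties on the weights and by
    dropout before the output layer."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu",
                      kernel_regularizer=regularizers.l2(l2_coeff)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu",
                      kernel_regularizer=regularizers.l2(l2_coeff)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_coeff)),
        layers.Dropout(dropout_rate),  # randomly zeroes activations during training
        layers.Dense(num_classes, activation="softmax"),
    ])

# Example: a CIFAR-10-shaped model trained with cross-entropy; Keras adds
# the L2 penalty terms to the loss automatically.
model = build_regularized_cnn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```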
Data Availability
Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Ben Hamida, S., Mrabet, H., Chaieb, F. et al. Assessment of data augmentation, dropout with L2 Regularization and differential privacy against membership inference attacks. Multimed Tools Appl 83, 44455–44484 (2024). https://doi.org/10.1007/s11042-023-17394-3