Abstract
As the frontier of machine learning applications moves further into human interaction, multiple concerns arise regarding automated decision-making. Two of the most critical are fairness and data privacy. On the one hand, one must guarantee that automated decisions are not biased against certain groups, especially those that are underprivileged or marginalized. On the other hand, one must ensure that the use of personal information fully abides by privacy regulations and that user identities are kept safe. Balancing privacy, fairness, and predictive performance is complex; yet, despite their potential societal impact, our understanding of the dynamics between these three optimization vectors remains poor. In this paper, we study this three-way tension and how the optimization of each vector impacts the others, aiming to inform the future development of safe applications. In light of claims that predictive performance and fairness can be jointly optimized, we find that this is possible only at the expense of data privacy. Overall, experimental results show that one of the vectors is penalized regardless of which of the three we optimize. Nonetheless, we find promising avenues for future work in joint optimization solutions, where smaller trade-offs are observed between the three vectors.
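To make the three vectors concrete, the sketch below (illustrative only, not the paper's experimental pipeline) scores a single model on all three axes at once: accuracy for predictive performance, the demographic parity difference for fairness, and a distance-to-closest-record proxy for privacy risk. The toy dataset, the synthetic protected attribute `group`, and the noised "release" sample are all assumptions made for this example.

```python
# Minimal sketch: scoring one model on predictive performance, fairness,
# and a privacy proxy. Toy data and variable names are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
group = rng.integers(0, 2, size=len(y))  # toy binary protected attribute

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# 1) Predictive performance: plain accuracy on the held-out set.
acc = accuracy_score(y_te, pred)

# 2) Fairness: demographic parity difference,
#    |P(pred=1 | group=0) - P(pred=1 | group=1)|.
rates = [pred[g_te == g].mean() for g in (0, 1)]
dpd = abs(rates[0] - rates[1])

# 3) Privacy proxy: mean distance to closest record (DCR) between a noised
#    "release" version of the training data and the original records;
#    smaller distances suggest higher re-identification risk.
release = X_tr + rng.normal(scale=0.1, size=X_tr.shape)
nn = NearestNeighbors(n_neighbors=1).fit(X_tr)
dcr = nn.kneighbors(release)[0].mean()

print(f"accuracy={acc:.3f}  dem. parity diff={dpd:.3f}  mean DCR={dcr:.3f}")
```

Optimizing any one of these three numbers in isolation (e.g., adding more noise to `release` to raise the DCR) will generally move the other two, which is the trade-off the paper studies.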
Notes
1. Acronym for Fairness, Accountability, and Transparency.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Carvalho, T., Moniz, N., Antunes, L. (2023). A Three-Way Knot: Privacy, Fairness, and Predictive Performance Dynamics. In: Moniz, N., Vale, Z., Cascalho, J., Silva, C., Sebastião, R. (eds) Progress in Artificial Intelligence. EPIA 2023. Lecture Notes in Computer Science, vol. 14115. Springer, Cham. https://doi.org/10.1007/978-3-031-49008-8_5