
A Three-Way Knot: Privacy, Fairness, and Predictive Performance Dynamics

  • Conference paper
  • First Online:
Progress in Artificial Intelligence (EPIA 2023)

Abstract

As the frontier of machine learning applications moves further into human interaction, multiple concerns arise regarding automated decision-making. Two of the most critical are fairness and data privacy. On the one hand, one must guarantee that automated decisions are not biased against certain groups, especially those that are unprotected or marginalized. On the other hand, one must ensure that the use of personal information fully abides by privacy regulations and that user identities are kept safe. The balance between privacy, fairness, and predictive performance is complex; yet, despite their potential societal impact, the dynamics between these optimization vectors remain poorly understood. In this paper, we study this three-way tension and how optimizing each vector impacts the others, aiming to inform the future development of safe applications. In light of claims that predictive performance and fairness can be jointly optimized, we find that this is possible only at the expense of data privacy. Overall, experimental results show that one of the vectors is penalized regardless of which of the three is optimized. Nonetheless, we find promising avenues for future work in joint optimization solutions, where smaller trade-offs are observed between the three vectors.
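As a concrete illustration of two of the three vectors the abstract refers to, the sketch below computes predictive performance (accuracy) and a common group-fairness measure (demographic parity difference, the gap in positive-prediction rates between groups). This is not the paper's method; the labels, predictions, and protected attribute are invented for illustration.

```python
# Illustrative sketch (hypothetical data, not from the paper): measuring
# predictive performance and group fairness for a binary classifier.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rates across groups."""
    rates = {}
    for g in set(group):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates[g] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())

# Hypothetical labels, model predictions, and a binary protected attribute.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
group  = ["a", "a", "a", "a", "b", "b", "b", "b"]

acc = accuracy(y_true, y_pred)                       # 5/8 = 0.625
dpd = demographic_parity_difference(y_pred, group)   # |2/4 - 3/4| = 0.25
```

A trade-off study like the one described would track such metrics jointly (together with a privacy measure, e.g. re-identification risk) while varying the training setup, rather than optimizing any one in isolation.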




Author information

Corresponding author

Correspondence to Tânia Carvalho.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Carvalho, T., Moniz, N., Antunes, L. (2023). A Three-Way Knot: Privacy, Fairness, and Predictive Performance Dynamics. In: Moniz, N., Vale, Z., Cascalho, J., Silva, C., Sebastião, R. (eds) Progress in Artificial Intelligence. EPIA 2023. Lecture Notes in Computer Science(), vol 14115. Springer, Cham. https://doi.org/10.1007/978-3-031-49008-8_5


  • DOI: https://doi.org/10.1007/978-3-031-49008-8_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49007-1

  • Online ISBN: 978-3-031-49008-8

  • eBook Packages: Computer Science (R0)
