Abstract
As machine learning (ML) becomes more widely used in the healthcare sector, concerns are growing about potential biases and privacy risks. One countermeasure is federated learning (FL), which supports collaborative learning without requiring patient data to be shared across organizations. However, the inherent heterogeneity of data distributions among participating FL parties poses challenges for studying group fairness in FL. While personalization within FL can mitigate the performance degradation caused by data heterogeneity, its influence on group fairness has not been fully investigated. The primary focus of this study is therefore to rigorously assess the impact of personalized FL on group fairness in the healthcare domain, offering a comprehensive understanding of how personalized FL affects group fairness in clinical outcomes. We conduct an empirical analysis using two prominent real-world Electronic Health Record (EHR) datasets, eICU and MIMIC-IV. Our methodology involves a thorough comparison between personalized FL and two baselines: standalone training, where models are developed independently without FL collaboration, and standard FL, which learns a global model via the FedAvg algorithm. We adopt Ditto as our personalized FL approach, which enables each client in FL to develop its own personalized model through multi-task learning. Our assessment comprises a series of evaluations comparing the predictive performance (i.e., AUROC and AUPRC) and fairness gaps (i.e., EOPP, EOD, and DP) of these methods. Personalized FL demonstrates superior predictive accuracy and fairness over standalone training across both datasets. Nevertheless, compared with standard FL, personalized FL improves predictive accuracy but does not consistently offer better fairness outcomes.
For instance, in the 24-h in-hospital mortality prediction task, personalized FL achieves an average EOD of 27.4% across racial groups in the eICU dataset and 47.8% in MIMIC-IV. In comparison, standard FL records a better EOD of 26.2% for eICU and 42.0% for MIMIC-IV, while standalone training yields significantly worse EODs of 69.4% and 54.7% on these datasets, respectively. Our analysis reveals that personalized FL has the potential to enhance fairness relative to standalone training, yet it does not consistently ensure fairness improvements over standard FL. Our findings also show that while personalization can improve fairness for more biased hospitals (i.e., hospitals with larger fairness gaps in standalone training), it can exacerbate fairness issues for less biased ones. These insights suggest that integrating personalized FL with additional strategic designs could be key to simultaneously boosting prediction accuracy and reducing fairness disparities. The findings and opportunities outlined in this paper can inform the research agenda for future studies aimed at overcoming these limitations and further advancing health equity research.
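To make the reported fairness gaps concrete, the sketch below shows one common way to compute the DP, EOPP, and EOD gaps between two demographic groups from binary predictions. This is an illustrative implementation, not the paper's exact code: definitions of EOD vary in the literature (some works average the TPR and FPR gaps rather than taking the worse of the two, as done here), so the paper's reported numbers may use a slightly different formula.

```python
import numpy as np

def group_rates(y_true, y_pred, mask):
    """Positive-prediction rate, TPR, and FPR within one demographic group."""
    yt, yp = y_true[mask], y_pred[mask]
    ppr = yp.mean()  # P(yhat = 1 | group)
    tpr = yp[yt == 1].mean() if (yt == 1).any() else 0.0  # P(yhat = 1 | y = 1, group)
    fpr = yp[yt == 0].mean() if (yt == 0).any() else 0.0  # P(yhat = 1 | y = 0, group)
    return ppr, tpr, fpr

def fairness_gaps(y_true, y_pred, group):
    """DP, EOPP, and EOD gaps between two demographic groups coded 0/1."""
    ppr0, tpr0, fpr0 = group_rates(y_true, y_pred, group == 0)
    ppr1, tpr1, fpr1 = group_rates(y_true, y_pred, group == 1)
    dp = abs(ppr0 - ppr1)                # demographic parity gap
    eopp = abs(tpr0 - tpr1)              # equal-opportunity gap (TPR gap)
    eod = max(eopp, abs(fpr0 - fpr1))    # equalized-odds gap (worst of TPR/FPR gaps)
    return dp, eopp, eod
```

A smaller gap means the model behaves more similarly across groups; the percentages quoted above correspond to these gaps expressed as percentages, averaged over pairs of racial groups.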













Code Availability
Code will be made available upon request.
Funding
The work of Y. Guo was partially supported by NSF CNS-2106761, CMMI-2222670, and UTSA Office of the Vice President for Research, Economic Development, and Knowledge Enterprise. The work of Y. Gong was partially supported by NSF CNS-2047761, CNS-2106761, and Cisco Research Award. The work of J. Cai was partially supported by NSF CMMI-2222670.
Author information
Authors and Affiliations
Contributions
Tongnian Wang: conception, implementation, analysis, and writing. Kai Zhang: writing support and cross-reading. Jiannan Cai: writing support and cross-reading. Yanmin Gong: conception, writing support, and cross-reading. Kim-Kwang Raymond Choo: conception and co-supervision. Yuanxiong Guo: conception and supervision. All authors contributed to the manuscript and reviewed it.
Corresponding author
Ethics declarations
Ethics Approval
Not applicable.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Conflict of Interest
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, T., Zhang, K., Cai, J. et al. Analyzing the Impact of Personalization on Fairness in Federated Learning for Healthcare. J Healthc Inform Res 8, 181–205 (2024). https://doi.org/10.1007/s41666-024-00164-7