Abstract
Applying Data Governance (DG) to Federated Machine Learning (FML) could help produce better Machine Learning models, yet such an application is still almost nonexistent in the literature. As part of a proposal for applying DG to FML, we first present a metadata approach for FML that provides accountability and assists with the continuous improvement of models in the federation. Our proposal includes a metadata model for tracing the operations of participants and for collecting all information regarding the definition of goals and the configuration of FML training processes. Additionally, we outline a metadata management system as part of a broader DG architecture. Finally, we show some use cases of metadata management.
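To make the idea of a metadata model for tracing participant operations and training configuration more concrete, the following is a minimal sketch in Python. It is not the model proposed in the chapter: all class names, field names, and example values (`TrainingConfig`, `OperationRecord`, `TrainingProcessMetadata`, "FedAvg", "client-a") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical record of the goal and configuration of one FML training process."""
    goal: str
    rounds: int
    aggregation: str  # illustrative value such as "FedAvg"; an assumption, not from the source


@dataclass(frozen=True)
class OperationRecord:
    """Hypothetical trace entry: who did what, in which round, and when."""
    participant_id: str
    operation: str
    round_number: int
    timestamp: datetime


@dataclass
class TrainingProcessMetadata:
    """Collects the configuration and an append-only log of participant operations."""
    config: TrainingConfig
    operations: List[OperationRecord] = field(default_factory=list)

    def record(self, participant_id: str, operation: str, round_number: int) -> OperationRecord:
        # Store a UTC timestamp so entries from different participants are comparable.
        entry = OperationRecord(
            participant_id, operation, round_number, datetime.now(timezone.utc)
        )
        self.operations.append(entry)
        return entry

    def trace(self, participant_id: str) -> List[OperationRecord]:
        # Return every logged operation of one participant, in insertion order.
        return [op for op in self.operations if op.participant_id == participant_id]


# Usage: log two participants and trace one of them.
meta = TrainingProcessMetadata(TrainingConfig("next-word prediction", 3, "FedAvg"))
meta.record("client-a", "local_training", 1)
meta.record("client-b", "local_training", 1)
meta.record("client-a", "model_upload", 1)
client_a_ops = [op.operation for op in meta.trace("client-a")]
```

Keeping the configuration immutable and the operation log append-only is one possible design choice for accountability: a later audit can rely on records not having been altered after they were written.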
Acknowledgment
Funded by the German Federal Ministry of Education and Research. Project name: KIWI, RefNr: 16KIS1142K.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Peregrina, J.A., Ortiz, G., Zirpins, C. (2022). Towards a Metadata Management System for Provenance, Reproducibility and Accountability in Federated Machine Learning. In: Zirpins, C., et al. Advances in Service-Oriented and Cloud Computing. ESOCC 2022. Communications in Computer and Information Science, vol 1617. Springer, Cham. https://doi.org/10.1007/978-3-031-23298-5_1
DOI: https://doi.org/10.1007/978-3-031-23298-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23297-8
Online ISBN: 978-3-031-23298-5
eBook Packages: Computer Science, Computer Science (R0)