Abstract
This paper proposes a novel non-model sharing-type collaborative learning method for distributed data analysis, in which data are partitioned in both samples and features. Analyzing such distributed data is an essential task in many applications, e.g., medical data analysis and manufacturing data analysis, where privacy and confidentiality concerns make it difficult to share the raw data. By centralizing the intermediate representations that are individually constructed in each party, the proposed method achieves collaborative analysis without revealing the individual data, while the learning models remain distributed over local parties. Numerical experiments indicate that the proposed method achieves higher recognition performance than individual analysis on artificial and real-world problems.
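The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the intermediate representations are built or made comparable, so the sketch assumes a private SVD-based linear map per party and a shared, non-private anchor dataset used for alignment at the center. The names `local_map`, `anchor`, `k`, and `m`, and all sizes, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 60 samples x 8 features, partitioned into 2 sample groups
# and 2 feature groups, i.e. 4 parties that each hold one block.
X = rng.normal(size=(60, 8))
row_blocks = np.array_split(np.arange(60), 2)
col_blocks = np.array_split(np.arange(8), 2)

# ASSUMPTION: a shared anchor dataset (here random rows) that every party
# can see. It carries no private information.
anchor = rng.normal(size=(20, 8))


def local_map(block, k):
    """Private linear map of one party (assumed form): the top-k right
    singular vectors of its centered data block."""
    centered = block - block.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # shape: (local features, k)


k = 2  # dimension of each party's intermediate representation
data_reps, anchor_reps = {}, {}
for i, rows in enumerate(row_blocks):
    for j, cols in enumerate(col_blocks):
        F = local_map(X[np.ix_(rows, cols)], k)  # never leaves the party
        # Only these two reduced matrices are sent to the center.
        data_reps[i, j] = X[np.ix_(rows, cols)] @ F
        anchor_reps[i, j] = anchor[:, cols] @ F

# Center: join the feature groups of each sample group side by side.
n_col = len(col_blocks)
X_tilde = [np.hstack([data_reps[i, j] for j in range(n_col)])
           for i in range(len(row_blocks))]
A_tilde = [np.hstack([anchor_reps[i, j] for j in range(n_col)])
           for i in range(len(row_blocks))]

# ASSUMPTION: align the sample groups by mapping each group's anchor
# representation onto a common target Z in the least-squares sense.
m = 4
U, _, _ = np.linalg.svd(np.hstack(A_tilde), full_matrices=False)
Z = U[:, :m]
collab = np.vstack([
    Xi @ np.linalg.lstsq(Ai, Z, rcond=None)[0]
    for Xi, Ai in zip(X_tilde, A_tilde)
])

print(collab.shape)  # one m-dimensional row per original sample
```

In this sketch the center only ever receives reduced matrices, while each party keeps its raw block and its map `F`. A model trained on `collab` would then have to be composed with each party's private map before use, which is one way to read the statement that the learning models remain distributed over local parties.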
The present study is supported in part by JST/ACT-I (No. JPMJPR16U6), NEDO and JSPS/Grants-in-Aid for Scientific Research (Nos. 17K12690, 18H03250, 19KK0255).
Notes
1. Available at http://featureselection.asu.edu/datasets.php.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Imakura, A., Ye, X., Sakurai, T. (2021). Collaborative Data Analysis: Non-model Sharing-Type Machine Learning for Distributed Data. In: Uehara, H., Yamaguchi, T., Bai, Q. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2021. Lecture Notes in Computer Science(), vol 12280. Springer, Cham. https://doi.org/10.1007/978-3-030-69886-7_2
DOI: https://doi.org/10.1007/978-3-030-69886-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69885-0
Online ISBN: 978-3-030-69886-7
eBook Packages: Computer Science (R0)