Abstract
In this paper we address confidentiality issues in distributed data clustering, in particular the inference problem. We present a measure of inference risk as a function of reconstruction precision and the number of colluders in a distributed data mining group. We also present KDEC-S, a distributed clustering algorithm designed to deliver mining results while preserving the confidentiality of the original data. The underlying idea of the algorithm is to use an approximation of the density estimate such that the original data cannot be reconstructed with a probability better than some given level.
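To make the idea of sharing an approximated density estimate more concrete, the following is a minimal sketch and not KDEC-S itself, since the abstract does not give the algorithm's details. It assumes a one-dimensional Gaussian kernel density estimate, sampled on a coarse regular grid and rounded to a small number of levels. The function names, the grid, the bandwidth and the number of levels are all hypothetical choices made for illustration, and the sketch makes no claim about the confidentiality level actually achieved.

```python
import math


def gaussian_kde(data, x, bandwidth):
    """Gaussian kernel density estimate of `data`, evaluated at point x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
    )


def coarse_density_samples(data, lo, hi, step, bandwidth, levels):
    """Sample the density estimate on a regular grid and quantise it.

    Each sample is rounded to the nearest multiple of peak / levels,
    where peak is the largest sampled value. Only the grid and the
    rounded samples would be shared; the raw points stay local.
    (Illustrative assumption, not the scheme from the paper.)
    """
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    dens = [gaussian_kde(data, g, bandwidth) for g in grid]
    peak = max(dens)
    coarse = [round(d / peak * levels) / levels * peak for d in dens]
    return grid, coarse


# Hypothetical local data set with two groups of points.
data = [0.8, 1.0, 1.2, 4.9, 5.0, 5.1]
grid, coarse = coarse_density_samples(
    data, lo=0.0, hi=6.0, step=0.5, bandwidth=0.5, levels=8
)

for g, c in zip(grid, coarse):
    print(f"{g:4.1f}  {c:.4f}")
```

In this sketch a coarser grid or fewer levels discards more information about the individual points, at the cost of a less accurate density estimate for clustering. That trade-off between reconstruction precision and the usefulness of the result is the quantity the abstract's risk measure is concerned with.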
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
da Silva, J.C., Klusch, M. (2005). Inference on Distributed Data Clustering. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_60
DOI: https://doi.org/10.1007/11510888_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26923-6
Online ISBN: 978-3-540-31891-0
eBook Packages: Computer Science (R0)