{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,7,18]],"date-time":"2023-07-18T10:52:16Z","timestamp":1689677536341},"reference-count":30,"publisher":"IGI Global","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,4,1]]},"abstract":"

Privacy concerns often prevent organizations from sharing data for data mining purposes. There has been a rich literature on privacy preserving data mining techniques that can protect privacy and still allow accurate mining. Many such techniques have some parameters that need to be set correctly to achieve the desired balance between privacy protection and quality of mining results. However, there has been little research on how to tune these parameters effectively. This paper studies the problem of tuning the group size parameter for a popular privacy preserving distance-based mining technique: the condensation method. The contributions include: 1) a class-wise condensation method that selects an appropriate group size based on heuristics and avoids generating groups with mixed classes, 2) a rule-based approach that uses binary search and several rules to further optimize the setting for the group size parameter. The experimental results demonstrate the effectiveness of the authors\u2019 approach.","DOI":"10.4018\/jisp.2012040102","type":"journal-article","created":{"date-parts":[[2012,12,11]],"date-time":"2012-12-11T16:39:20Z","timestamp":1355243960000},"page":"16-33","source":"Crossref","is-referenced-by-count":3,"title":["Optimizing Privacy-Accuracy Tradeoff for Privacy Preserving Distance-Based Classification"],"prefix":"10.4018","volume":"6","author":[{"given":"Dongjin","family":"Kim","sequence":"first","affiliation":[{"name":"University of Maryland Baltimore County, USA"}]},{"given":"Zhiyuan","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Maryland Baltimore County, USA"}]},{"given":"Aryya","family":"Gangopadhyay","sequence":"additional","affiliation":[{"name":"University of Maryland Baltimore County, USA"}]}],"member":"2432","reference":[{"key":"jisp.2012040102-0","doi-asserted-by":"crossref","unstructured":"Aggarwal, C. C., & Yu, P. S. (2004). A condensation approach to privacy preserving data mining. 
In Proceedings of the 9th International Conference on Extending Database Technology, Heraklion, Crete, Greece.","DOI":"10.1007\/978-3-540-24741-8_12"},{"key":"jisp.2012040102-1","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-70992-5","author":"C. C.Aggarwal","year":"2008","journal-title":"Privacy-preserving data mining: Models and algorithms"},{"key":"jisp.2012040102-2","doi-asserted-by":"crossref","unstructured":"Agrawal, D., & Aggarwal, C. C. (2001). On the design and quantification of privacy preserving data mining algorithms. In Proceedings of the 20th ACM SIGMOD SIGACT-SIGART Symposium on Principles of Database Systems, Santa Barbara, CA (pp. 247-255).","DOI":"10.1145\/375551.375602"},{"key":"jisp.2012040102-3","doi-asserted-by":"crossref","unstructured":"Agrawal, R., & Srikant, R. (2000). Privacy preserving data mining. In Proceedings of the ACM SIGMOD Conference on Management of Data, Dallas, TX (pp. 439-450).","DOI":"10.1145\/335191.335438"},{"key":"jisp.2012040102-4","unstructured":"Banerjee, M., Chen, Z., & Gangopadhyay, A. (2010). A utility-aware and holistic approach for privacy preserving distributed mining with worst case privacy guarantee. In Proceedings of the Secure Knowledge Management Workshop, New Brunswick, NJ."},{"key":"jisp.2012040102-5","doi-asserted-by":"crossref","unstructured":"Bayardo, R. J., & Agrawal, R. (2005). Data privacy through optimal k-anonymization. In Proceedings of the 21st International Conference on Data Engineering, Tokyo, Japan (pp. 217-228).","DOI":"10.1109\/ICDE.2005.42"},{"key":"jisp.2012040102-6","unstructured":"Chen, K., & Liu, L. (2005). A random rotation perturbation approach to privacy-preserving data classification. In Proceedings of the Fifth IEEE International Conference on Data Mining, Houston, TX (pp. 589-592)."},{"key":"jisp.2012040102-7","doi-asserted-by":"crossref","unstructured":"Dwork, C. (2006). Differential privacy. 
In Proceedings of the 33rd International Colloquium on Automata, Languages and Programming, Part II, Venice, Italy (pp. 1-12).","DOI":"10.1007\/11787006_1"},{"key":"jisp.2012040102-8","unstructured":"Federal Trade Commission. (2007). Identity theft resource center: Facts and statistics: Find out more about the nation\u2019s fastest growing crime. Retrieved from http:\/\/www.idtheftcenter.org\/artman2\/publish\/m_facts\/Facts_and_Statistics.shtml"},{"key":"jisp.2012040102-9","unstructured":"Gartner Inc. (2007). Gartner says number of identity theft victims has increased more than 50 percent since 2003. Retrieved from http:\/\/www.gartner.com\/it\/page.jsp?id=501912"},{"key":"jisp.2012040102-10","author":"O.Goldreich","year":"1998","journal-title":"Secure multi-party computation"},{"key":"jisp.2012040102-11","unstructured":"Hettich, S., Blake, C. L., & Merz, C. J. (1998). UCI Repository of machine learning databases. Retrieved from http:\/\/www.ics.uci.edu\/~mlearn\/MLRepository.html"},{"key":"jisp.2012040102-12","doi-asserted-by":"crossref","unstructured":"Huang, Z., Du, W., & Chen, B. (2005). Deriving private information from randomized data. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, MD (pp. 37-48).","DOI":"10.1145\/1066157.1066163"},{"key":"jisp.2012040102-13","doi-asserted-by":"crossref","unstructured":"Kargupta, H., Datta, S., Wang, Q., & Sivakumar, K. (2003). On the privacy preserving properties of random data perturbation techniques. In Proceedings of the Third IEEE International Conference on Data Mining, Melbourne, FL (pp. 99-106).","DOI":"10.1109\/ICDM.2003.1250908"},{"key":"jisp.2012040102-14","unstructured":"Kim, J. J., & Winkler, W. E. (2003). Multiplicative noise for masking continuous data (Tech. Rep. No. 2003-01). Washington, DC: Statistical Research Division, U.S. Bureau of the Census."},{"key":"jisp.2012040102-15","doi-asserted-by":"crossref","unstructured":"LeFevre, K., DeWitt, D. 
J., & Ramakrishnan, R. (2005). Incognito: Efficient full-domain k-anonymity. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, MD (pp. 49-60).","DOI":"10.1145\/1066157.1066164"},{"key":"jisp.2012040102-16","doi-asserted-by":"crossref","unstructured":"LeFevre, K., DeWitt, D. J., & Ramakrishnan, R. (2006a). Mondrian multidimensional k-anonymity. In Proceedings of the 22nd International Conference on Data Engineering, Atlanta, GA (p. 25).","DOI":"10.1109\/ICDE.2006.101"},{"key":"jisp.2012040102-17","doi-asserted-by":"crossref","unstructured":"LeFevre, K., DeWitt, D. J., & Ramakrishnan, R. (2006b). Workload-aware anonymization. In Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA (pp. 277-286).","DOI":"10.1145\/1150402.1150435"},{"key":"jisp.2012040102-18","doi-asserted-by":"crossref","unstructured":"Li, N., Li, T., & Venkatasubramanian, S. (2007). t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In Proceedings of the 23rd International Conference on Data Engineering, Istanbul, Turkey (pp. 106-115).","DOI":"10.1109\/ICDE.2007.367856"},{"key":"jisp.2012040102-19","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2006.14"},{"issue":"1","key":"jisp.2012040102-20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1217299.1217302","article-title":"L-diversity: Privacy beyond k-anonymity.","volume":"1","author":"A.Machanavajjhala","year":"2007","journal-title":"ACM Transactions on Knowledge Discovery from Data"},{"key":"jisp.2012040102-21","unstructured":"MacVittie, D. (2007, August 31). Javelin 2006 identity fraud report. 
Network Computing."},{"key":"jisp.2012040102-22","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2008.03.004"},{"key":"jisp.2012040102-23","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-006-0010-5"},{"key":"jisp.2012040102-24","doi-asserted-by":"publisher","DOI":"10.1109\/69.971193"},{"key":"jisp.2012040102-25","doi-asserted-by":"publisher","DOI":"10.1142\/S0218488502001648"},{"key":"jisp.2012040102-26","doi-asserted-by":"publisher","DOI":"10.1142\/S021848850200165X"},{"key":"jisp.2012040102-27","author":"J.Vaidya","year":"2005","journal-title":"Privacy preserving data mining (Advances in Information Security)"},{"key":"jisp.2012040102-28","unstructured":"Wong, R. C. W., Li, J., Fu, A., & Wang, K. (2006). (alpha, k)-anonymity: An enhanced k-anonymity model for privacy preserving data publishing. In Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA (pp. 754-759)."},{"key":"jisp.2012040102-29","unstructured":"Xiao, X., & Tao, Y. (2006). Anatomy: Simple and effective privacy preservation. In Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea (pp. 
139-150)."}],"container-title":["International Journal of Information Security and Privacy"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=68819","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T22:56:48Z","timestamp":1654124208000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jisp.2012040102"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2012,4,1]]},"references-count":30,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,4]]}},"URL":"https:\/\/doi.org\/10.4018\/jisp.2012040102","relation":{},"ISSN":["1930-1650","1930-1669"],"issn-type":[{"value":"1930-1650","type":"print"},{"value":"1930-1669","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,4,1]]}}}