Abstract
This paper introduces a novel constraint-based hiding model to drastically reduce the preprocessing overhead that is incurred by border-based techniques in the hiding of sensitive frequent itemsets. The proposed model is solved by an efficient constraint-based mining algorithm that pushes a conjunction of antimonotone constraints into an Apriori-like algorithm, for inducing the support theory of non-sensitive frequent itemsets along with its negative border. The patterns induced by the constraint-based mining algorithm can be used in border-based hiding algorithms to construct a sanitized version of the original database, where the sensitive knowledge is concealed. The efficiency of the constraint-based mining algorithm is evaluated on real and synthetic datasets.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
The algorithm processes the itemsets levelwise: it firstly determines the frequent itemsets of size 1, then generates all itemesets of size 2 and determines which of these are frequent, etc. At the n-th level, all frequent itemsets of size n-1 are known and an itemset of size n is frequent if all its subsets of size n-1 are frequent (the Apriory property).
- 2.
Non-sensitive itemsets are indifferent to our problem and there is to need to be specified explicitly.
References
Abul, O., Gökçe, H.: Knowledge hiding from tree and graph databases. Data Knowl. Eng. 72, 148–171 (2012)
Aggarwal, C., Yu, P.: Privacy-Preserving Data Mining: Models and Algorithms. Advances in Database Systems. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-70992-5
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB, pp. 487–499 (1994)
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD Conference, pp. 439–450 (2000)
Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M., Verykios, V.: Disclosure limitation of sensitive rules. In: KDEX Workshop, pp. 45–52. IEEE (1999)
Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D.: Anonymity preserving pattern discovery. VLDB J. 17(4), 703–727 (2008)
Bayardo, R.: Efficiently mining long patterns from databases. In: Proceedings of SIGMOD 1998, pp. 85–93 (1998)
Bodon, F.: A fast APRIORI implementation. In: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, FIMI 2003, vol. 90, pp. 56–65 (2003)
Bonchi, F., Ferrari, E.: Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. CRC Press Inc., Boca Raton (2011)
Bonchi, F., Lucchese, C.: On condensed representations of constrained frequent patterns. Knowl. Inf. Syst. 9(2), 180–201 (2006)
Bonchi, F., et al.: Privacy in spatiotemporal data mining. In: Giannotti, F., Pedreschi, D. (eds.) Mobility, Data Mining and Privacy, pp. 297–333. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-75177-9_12
Borgelt, C.: Frequent item set mining. Wiley Interdisc. Rev.: Data Min. Knowl. Discov. 2(6), 437–456 (2012)
Boulicaut, J.-F., Jeudy, B.: Constraint-based data mining. In: Maimon, O., Rokach, L. (eds.) The Data Mining and Knowledge Discovery Handbook, pp. 399–416. Springer, Boston (2005). https://doi.org/10.1007/0-387-25465-X_18
Bu, S., Lakshmanan, L.V.S., Ng, R.T., Ramesh, G.: Preservation of patterns and input-output privacy. In: ICDE, pp. 696–705 (2007)
Clifton, C.: Protecting against data mining through samples. In: Atluri, V., Hale, J. (eds.) Research Advances in Database and Information Systems Security. ITIFIP, vol. 43, pp. 193–207. Springer, Boston, MA (2000). https://doi.org/10.1007/978-0-387-35508-5_13
Dasseni, E., Verykios, V.S., Elmagarmid, A.K., Bertino, E.: Hiding association rules by using confidence and support. In: Moskowitz, I.S. (ed.) IH 2001. LNCS, vol. 2137, pp. 369–383. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45496-9_27
Delis, A., Verykios, V.S., Tsitsonis, A.A.: A data perturbation approach to sensitive classification rule hiding. In: SAC, pp. 605–609 (2010)
Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. Inf. Syst. 29(4), 343–364 (2004)
Fredman, M.L., Khachiyan, L.: On the complexity of dualization of monotone disjunctive normal forms. J. Algorithms 21, 618–628 (1996)
Frequent Itemset Mining Dataset Repository. http://fimi.ua.ac.be/data/
Gkoulalas-Divanis, A., Verykios, V.: Association Rule Hiding for Data Mining. Advances in Database Systems. Springer, Boston (2010). https://doi.org/10.1007/978-1-4419-6569-1
Gkoulalas-Divanis, A., Verykios, V.S.: An integer programming approach for frequent itemset hiding. In: CIKM, pp. 748–757 (2006)
Gkoulalas-Divanis, A., Verykios, V.S.: Exact knowledge hiding through database extension. IEEE Trans. Knowl. Data Eng. 21(5), 699–713 (2009)
Gkoulalas-Divanis, A., Verykios, V.S.: Hiding sensitive knowledge without side effects. Knowl. Inf. Syst. 20(3), 263–299 (2009)
Gurvich, V., Khachiyan, L.: Hiding sensitive knowledge without side effects. Discrete Appl. Math. 96–97, 363–373 (1999)
IBM Basket Data Generator. http://sourceforge.net/projects/ibmquestdatagen/
Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9), 1026–1037 (2004)
Kantarcioglu, M., Jin, J., Clifton, C.: When do data mining results violate privacy? In: KDD, pp. 599–604 (2004)
Kohavi, R., Brodley, C., Frasca, B., Mason, L., Zheng, Z.: KDD-Cup 2000 organizers’ report: peeling the onion. SIGKDD Explor. 2(2), 86–98 (2000). http://www.ecn.purdue.edu/KDDCUP
Lindell,Y., Pinkas, B.: Privacy preserving data mining. In: CRYPTO, pp. 36–54 (2000)
Menon, S., Sarkar, S.: Minimizing information loss and preserving privacy. Manag. Sci. 53(1), 101–116 (2007)
Menon, S., Sarkar, S., Mukherjee, S.: Maximizing accuracy of shared databases when concealing sensitive patterns. Inf. Syst. Res. 16(3), 256–270 (2005)
Moustakides, G.V., Verykios, V.S.: A maxmin approach for hiding frequent itemsets. Data Knowl. Eng. 65(1), 75–89 (2008)
Oliveira, S.R.M., Zaïane,O.R.: Algorithms for balancing privacy and knowledge discovery in association rule mining. In: IDEAS, pp. 54–65 (2003)
Oliveira, S.R.M., Zaïane,O.R.: Protecting sensitive knowledge by data sanitization. In: ICDM, pp. 613–616 (2003)
Rizvi, S., Haritsa, J.R.: Maintaining data privacy in association rule mining. In: VLDB, pp. 682–693 (2002)
Saygin, Y., Verykios, V.S., Clifton, C.: Using unknowns to prevent discovery of association rules. SIGMOD Rec. 30(4), 45–54 (2001)
Srikant, R., Vu, Q., Agrawal, R.: Mining association rules with item constraints. In: KDD, pp. 67–73 (1997)
Stavropoulos, E.C., Verykios, V.S., Kagklis, V.: A transversal hypergraph approach for the frequent itemset hiding problem. Knowl. Inf. Syst. 47(3), 625–645 (2016)
Sun, X., Yu, P.S.: A border-based approach for hiding sensitive frequent itemsets. In: ICDM, pp. 426–433 (2005)
Sun, X., Yu, P.S.: Hiding sensitive frequent itemsets by a border-based approach. JCSE 1(1), 74–94 (2007)
Verykios, V.S., Elmagarmid, A.K., Bertino, E., Saygin, Y., Dasseni, E.: Association rule hiding. IEEE Trans. Knowl. Data Eng. 16(4), 434–447 (2004)
Verykios, V.S., Pontikakis, E.D., Theodoridis, Y., Chang, L.: Efficient algorithms for distortion and blocking techniques in association rule hiding. Distrib. Parallel Databases 22(1), 85–104 (2007)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Verykios, V.S., Stavropoulos, E.C., Zorkadis, V., Elmagarmid, A.K. (2020). A Constraint-Based Model for the Frequent Itemset Hiding Problem. In: Katsikas, S., Zorkadis, V. (eds) E-Democracy – Safeguarding Democracy and Human Rights in the Digital Age. e-Democracy 2019. Communications in Computer and Information Science, vol 1111. Springer, Cham. https://doi.org/10.1007/978-3-030-37545-4_4
Download citation
DOI: https://doi.org/10.1007/978-3-030-37545-4_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37544-7
Online ISBN: 978-3-030-37545-4
eBook Packages: Computer ScienceComputer Science (R0)