Abstract
Social networks have gained tremendous popularity recently. Millions of people use social network apps to share precious moments with friends and family. Users are often asked to provide personal information such as name, gender, and address when using social networks. However, as the social network data are collected, analyzed, and re-published at a large scale, personal information might be misused by unauthorized third parties and even attackers. Therefore, extensive research has been carried out to protect the data from privacy violations in social networks. The most popular technique is graph perturbation, which modifies the local topological structure of a social network user (a vertex) via various randomization techniques before the social graph data is published. Nevertheless, graph anonymization may affect the usability of the data as random noises are introduced, decreasing user experience. Therefore, a trade-off between privacy protection and data usability must be sought. In this paper, we employ various graph and application utility metrics to investigate this trade-off. More specifically, we conduct an empirical study by implementing five state-of-the-art anonymization algorithms to analyze the graph and application utilities on a Facebook and a Twitter dataset. Our results indicate that most anonymization algorithms can partially or conditionally preserve the graph and application utilities and any single anonymization algorithm may not always perform well on different datasets. Finally, drawing on the reviewed graph anonymization techniques, we provide a brief overview on future research directions and challenges involved therein.
Similar content being viewed by others
References
Bailey NT, et al. (1975) The mathematical theory of infectious diseases and its applications. Charles Griffin & Company Ltd, 5a Crendon Street, High Wycombe Bucks HP13 6LE
Bhagat S, Cormode G, Krishnamurthy B, Srivastava D (2009) Class-based graph anonymization for social network data. Proceedings of the VLDB Endowment 2(1):766–777
Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) An algorithm for k-degree anonymity on large networks. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, ACM, pp 671–675
Chen S, Zhou S (2013) Recursive mechanism: towards node differential privacy and unrestricted joins. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data, ACM, pp 653–664
Cheng J, Fu AWc, Liu J (2010) k-isomorphism: privacy preserving network publication against structural attacks. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data, ACM, pp 459–470
Csárdi G., Nepusz T, Airoldi EM (2016) Statistical network analysis with igraph. Springer
Day WY, Li N, Lyu M (2016) Publishing graph degree distribution with node differential privacy. In: Proceedings of the 2016 international conference on management of data, ACM, pp 123–138
Dwork C (2008) Differential privacy: a survey of results. In: International conference on theory and applications of models of computation, Springer, pp 1–19
Dwork C, McSherry FD (2010) Differential data privacy. US Patent 7,698,250
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3-5):75–174
Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: 2009. ICDM’09. ninth IEEE international conference on data mining, IEEE, pp 169–178
Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proceedings of the VLDB Endowment 1(1):102–114
Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proceedings of the VLDB Endowment 1(1):102–114
Heitmann B, Hermsen F, Decker S (2017) k-rdf-neighbourhood anonymity: combining structural and attribute-based anonymisation for linked data. In: Privon@ ISWC
Henderson K, Gallagher B, Eliassi-Rad T, Tong H, Basu S, Akoglu L, Koutra D, Faloutsos C, Li L (2012) Rolx: structural role extraction & mining in large graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 1231–1239
Ji S, Li W, Srivatsa M, Beyah R (2014) Structural data de-anonymization: Quantification, practice, and implications. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, ACM, pp 1040–1053
Ji S, Mittal P, Beyah R (2016) Graph data anonymization, de-anonymization attacks, and de-anonymizability quantification: a survey. IEEE Commun Surv Tutorials 19(2):1305–1326
Kellaris G, Papadopoulos S (2013) Practical differential privacy via grouping and smoothing. Proceedings of the VLDB endowment 6(5):301–312
Kifer D, Machanavajjhala A (2011) No free lunch in data privacy. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, pp 193–204
Korayem M, Crandall DJ (2013) De-anonymizing users across heterogeneous social computing platforms. In: ICWSM
Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection http://snap.stanford.edu/data
Li C, Hay M, Miklau G, Wang Y (2014) A data-and workload-aware algorithm for range queries under differential privacy. Proceedings of the VLDB endowment 7(5):341–352
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data, ACM, pp 93–106
Liu Y, Ji S, Mittal P (2016) Smartwalk: Enhancing social network security via adaptive random walks. In: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, ACM, pp 492–503
Mittal P, Papamanthou C, Song D (2012) Preserving link privacy in social network based systems. arXiv:1208.6189
Narayanan A, Shmatikov V (2008) Robust de-anonymization of large sparse datasets. In: 2008. SP 2008. IEEE symposium on security and privacy, IEEE, pp 111–125
Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: 2009 30th IEEE symposium on security and privacy, IEEE, pp 173–187
Newman ME (2016) Mathematics of networks. The new Palgrave dictionary of economics, pp 1–8
Nguyen BP, Ngo H, Kim J, Kim J (2015) Publishing graph data with subgraph differential privacy. In: International workshop on information security applications, Springer, pp 134–145
Nilizadeh S, Kapadia A, Ahn YY (2014) Community-enhanced de-anonymization of online social networks. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, ACM, pp 537–548
Nissim K, Raskhodnikova S, Smith A (2007) Smooth sensitivity and sampling in private data analysis. In: Proceedings of the thirty-ninth annual ACM symposium on theory of computing, ACM, pp 75–84
Qian J, Li XY, Zhang C, Chen L, Jung T, Han J. (2017) Social network de-anonymization and privacy inference with knowledge graph model. IEEE Transactions on Dependable and Secure Computing
Rong H, Ma T, Tang M, Cao J (2018) A novel subgraph k+-isomorphism method in social network based on graph similarity detection. Soft Comput 22(8):2583–2601
Samarati P, Sweeney L (1998) Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, Tech. rep., technical report, SRI International
Srivatsa M, Hicks M (2012) Deanonymizing mobility traces: using social network as a side-channel. In: Proceedings of the 2012 ACM conference on computer and communications security, ACM, pp 628–637
Thompson B, Yao D (2009) The union-split algorithm and cluster-based anonymization of social networks. In: Proceedings of the 4th international symposium on information, computer, and communications security, ACM, pp 218–227
Tian W, Mao J, Jiang J, He Z, Zhou Z, Liu J (2018) Deeply understanding structure-based social network de-anonymization. Prog Comput Sci 129:52–58
Wang B, Jia J, Zhang L, Gong NZ (2018) Structure-based sybil detection in social networks via local rule-based propagation. IEEE Transactions on Network Science and Engineering
Wang Q, Zhang Y, Lu X, Wang Z, Qin Z, Ren K (2018) Real-time and spatio-temporal crowd-sourced social network data publishing with differential privacy. IEEE Trans Dependable Secure Comput 15 (4):591–606
Wang Y, Wu X (2013) Preserving differential privacy in degree-correlation based graph generation. Transactions on Data Privacy 6(2):127
Wu X, Hu Z, Fu X, Fu L, Wang X, Lu S (2018) Social network de-anonymization with overlapping communities: Analysis, algorithm and experiments. In: Proc IEEE INFOCOM
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the 2008 SIAM international conference on data mining, SIAM, pp 739–750
Zhang Z, Wang H, Wang C, Fang H (2015) Modeling epidemics spreading on social contact networks. IEEE Trans Emerg Top Comput 3(3):410–419
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: 2008. ICDE 2008. IEEE 24th international conference on data engineering, IEEE, pp 506–515
Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proceedings of the VLDB Endowment 2(1):946–957
Funding
This work was partially supported by the US National Science Foundation under grants CNS-1704397, CNS-1704287, and CNS-1704274; the National Science Foundation of China under grants 61832012, 61871466, and 61771289; and the Guangxi Natural Science Foundation Innovation Research Team Project under Grant 2016GXNSFGA380002.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Zhang, C., Jiang, H., Cheng, X. et al. Utility analysis on privacy-preservation algorithms for online social networks: an empirical study. Pers Ubiquit Comput 25, 1063–1079 (2021). https://doi.org/10.1007/s00779-019-01287-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00779-019-01287-0