Abstract
We present a stealthy privacy attack that exposes links in Graph Neural Networks (GNNs). Focusing on dynamic GNNs, we propose to inject new nodes and attach them to a particular target node to infer its private edge information. Our approach significantly enhances the \(F_1\) score of the attack compared to the current state-of-the-art benchmarks. Specifically, for the Twitch dataset, our method improves the \(F_1\) score by 23.75%, and for the Flickr dataset, remarkably, it is more than three times better than the state-of-the-art. We also propose and evaluate defense strategies based on differentially private (DP) mechanisms relying on a newly defined DP notion. These solutions, on average, reduce the effectiveness of the attack by 71.9% while only incurring a minimal utility loss of about 3.2%.
Notes
1. The post-processing property allows arbitrary data-independent transformations of DP outputs without affecting their privacy guarantee [27].
References
Atwood, J., Towsley, D.: Diffusion-convolutional neural networks. In: 30th International Conference on Neural Information Processing Systems (NIPS) (2016)
Brunet, S., Canard, S., Gambs, S., Olivier, B.: Novel differentially private mechanisms for graphs. IACR Cryptology ePrint Archive, p. 745 (2016)
Cadwalladr, C., Graham-Harrison, E.: Revealed: 50 million Facebook profiles harvested for Cambridge analytica in major data breach. The Guardian (2018)
Duddu, V., Boutet, A., Shejwalkar, V.: Quantifying privacy leakage in graph embedding. In: 17th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous) (2021)
Dwork, C.: Differential privacy. In: 33rd International Colloquium on Automata, Languages and Programming, ICALP (2006)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Theory of Cryptography (2006)
Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)
Hay, M., Li, C., Miklau, G., Jensen, D.: Accurate estimation of the degree distribution of private networks. In: IEEE International Conference on Data Mining (2009)
He, X., Jia, J., Backes, M., Gong, N.Z., Zhang, Y.: Stealing links from graph neural networks. In: USENIX Security Symposium (2021)
Karwa, V., Slavkovic, A.B.: Differentially private graphical degree sequences and synthetic graphs. In: Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD (2012)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: 5th International Conference on Learning Representations (ICLR) (2017)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations (ICLR) (2017)
Kossinets, G., Watts, D.J.: Empirical analysis of an evolving social network. Science 311(5757), 88–90 (2006)
Li, K., et al.: Towards practical edge inference attacks against graph neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)
Mueller, T.T., Usynin, D., Paetzold, J.C., Rueckert, D., Kaissis, G.: SoK: differential privacy on graph-structured data. CoRR abs/2203.09205 (2022)
Rozemberczki, B., Sarkar, R.: Twitch gamers: a dataset for evaluating proximity preserving and structural role-based node embeddings. arXiv preprint arXiv:2101.03091 (2021)
Sajadmanesh, S., Shamsabadi, A.S., Bellet, A., Gatica-Perez, D.: GAP: differentially private graph neural networks with aggregation perturbation. CoRR abs/2203.00949 (2022)
Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Trans. Neural Netw. 20(1), 61–80 (2009)
Sen, P., Namata, G., Bilgic, M., Getoor, L., Galligher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29(3), 93 (2008)
Sun, L., et al.: Adversarial attack and defense on graph data: a survey. arXiv preprint arXiv:1812.10528 (2018)
Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., Bengio, Y.: Graph attention networks. In: 6th International Conference on Learning Representations (ICLR) (2018)
Wu, B., Yang, X., Pan, S., Yuan, X.: Adapting membership inference attacks to GNN for graph classification: approaches and implications. In: IEEE International Conference on Data Mining (ICDM) (2021)
Wu, F., Long, Y., Zhang, C., Li, B.: LinkTeller: recovering private edges from graph neural networks via influence analysis. In: IEEE Symposium on Security and Privacy (SP) (2022)
Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In: 7th International Conference on Learning Representations (ICLR) (2019)
Zeng, H., Zhou, H., Srivastava, A., Kannan, R., Prasanna, V.K.: GraphSAINT: graph sampling based inductive learning method. In: 8th International Conference on Learning Representations (ICLR) (2020)
Zhang, M., Chen, Y.: Link prediction based on graph neural networks. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Zhu, K., Fioretto, F., Van Hentenryck, P.: Post-processing of differentially private data: a fairness perspective. In: 31st International Joint Conference on Artificial Intelligence, IJCAI (2022)
A Appendix
1.1 A.1 Differential Privacy
The original definition of Differential Privacy (DP) [5, 7] was introduced in the context of microdata, i.e., databases containing individual records. A central aspect of DP is the concept of neighborhood, originally defined for that data structure.
Definition 3
(Neighboring Databases). Let \(\mathcal {D}\) be the class of possible databases. Any two databases \(D, D' \in \mathcal {D}\) that differ in one record are called neighbors. For two neighboring databases, \(d(D, D') = 1\), where d denotes the Hamming distance.
Definition 4
(\((\varepsilon , \delta )\)-Differential Privacy [5, 7]). A randomized mechanism \(\mathcal {M}\) satisfies \((\varepsilon ,\delta )\)-DP with \(\varepsilon , \delta \ge 0\) if, for all pairs of neighboring databases \(D, D' \in \mathcal {D}\) and for all measurable \(\mathcal {O} \subseteq \text {Range}(\mathcal {M})\),
$$\Pr [\mathcal {M}(D) \in \mathcal {O}] \le e^{\varepsilon } \Pr [\mathcal {M}(D') \in \mathcal {O}] + \delta .$$
In simple terms, the output of a DP mechanism should not reveal the presence or absence of any specific record in the database, up to a multiplicative factor \(e^{\varepsilon }\) and an additive slack \(\delta \). When each record corresponds to an individual, DP ensures that their information remains confidential. A lower \(\varepsilon \), known as the privacy budget, provides stronger protection.
The most popular DP mechanism is the Laplace mechanism, which relies on global sensitivity, defined as follows:
Definition 5
(Global Sensitivity [7]). The \(L_p\)-global sensitivity of a query function \(f: \mathcal {D} \rightarrow \mathbb {R}^d\) is defined as
$$\Delta _p(f) = \max _{D, D' :\, d(D, D') = 1} \Vert f(D) - f(D')\Vert _p,$$
where \(D, D'\) are any two neighboring databases.
Definition 6
(Laplace Mechanism [7]). Given any function \(f: \mathcal {D} \rightarrow \mathbb {R}^d\), the Laplace mechanism is defined as
$$\mathcal {M}_L(D) = f(D) + (Y_1, \ldots , Y_d),$$
where \(Y_i\) are i.i.d. random variables drawn from a Laplace distribution with zero mean and scale \(\Delta _1(f)/\varepsilon \).
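To make Definition 6 concrete, the following minimal Python sketch (not taken from the paper's implementation; function and variable names are illustrative) releases a query answer under \(\varepsilon \)-DP by adding Laplace noise with scale \(\Delta _1(f)/\varepsilon \).

```python
import numpy as np

def laplace_mechanism(f_value, l1_sensitivity, epsilon, rng=None):
    """Release f(D) under epsilon-DP by adding Laplace noise.

    f_value:        the true query answer f(D), a scalar or 1-D array.
    l1_sensitivity: the L1-global sensitivity Delta_1(f) of the query.
    epsilon:        the privacy budget (> 0); smaller means more noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    f_value = np.atleast_1d(np.asarray(f_value, dtype=float))
    scale = l1_sensitivity / epsilon          # b = Delta_1(f) / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=f_value.shape)
    return f_value + noise

# Illustrative example: privately release the degree of a node whose true
# degree is 7. Under an edge-level neighborhood notion, adding or removing
# one edge changes a degree by at most 1, so Delta_1 = 1.
print(laplace_mechanism(7, l1_sensitivity=1.0, epsilon=0.5))
```

In the graph setting, the same mechanism can be applied to edge- or degree-related quantities once a neighborhood notion, and hence a sensitivity, has been fixed.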
1.2 A.2 Experimental Setup
Datasets. We evaluated our attack on several real-world datasets used in related research. We include the Flickr dataset [25], where nodes represent images and edges connect nodes with shared properties; node features consist of word representations. We also use two Twitch datasets (Twitch-FR and Twitch-RU) [16] to evaluate NILS, and Twitch-ES to train the GNNs in the inductive setting, as done in [23]. The Twitch datasets capture follow connections between users, and the task is to classify whether a streamer uses explicit language, based on features such as preferred games and location. For the transductive setting, where training and testing occur on the same graph, we use three citation network datasets: Cora, Citeseer, and Pubmed [19]. These involve predicting the topic of publications based on textual features and citation relationships.
Models. We follow the approach in [23] for training models and selecting hyperparameters. The authors trained Graph Convolutional Networks (GCNs) using various configurations, including normalization techniques, the number of hidden layers, input and output units, and dropout rates. A grid search strategy identified optimal hyperparameters, evaluating performance on a validation set. By using the same training procedures and hyperparameter tuning strategies, we ensure consistency across studies.
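For readers unfamiliar with the architecture, the sketch below shows the basic two-layer GCN forward pass of Kipf and Welling that such configurations instantiate; the layer widths and dropout rate are illustrative placeholders, not the tuned values from the grid search.

```python
import torch
import torch.nn.functional as F

class TwoLayerGCN(torch.nn.Module):
    """Minimal GCN: H = ReLU(A_hat X W1), logits = A_hat H W2."""
    def __init__(self, in_dim, hidden_dim, out_dim, dropout=0.5):
        super().__init__()
        self.lin1 = torch.nn.Linear(in_dim, hidden_dim, bias=False)
        self.lin2 = torch.nn.Linear(hidden_dim, out_dim, bias=False)
        self.dropout = dropout

    def forward(self, x, a_hat):
        # a_hat is the symmetrically normalized adjacency with self-loops:
        # A_hat = D^{-1/2} (A + I) D^{-1/2}
        h = F.relu(a_hat @ self.lin1(x))
        h = F.dropout(h, p=self.dropout, training=self.training)
        return a_hat @ self.lin2(h)

def normalize_adjacency(adj):
    """Compute A_hat = D^{-1/2}(A + I)D^{-1/2} for a dense adjacency matrix."""
    a = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)

# Toy usage: 10 nodes, 5 features, 3 classes, random features and edges.
x = torch.randn(10, 5)
adj = (torch.rand(10, 10) < 0.2).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
model = TwoLayerGCN(in_dim=5, hidden_dim=16, out_dim=3)
logits = model(x, normalize_adjacency(adj))
```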
Evaluation Metrics. We employ precision, recall, and the \(F_1\) score as our primary evaluation metrics, following [23]. These metrics are suitable for addressing the imbalanced binary classification problem, where the minority class (connected nodes) is of central interest. We select target nodes \(V_{\mathcal {A}}\) such that \(|V_{\mathcal {A}}|=500\), using uniform random sampling. We explore scenarios where target nodes exhibit either low or high degrees, as discussed in [23, Section V.D.]. Results are averaged over three runs with different random seeds, along with the corresponding standard deviation.
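As a small illustration of how these metrics are computed for a single target node (the arrays below are placeholders, not data from the paper), precision, recall, and \(F_1\) compare the attack's predicted edges against the ground-truth neighbors among the candidate nodes:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# y_true[i] = 1 if candidate node i is actually connected to the target node;
# y_pred[i] = 1 if the attack claims an edge. Both vectors are illustrative.
y_true = np.array([1, 0, 1, 1, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In the experiments these per-node scores are aggregated over the 500 sampled target nodes and averaged over the three random seeds.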
1.3 A.3 Limitations: Depth of the GNN
We examine the impact of increasing the GNN depth on the success rate of the attack for the Twitch-FR dataset. Our findings in Fig. 4 show that the attack's success rate decreases as GNN depth increases. With more layers, the model aggregates information from a larger neighborhood, which dilutes the influence of the injected malicious node within the target node's neighborhood and diminishes the attack's effectiveness.
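This dilution can be illustrated with a toy computation (ours, not the paper's experiment): after k GCN layers, the influence of the injected node u on the target node v is governed by the \((v, u)\) entry of \(\hat{A}^k\), where \(\hat{A}\) is the normalized adjacency with self-loops, and this entry shrinks toward the graph-wide average as k grows.

```python
import numpy as np

def normalized_adjacency(adj):
    """A_hat = D^{-1/2}(A + I)D^{-1/2} for a small dense adjacency matrix."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Toy graph: node 0 is the target, node 5 is the injected node attached to it,
# and nodes 1-4 form the target's existing neighborhood.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4), (0, 5)]:
    adj[i, j] = adj[j, i] = 1.0

a_hat = normalized_adjacency(adj)
for k in (1, 2, 3, 4):
    influence = np.linalg.matrix_power(a_hat, k)[0, 5]
    print(f"depth {k}: influence of injected node on target = {influence:.3f}")
```

On this toy graph the printed influence decreases with depth, mirroring the trend observed in Fig. 4.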
Compared to LinkTeller [23], as shown in Table 4, NILS outperforms LinkTeller across various GCN depths. For the Twitch-FR dataset, NILS achieves higher precision and recall at GCN depth 3 (precision: \(85.1 \pm \scriptstyle 1.2\), recall: \(81.6 \pm \scriptstyle 1.2\)) than LinkTeller at depth 2 (precision: \(84.1 \pm \scriptstyle 3.7\), recall: \(78.2 \pm \scriptstyle 1.9\)). These results highlight the effectiveness of our node injection strategy.