Abstract
Graph Neural Networks (GNNs) have recently been widely adopted across multiple domains. Yet, they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have proven effective and stealthy in graph classification tasks, successfully circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel approach to defending GNNs against backdoor attacks based on explainability. We find that malicious edges and benign edges differ significantly in the importance scores assigned by explainability methods. Accordingly, E-SAGE adaptively applies an iterative edge-pruning process to the graph based on these edge scores. Through extensive experiments, we demonstrate the effectiveness of E-SAGE against state-of-the-art graph backdoor attacks under different attack settings. In addition, we investigate the effectiveness of E-SAGE against adversarial attacks.
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants No. 62172383 and No. 62231015, the Anhui Provincial Key R&D Program under Grant No. S202103a05020098, and the Research Launch Project of the University of Science and Technology of China (USTC) under Grant No. KY0110000049.
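The defense described in the abstract, scoring each edge with an explainability method and iteratively pruning high-importance edges until no edge is flagged, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`prune_iteratively`, `toy_scores`) and the threshold-based stopping rule are assumptions, and the toy scorer stands in for a real GNN explainer such as an integrated-gradients attribution over edges.

```python
# Hedged sketch of an explainability-guided iterative edge-pruning defense
# in the spirit of E-SAGE. The scorer is injected so that any explainer
# (gradient-based, GNNExplainer-style, etc.) could be plugged in.

def prune_iteratively(edges, explain_fn, threshold=0.8, max_rounds=5):
    """Repeatedly drop edges whose explainability importance exceeds
    `threshold`, until no edge is flagged or the round budget runs out.
    Returns the surviving edges."""
    edges = list(edges)
    for _ in range(max_rounds):
        scores = explain_fn(edges)               # importance score per edge
        keep = [e for e, s in zip(edges, scores) if s < threshold]
        if len(keep) == len(edges):              # nothing flagged: converged
            break
        edges = keep
    return edges

# Toy scorer: pretend inserted trigger edges (marked True) receive high
# importance, as the paper observes malicious edges tend to.
def toy_scores(edges):
    return [0.95 if is_trigger else 0.1 for _, _, is_trigger in edges]

graph = [(0, 1, False), (1, 2, False), (2, 3, True), (3, 4, True)]
clean = prune_iteratively(graph, toy_scores)
print(clean)  # only the benign edges survive
```

The key design point the abstract implies is the iteration: after pruning, the explainer is re-run on the reduced graph, so edges whose importance was masked by stronger trigger edges can still surface in later rounds.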
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Yuan, D., Xu, X., Yu, L., Han, T., Li, R., Han, M. (2025). E-SAGE: Explainability-Based Defense Against Backdoor Attacks on Graph Neural Networks. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14997. Springer, Cham. https://doi.org/10.1007/978-3-031-71464-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71463-4
Online ISBN: 978-3-031-71464-1
eBook Packages: Computer Science, Computer Science (R0)