Abstract
Graph Neural Networks (GNNs) have been widely employed for feature representation learning on molecular graphs, so the expressiveness of the learned representations is crucial to their effectiveness. However, much current research focuses on structural features within individual molecules and overlooks the structural similarity between molecules, which carries rich information about the relationship between molecular properties and structural characteristics. As a result, these approaches fail to capture rich semantic information at the molecular structure level. To bridge this gap, we introduce the Molecular Structural Similarity Motif GNN (MSSM-GNN), a novel molecular graph representation learning method that captures structural similarity information among molecules from a global perspective. In particular, we propose a specially designed graph that leverages graph kernel algorithms to quantify the similarity between molecules. We then employ GNNs to learn feature representations from molecular graphs, aiming to improve the accuracy of property prediction by incorporating this additional molecular representation information. Finally, through a series of experiments on both small-scale and large-scale molecular datasets, we demonstrate that our model consistently outperforms eleven state-of-the-art baselines. The code is available at https://github.com/yaoyao-yaoyao-cell/MSSM-GNN.
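To make the idea of quantifying inter-molecular similarity with a graph kernel concrete, the following is a minimal sketch of a Weisfeiler-Lehman subtree kernel on toy atom-labeled graphs. It is an illustration under stated assumptions, not the MSSM-GNN implementation: the abstract does not say which kernel is used, and the function names, the toy molecules (heavy atoms only), and the cosine-style normalization are all hypothetical choices made here.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels over several refinement rounds.

    adj:    dict mapping node -> list of neighbour nodes
    labels: dict mapping node -> initial label (here, an atom symbol)
    """
    current = dict(labels)
    counts = Counter(current.values())  # round 0: raw atom labels
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adj.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbour_labels = tuple(sorted(current[n] for n in neighbours))
            refined[node] = (current[node], neighbour_labels)
        current = refined
        counts.update(current.values())
    return counts


def wl_kernel(g1, g2, iterations=2):
    """Unnormalized kernel: dot product of the two label-count vectors."""
    f1 = wl_features(*g1, iterations)
    f2 = wl_features(*g2, iterations)
    return sum(f1[key] * f2[key] for key in f1.keys() & f2.keys())


def normalized_similarity(g1, g2, iterations=2):
    """Cosine-style normalization, so a graph compared with itself scores 1."""
    k12 = wl_kernel(g1, g2, iterations)
    k11 = wl_kernel(g1, g1, iterations)
    k22 = wl_kernel(g2, g2, iterations)
    return k12 / (k11 * k22) ** 0.5


# Hypothetical toy molecules as (adjacency, atom labels), hydrogens omitted.
ethanol = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "C", 2: "O"})         # C-C-O
dimethyl_ether = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "O", 2: "C"})  # C-O-C

if __name__ == "__main__":
    print(normalized_similarity(ethanol, ethanol))
    print(normalized_similarity(ethanol, dimethyl_ether))
```

Scores of this kind could serve as edge weights between molecule nodes in a similarity graph, which is one plausible reading of the "specially designed graph" described above; how the authors actually build it is not stated in this section.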
C. Yao and H. Huang—These authors contributed equally to this work.
Notes
1. Breaking retrosynthetically interesting chemical substructures [10].
2. A six-membered ring compound containing two nitrogen atoms.
References
Atsango, A., Diamant, N.L., Lu, Z., Biancalani, T., Scalia, G., Chuang, K.V.: A 3D-shape similarity-based contrastive approach to molecular representation learning. arXiv preprint arXiv:2211.02130 (2022)
Borgwardt, K.M., Kriegel, H.P.: Shortest-path kernels on graphs. In: Fifth IEEE International Conference on Data Mining (ICDM 2005), pp. 8–pp. IEEE (2005)
Borgwardt, K.M., Ong, C.S., Schönauer, S., Vishwanathan, S., Smola, A.J., Kriegel, H.P.: Protein function prediction via graph kernels. Bioinformatics 21(Suppl_1), i47–i56 (2005)
Bouritsas, G., Frasca, F., Zafeiriou, S., Bronstein, M.M.: Improving graph neural network expressivity via subgraph isomorphism counting. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 657–668 (2022)
Cantoni, V., Gatti, R., Lombardi, L.: Morphological analysis of 3D proteins structure. In: International Conference on Bioinformatics Models, Methods and Algorithms, vol. 2, pp. 15–21. SciTePress (2011)
Chow, W.C.: Brownian bridge. Wiley Interdiscip. Rev. Comput. Stat. 1(3), 325–332 (2009)
Dan, J., et al.: Self-supervision meets kernel graph neural models: from architecture to augmentations. In: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1076–1083. IEEE (2023)
De Maesschalck, R., Jouan-Rimbaud, D., Massart, D.L.: The Mahalanobis distance. Chemometr. Intell. Lab. Syst. 50(1), 1–18 (2000)
Debnath, A.K., Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., Hansch, C.: Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J. Med. Chem. 34(2), 786–797 (1991)
Degen, J., Wegscheid-Gerlach, C., Zaliani, A., Rarey, M.: On the art of compiling and using ‘drug-like’ chemical fragment spaces. ChemMedChem Chem. Enabling Drug Discov. 3(10), 1503–1507 (2008)
Duvenaud, D.K., et al.: Convolutional networks on graphs for learning molecular fingerprints. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
Gao, H., Ji, S.: Graph u-nets. In: International Conference on Machine Learning, pp. 2083–2092. PMLR (2019)
Geng, Z., et al.: De novo molecular generation via connection-aware motif mining. arXiv preprint arXiv:2302.01129 (2023)
Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Hevapathige, A., Wang, Q.: Uplifting the expressive power of graph neural networks through graph partitioning. arXiv preprint arXiv:2312.08671 (2023)
Hu, W., et al.: Open graph benchmark: datasets for machine learning on graphs. In: Advances in Neural Information Processing Systems, vol. 33, pp. 22118–22133 (2020)
Huang, K., Xiao, C., Hoang, T., Glass, L., Sun, J.: Caster: predicting drug interactions with chemical substructure representation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 702–709 (2020)
Inae, E., Liu, G., Jiang, M.: Motif-aware attribute masking for molecular graph pre-training. arXiv preprint arXiv:2309.04589 (2023)
Jin, W., Barzilay, R., Jaakkola, T.: Hierarchical generation of molecular graphs using structural motifs. In: International Conference on Machine Learning, pp. 4839–4848. PMLR (2020)
Kazius, J., McGuire, R., Bursi, R.: Derivation and validation of toxicophores for mutagenicity prediction. J. Med. Chem. 48(1), 312–320 (2005)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (2016)
Kolouri, S., Naderializadeh, N., Rohde, G.K., Hoffmann, H.: Wasserstein embedding for graph learning. arXiv preprint arXiv:2006.09430 (2020)
Liu, Y., Wang, L., Liu, M., Lin, Y., Zhang, X., Oztekin, B., Ji, S.: Spherical message passing for 3D molecular graphs. In: International Conference on Learning Representations (ICLR) (2022)
Maron, H., Ben-Hamu, H., Serviansky, H., Lipman, Y.: Provably powerful graph networks. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Maziarz, K., et al.: Learning to extend molecular scaffolds with structural motifs. In: International Conference on Machine Learning (2021)
Maziarz, K., et al.: Learning to extend molecular scaffolds with structural motifs. In: International Conference on Learning Representations (2022)
Morris, C., Kriege, N.M., Bause, F., Kersting, K., Mutzel, P., Neumann, M.: TUDataset: a collection of benchmark datasets for learning with graphs. arXiv preprint arXiv:2007.08663 (2020)
Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs. In: International Conference on Machine Learning, pp. 2014–2023. PMLR (2016)
Phan, A.V., Le Nguyen, M., Nguyen, Y.L.H., Bui, L.T.: DGCNN: a convolutional neural network over large-scale labeled graphs. Neural Netw. 108, 533–543 (2018)
Shen, J., Nicolaou, C.A.: Molecular property prediction: recent trends in the era of artificial intelligence. Drug Discov. Today Technol. 32, 29–36 (2019)
Shervashidze, N., Schweitzer, P., Van Leeuwen, E.J., Mehlhorn, K., Borgwardt, K.M.: Weisfeiler-Lehman graph kernels. J. Mach. Learn. Res. 12(9) (2011)
Subramonian, A.: Motif-driven contrastive learning of graph representations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 15980–15981 (2021)
Sun, M., Zhao, S., Gilvary, C., Elemento, O., Zhou, J., Wang, F.: Graph convolutional networks for computational drug development and discovery. Brief. Bioinform. 21(3), 919–935 (2020)
Toivonen, H., Srinivasan, A., King, R.D., Kramer, S., Helma, C.: Statistical evaluation of the predictive toxicology challenge 2000–2001. Bioinformatics 19(10), 1183–1193 (2003)
Wale, N., Watson, I.A., Karypis, G.: Comparison of descriptor spaces for chemical compound retrieval and classification. Knowl. Inf. Syst. 14, 347–375 (2008)
Wernicke, S.: Efficient detection of network motifs. IEEE/ACM Trans. Comput. Biol. Bioinf. 3(4), 347–359 (2006)
Wu, F., Radev, D., Li, S.Z.: Molformer: motif-based transformer on 3D heterogeneous molecular graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 5312–5320 (2023)
Wu, T., Tang, Y., Sun, Q., Xiong, L.: Molecular joint representation learning via multi-modal information. arXiv preprint arXiv:2211.14042 (2022)
Xinyi, Z., Chen, L.: Capsule graph neural network. In: International Conference on Learning Representations (2018)
Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In: International Conference on Learning Representations (2018)
Xu, Z., Wang, S., Zhu, F., Huang, J.: Seq2seq fingerprint: an unsupervised deep molecular embedding for drug discovery. In: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 285–294 (2017)
Xue, L., Bajorath, J.: Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb. Chem. High Throughput Screening 3(5), 363–372 (2000)
Yang, N., Zeng, K., Wu, Q., Jia, X., Yan, J.: Learning substructure invariance for out-of-distribution molecular representations. In: Advances in Neural Information Processing Systems, vol. 35, pp. 12964–12978 (2022)
Yu, W., Chen, S., Gong, C., Niu, G., Sugiyama, M.: Atom-motif contrastive transformer for molecular property prediction. arXiv preprint arXiv:2310.07351 (2023)
Yu, Z., Gao, H.: Molecular representation learning via heterogeneous motif graph neural networks. In: International Conference on Machine Learning, pp. 25581–25594. PMLR (2022)
Zhang, X., et al.: Artificial intelligence for science in quantum, atomistic, and continuum systems. arXiv preprint arXiv:2307.08423 (2023)
Zhang, Z., Liu, Q., Wang, H., Lu, C., Lee, C.K.: Motif-based graph self-supervised learning for molecular property prediction. In: Advances in Neural Information Processing Systems, vol. 34, pp. 15870–15882 (2021)
Zhu, J., et al.: O-GNN: incorporating ring priors into molecular modeling. In: The Eleventh International Conference on Learning Representations (2023)
Acknowledgments
The authors would like to thank the editors and reviewers for their valuable comments. This work is supported by the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-040), the National Natural Science Foundation of China (Grant No. 62372439), the Natural Science Foundation of Beijing, China (Grant No. 4232038), and the Basic Research Program of ISCAS (Grant No. ISCAS-JCMS-202307).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yao, C., Huang, H., Gao, H., Wu, F., Chen, H., Zhao, J. (2024). Molecular Graph Representation Learning via Structural Similarity Information. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science(), vol 14943. Springer, Cham. https://doi.org/10.1007/978-3-031-70352-2_21
DOI: https://doi.org/10.1007/978-3-031-70352-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70351-5
Online ISBN: 978-3-031-70352-2
eBook Packages: Computer Science, Computer Science (R0)