Molecular Graph Representation Learning via Structural Similarity Information

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14943)

Abstract

Graph Neural Networks (GNNs) have been widely employed for feature representation learning on molecular graphs, so enhancing the expressiveness of the learned representations is crucial to their effectiveness. However, much current research focuses on the structural features within individual molecules and overlooks the structural similarity between molecules, which encodes rich information on the relationship between molecular properties and structural characteristics. As a result, these approaches fail to capture the rich semantic information at the molecular structure level. To bridge this gap, we introduce the Molecular Structural Similarity Motif GNN (MSSM-GNN), a novel molecular graph representation learning method that captures structural similarity information among molecules from a global perspective. In particular, we propose a specially designed graph that leverages graph kernel algorithms to quantify the similarity between molecules. We then employ GNNs to learn feature representations from molecular graphs, aiming to improve the accuracy of property prediction by incorporating this additional molecular representation information. Finally, through experiments on both small-scale and large-scale molecular datasets, we demonstrate that our model consistently outperforms eleven state-of-the-art baselines. The code is available at https://github.com/yaoyao-yaoyao-cell/MSSM-GNN.
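
To make the idea of quantifying similarity between molecules with a graph kernel concrete, here is a minimal sketch. The abstract does not say which kernel MSSM-GNN uses, so this example assumes a Weisfeiler-Lehman subtree kernel (reference [31]) in an uncompressed form that keeps full label strings instead of hashing them. The function names, the `(adjacency, labels)` graph encoding, and the toy heavy-atom graphs are all hypothetical and are not taken from the authors' code.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one labelled graph.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels:    dict mapping node -> initial label (here, an atom symbol)
    """
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbour_labels = sorted(current[n] for n in neighbours)
            refined[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = refined
        counts.update(current.values())
    return counts


def wl_kernel(g1, g2, iterations=2):
    """Dot product of the two graphs' WL label-count vectors."""
    f1 = wl_features(*g1, iterations=iterations)
    f2 = wl_features(*g2, iterations=iterations)
    return sum(f1[key] * f2[key] for key in f1.keys() & f2.keys())


def normalized_similarity(g1, g2, iterations=2):
    """Cosine-normalised kernel value, so a graph compared with itself gives 1."""
    k12 = wl_kernel(g1, g2, iterations)
    k11 = wl_kernel(g1, g1, iterations)
    k22 = wl_kernel(g2, g2, iterations)
    return k12 / (k11 * k22) ** 0.5


# Toy heavy-atom graphs (hydrogens omitted): ethanol C-C-O and propane C-C-C.
ethanol = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "C", 2: "O"})
propane = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "C", 2: "C"})

if __name__ == "__main__":
    print(normalized_similarity(ethanol, propane))
```

Pairwise values of this kind could serve as edge weights in a graph whose nodes are molecules, which is one way to read the "specially designed graph" mentioned above; how the paper actually builds that graph is not described in this preview.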

C. Yao and H. Huang—These authors contributed equally to this work.

Notes

  1. Breaking retrosynthetically interesting chemical substructures [10].

  2. A six-membered ring compound containing two nitrogen atoms.

References

  1. Atsango, A., Diamant, N.L., Lu, Z., Biancalani, T., Scalia, G., Chuang, K.V.: A 3D-shape similarity-based contrastive approach to molecular representation learning. arXiv preprint arXiv:2211.02130 (2022)

  2. Borgwardt, K.M., Kriegel, H.P.: Shortest-path kernels on graphs. In: Fifth IEEE International Conference on Data Mining (ICDM 2005), 8 pp. IEEE (2005)

  3. Borgwardt, K.M., Ong, C.S., Schönauer, S., Vishwanathan, S., Smola, A.J., Kriegel, H.P.: Protein function prediction via graph kernels. Bioinformatics 21(Suppl_1), i47–i56 (2005)

  4. Bouritsas, G., Frasca, F., Zafeiriou, S., Bronstein, M.M.: Improving graph neural network expressivity via subgraph isomorphism counting. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 657–668 (2022)

  5. Cantoni, V., Gatti, R., Lombardi, L.: Morphological analysis of 3D proteins structure. In: International Conference on Bioinformatics Models, Methods and Algorithms, vol. 2, pp. 15–21. SciTePress (2011)

  6. Chow, W.C.: Brownian bridge. Wiley Interdiscip. Rev. Comput. Stat. 1(3), 325–332 (2009)

  7. Dan, J., et al.: Self-supervision meets kernel graph neural models: from architecture to augmentations. In: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1076–1083. IEEE (2023)

  8. De Maesschalck, R., Jouan-Rimbaud, D., Massart, D.L.: The Mahalanobis distance. Chemometr. Intell. Lab. Syst. 50(1), 1–18 (2000)

  9. Debnath, A.K., Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., Hansch, C.: Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J. Med. Chem. 34(2), 786–797 (1991)

  10. Degen, J., Wegscheid-Gerlach, C., Zaliani, A., Rarey, M.: On the art of compiling and using ‘drug-like’ chemical fragment spaces. ChemMedChem Chem. Enabling Drug Discov. 3(10), 1503–1507 (2008)

  11. Duvenaud, D.K., et al.: Convolutional networks on graphs for learning molecular fingerprints. In: Advances in Neural Information Processing Systems, vol. 28 (2015)

  12. Gao, H., Ji, S.: Graph U-Nets. In: International Conference on Machine Learning, pp. 2083–2092. PMLR (2019)

  13. Geng, Z., et al.: De novo molecular generation via connection-aware motif mining. arXiv preprint arXiv:2302.01129 (2023)

  14. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  15. Hevapathige, A., Wang, Q.: Uplifting the expressive power of graph neural networks through graph partitioning. arXiv preprint arXiv:2312.08671 (2023)

  16. Hu, W., et al.: Open graph benchmark: datasets for machine learning on graphs. In: Advances in Neural Information Processing Systems, vol. 33, pp. 22118–22133 (2020)

  17. Huang, K., Xiao, C., Hoang, T., Glass, L., Sun, J.: CASTER: predicting drug interactions with chemical substructure representation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 702–709 (2020)

  18. Inae, E., Liu, G., Jiang, M.: Motif-aware attribute masking for molecular graph pre-training. arXiv preprint arXiv:2309.04589 (2023)

  19. Jin, W., Barzilay, R., Jaakkola, T.: Hierarchical generation of molecular graphs using structural motifs. In: International Conference on Machine Learning, pp. 4839–4848. PMLR (2020)

  20. Kazius, J., McGuire, R., Bursi, R.: Derivation and validation of toxicophores for mutagenicity prediction. J. Med. Chem. 48(1), 312–320 (2005)

  21. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (2016)

  22. Kolouri, S., Naderializadeh, N., Rohde, G.K., Hoffmann, H.: Wasserstein embedding for graph learning. arXiv preprint arXiv:2006.09430 (2020)

  23. Liu, Y., Wang, L., Liu, M., Lin, Y., Zhang, X., Oztekin, B., Ji, S.: Spherical message passing for 3D molecular graphs. In: International Conference on Learning Representations (ICLR) (2022)

  24. Maron, H., Ben-Hamu, H., Serviansky, H., Lipman, Y.: Provably powerful graph networks. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

  25. Maziarz, K., et al.: Learning to extend molecular scaffolds with structural motifs. In: International Conference on Machine Learning (2021)

  26. Maziarz, K., et al.: Learning to extend molecular scaffolds with structural motifs. In: International Conference on Learning Representations (2022)

  27. Morris, C., Kriege, N.M., Bause, F., Kersting, K., Mutzel, P., Neumann, M.: TUDataset: a collection of benchmark datasets for learning with graphs. arXiv preprint arXiv:2007.08663 (2020)

  28. Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs. In: International Conference on Machine Learning, pp. 2014–2023. PMLR (2016)

  29. Phan, A.V., Le Nguyen, M., Nguyen, Y.L.H., Bui, L.T.: DGCNN: a convolutional neural network over large-scale labeled graphs. Neural Netw. 108, 533–543 (2018)

  30. Shen, J., Nicolaou, C.A.: Molecular property prediction: recent trends in the era of artificial intelligence. Drug Discov. Today Technol. 32, 29–36 (2019)

  31. Shervashidze, N., Schweitzer, P., Van Leeuwen, E.J., Mehlhorn, K., Borgwardt, K.M.: Weisfeiler-Lehman graph kernels. J. Mach. Learn. Res. 12(9) (2011)

  32. Subramonian, A.: Motif-driven contrastive learning of graph representations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 15980–15981 (2021)

  33. Sun, M., Zhao, S., Gilvary, C., Elemento, O., Zhou, J., Wang, F.: Graph convolutional networks for computational drug development and discovery. Brief. Bioinform. 21(3), 919–935 (2020)

  34. Toivonen, H., Srinivasan, A., King, R.D., Kramer, S., Helma, C.: Statistical evaluation of the predictive toxicology challenge 2000–2001. Bioinformatics 19(10), 1183–1193 (2003)

  35. Wale, N., Watson, I.A., Karypis, G.: Comparison of descriptor spaces for chemical compound retrieval and classification. Knowl. Inf. Syst. 14, 347–375 (2008)

  36. Wernicke, S.: Efficient detection of network motifs. IEEE/ACM Trans. Comput. Biol. Bioinf. 3(4), 347–359 (2006)

  37. Wu, F., Radev, D., Li, S.Z.: Molformer: motif-based transformer on 3D heterogeneous molecular graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 5312–5320 (2023)

  38. Wu, T., Tang, Y., Sun, Q., Xiong, L.: Molecular joint representation learning via multi-modal information. arXiv preprint arXiv:2211.14042 (2022)

  39. Xinyi, Z., Chen, L.: Capsule graph neural network. In: International Conference on Learning Representations (2018)

  40. Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In: International Conference on Learning Representations (2018)

  41. Xu, Z., Wang, S., Zhu, F., Huang, J.: Seq2seq fingerprint: an unsupervised deep molecular embedding for drug discovery. In: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 285–294 (2017)

  42. Xue, L., Bajorath, J.: Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb. Chem. High Throughput Screening 3(5), 363–372 (2000)

  43. Yang, N., Zeng, K., Wu, Q., Jia, X., Yan, J.: Learning substructure invariance for out-of-distribution molecular representations. In: Advances in Neural Information Processing Systems, vol. 35, pp. 12964–12978 (2022)

  44. Yu, W., Chen, S., Gong, C., Niu, G., Sugiyama, M.: Atom-motif contrastive transformer for molecular property prediction. arXiv preprint arXiv:2310.07351 (2023)

  45. Yu, Z., Gao, H.: Molecular representation learning via heterogeneous motif graph neural networks. In: International Conference on Machine Learning, pp. 25581–25594. PMLR (2022)

  46. Zhang, X., et al.: Artificial intelligence for science in quantum, atomistic, and continuum systems. arXiv preprint arXiv:2307.08423 (2023)

  47. Zhang, Z., Liu, Q., Wang, H., Lu, C., Lee, C.K.: Motif-based graph self-supervised learning for molecular property prediction. In: Advances in Neural Information Processing Systems, vol. 34, pp. 15870–15882 (2021)

  48. Zhu, J., et al.: O-GNN: incorporating ring priors into molecular modeling. In: The Eleventh International Conference on Learning Representations (2023)

Acknowledgments

The authors would like to thank the editors and reviewers for their valuable comments. This work is supported by the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-040), the National Natural Science Foundation of China (Grant No. 62372439), the Natural Science Foundation of Beijing, China (Grant No. 4232038), and the Basic Research Program of ISCAS (Grant No. ISCAS-JCMS-202307).

Author information

Corresponding author

Correspondence to Fengge Wu.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 98 KB)

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yao, C., Huang, H., Gao, H., Wu, F., Chen, H., Zhao, J. (2024). Molecular Graph Representation Learning via Structural Similarity Information. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14943. Springer, Cham. https://doi.org/10.1007/978-3-031-70352-2_21

  • DOI: https://doi.org/10.1007/978-3-031-70352-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70351-5

  • Online ISBN: 978-3-031-70352-2

  • eBook Packages: Computer Science, Computer Science (R0)
