Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions
- PMID: 28182403
- DOI: 10.1021/acs.accounts.6b00491
Abstract
In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions has been developed. Regardless of their technical differences, all scoring functions need data sets that combine protein-ligand complex structures with binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in size and quality. In addition, the standard metrics for evaluating scoring functions used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of the scoring functions themselves. Collectively, these underlying obstacles have impeded the development of more advanced scoring functions. In this Account, we describe our long-standing efforts to overcome these obstacles, which involve two related projects. In the first project, we created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16,179 biomolecular complexes in the PDB. Data sets provided by PDBbind have been applied in many computational and statistical studies of protein-ligand interactions and various other subjects. In particular, PDBbind has become a major data resource for scoring function development. In the second project, we established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so that scoring functions can be tested in a relatively pure context that reflects their quality.
In our latest work on this track, i.e., CASF-2013, the performance of a scoring function was quantified in four aspects: "scoring power", "ranking power", "docking power", and "screening power". All four performance tests were conducted on a test set of 195 high-quality protein-ligand complexes selected from PDBbind. A panel of 20 standard scoring functions was tested as a demonstration. Importantly, CASF is designed to be an open-access benchmark on which scoring functions developed by different researchers can be compared on the same grounds. Indeed, it has become a popular choice for scoring function validation in recent years. Despite the considerable progress made so far, the performance of today's scoring functions still falls short of expectations in many aspects, and there is a constant demand for more advanced scoring functions. Our efforts have helped to overcome some of the obstacles underlying scoring function development so that researchers in this field can move forward faster. We will continue to improve the PDBbind database and the CASF benchmark to keep them useful community resources.
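To make two of the four aspects concrete, the following is a minimal Python sketch, not the CASF implementation. The abstract does not define the metrics, so the definitions here are assumptions: "scoring power" is taken as the Pearson correlation between predicted scores and experimental binding affinities, and "ranking power" as the fraction of targets whose ligands are placed in exactly the experimental order. The function names, the record layout, and the toy values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumed here as the 'scoring power' metric; the abstract does not state it.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranking_success_rate(records):
    """Fraction of targets whose ligands are ranked in the experimental order.

    `records` is an iterable of (target_id, experimental_affinity, predicted_score).
    Assumed here as the 'ranking power' metric; higher values mean stronger
    binding for both the experimental and the predicted quantity.
    """
    by_target = defaultdict(list)
    for target, experimental, predicted in records:
        by_target[target].append((experimental, predicted))
    if not by_target:
        raise ValueError("no records given")

    hits = 0
    for pairs in by_target.values():
        indices = range(len(pairs))
        order_exp = sorted(indices, key=lambda i: pairs[i][0])
        order_pred = sorted(indices, key=lambda i: pairs[i][1])
        hits += order_exp == order_pred
    return hits / len(by_target)


# Hypothetical toy data: two targets with three ligands each.
toy_records = [
    ("T1", 4.2, 5.0), ("T1", 6.8, 6.1), ("T1", 9.1, 7.9),
    ("T2", 3.5, 6.0), ("T2", 5.9, 5.2), ("T2", 8.0, 7.0),
]

if __name__ == "__main__":
    experimental = [r[1] for r in toy_records]
    predicted = [r[2] for r in toy_records]
    print("scoring power (Pearson R):", round(pearson_r(experimental, predicted), 3))
    print("ranking power (success rate):", ranking_success_rate(toy_records))
```

Both functions take precomputed scores for fixed complex structures, which reflects the idea of testing the "scoring" step separately from "sampling". Docking power and screening power are not sketched here because they require decoy poses and decoy ligands, respectively.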