{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,22]],"date-time":"2025-03-22T12:20:02Z","timestamp":1742646002821},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T00:00:00Z","timestamp":1695081600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T00:00:00Z","timestamp":1695081600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"Generative models are frequently used for de novo design in drug discovery projects to propose new molecules. However, the question of whether or not the generated molecules can be synthesized is not systematically taken into account during generation, even though being able to synthesize the generated molecules is a fundamental requirement for such methods to be useful in practice. Methods have been developed to estimate molecule \u201csynthesizability\u201d, but, so far, there is no consensus on whether or not a molecule is synthesizable. In this paper we introduce the Retro-Score (RScore), which computes a synthetic accessibility score of molecules by performing a full retrosynthetic analysis through our data-driven synthetic planning software Spaya, and its dedicated API: Spaya-API (https:\/\/spaya.ai). We start by comparing several synthetic accessibility scores to a binary \u201cchemist score\u201d as estimated by chemists on a bench of generated molecules, as a first experimental validation that the RScore is a reliable synthetic accessibility score. 
We then describe a pipeline to generate molecules that validate a list of targets while still being easy to synthesize. We further this idea by performing experiments comparing molecular generator outputs across a range of constraints and conditions. We show that the RScore can be learned by a Neural Network, which leads to a new score: RSPred. We demonstrate that using the RScore or RSPred as a constraint during molecular generation enables our molecular generators to produce more synthesizable solutions, with higher diversity. The open-source Python code containing all the scores and the experiments can be found on (https:\/\/github.com\/iktos\/generation-under-synthetic-constraint).","DOI":"10.1186\/s13321-023-00742-8","type":"journal-article","created":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T11:02:06Z","timestamp":1695121326000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Integrating synthetic accessibility with AI-based generative drug design"],"prefix":"10.1186","volume":"15","author":[{"given":"Maud","family":"Parrot","sequence":"first","affiliation":[]},{"given":"Hamza","family":"Tajmouati","sequence":"additional","affiliation":[]},{"given":"Vinicius Barros Ribeiro","family":"da Silva","sequence":"additional","affiliation":[]},{"given":"Brian Ross","family":"Atwood","sequence":"additional","affiliation":[]},{"given":"Robin","family":"Fourcade","sequence":"additional","affiliation":[]},{"given":"Yann","family":"Gaston-Math\u00e9","sequence":"additional","affiliation":[]},{"given":"Nicolas","family":"Do 
Huu","sequence":"additional","affiliation":[]},{"given":"Quentin","family":"Perron","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,9,19]]},"reference":[{"key":"742_CR1","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1021\/acscentsci.7b00512","volume":"4","author":"MHS Segler","year":"2018","unstructured":"Segler MHS, Kogej T, Tyrchan C, Waller MP (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent Sci 4:120\u2013131","journal-title":"ACS Cent Sci"},{"key":"742_CR2","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv.13622417.v1","author":"Q Perron","year":"2021","unstructured":"Perron Q, Mirguet O, Tajmouati H, Skiredj A, Rojas A, Gohier A, Ducrot P, Bourguignon MP, Sansilvestri-Morel P, Do Huu N et al (2021) Deep generative models for ligand-based de novo design applied to multi-parametric optimization. ChemRxiv. https:\/\/doi.org\/10.26434\/chemrxiv.13622417.v1","journal-title":"ChemRxiv"},{"key":"742_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-017-0235-x","volume":"9","author":"M Olivecrona","year":"2017","unstructured":"Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de novo design through deep reinforcement learning. J Cheminf 9:1\u20134","journal-title":"J Cheminf"},{"key":"742_CR4","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","volume":"4","author":"R G\u00f3mez-Bombarelli","year":"2018","unstructured":"G\u00f3mez-Bombarelli R, Wei JN, Duvenaud D, Hern\u00e1ndez-Lobato JM, S\u00e1nchez-Lengeling B, Sheberla D, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A (2018) Automatic chemical design using a data-driven continuous representation of molecules. 
ACS Cent Sci 4:268\u2013276","journal-title":"ACS Cent Sci"},{"key":"742_CR5","doi-asserted-by":"publisher","first-page":"1182","DOI":"10.1021\/acs.jcim.8b00751","volume":"59","author":"B Sattarov","year":"2019","unstructured":"Sattarov B, Baskin II, Horvath D, Marcou G, Bjerrum EJ, Varnek A (2019) De novo molecular design by combining deep autoencoder recurrent neural networks with generative topographic mapping. J Chem Inf Model 59:1182\u20131196","journal-title":"J Chem Inf Model"},{"key":"742_CR6","doi-asserted-by":"publisher","first-page":"5682","DOI":"10.1021\/acs.jcim.0c00599","volume":"60","author":"K Gao","year":"2020","unstructured":"Gao K, Nguyen DD, Tu M, Wei G-W (2020) Generative network complex for the automated generation of drug-like molecules. J Chem Inf Model 60:5682\u20135698","journal-title":"J Chem Inf Model"},{"key":"742_CR7","doi-asserted-by":"publisher","first-page":"8016","DOI":"10.1039\/C9SC01928F","volume":"10","author":"R Winter","year":"2019","unstructured":"Winter R, Montanari F, Steffen A, Briem H, No\u00e9 F, Clevert D-A (2019) Efficient multi-objective molecular optimization in a continuous latent space. Chem Sci 10:8016\u20138024","journal-title":"Chem Sci"},{"key":"742_CR8","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1016\/j.ddtec.2020.09.003","volume":"32\u201333","author":"P Renz","year":"2019","unstructured":"Renz P, Van Rompaey D, Wegner JK, Hochreiter S, Klambauer G (2019) On failure modes in molecule generation and optimization. Drug Discov Today Technol 32\u201333:55\u201363","journal-title":"Drug Discov Today Technol"},{"key":"742_CR9","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.1021\/acs.jcim.8b00839","volume":"59","author":"N Brown","year":"2019","unstructured":"Brown N, Fiscato M, Segler MH, Vaucher AC (2019) GuacaMol: benchmarking models for de novo molecular design. 
J Chem Inf Model 59:1096\u20131108","journal-title":"J Chem Inf Model"},{"key":"742_CR10","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1906.05221","author":"J Bradshaw","year":"2019","unstructured":"Bradshaw J, Paige B, Kusner MJ, Segler MHS, Hern\u00e1ndez-Lobato JM (2019) A model to search for synthesizable molecules. CoRR. https:\/\/doi.org\/10.48550\/arXiv.1906.05221","journal-title":"CoRR"},{"key":"742_CR11","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2012.11522","author":"J Bradshaw","year":"2020","unstructured":"Bradshaw J, Paige B, Kusner MJ, Segler MHS, Hern\u00e1ndez-Lobato JM (2020) Barking up the right tree: an approach to search over molecule synthesis DAGs. CoRR. https:\/\/doi.org\/10.48550\/arXiv.2012.11522","journal-title":"CoRR"},{"key":"742_CR12","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2011.13042","author":"C Liu","year":"2020","unstructured":"Liu C, Korablyov M, Jastrzebski S, Wlodarczyk-Pruszynski P, Bengio Y, Segler MHS (2020) RetroGNN: approximating retrosynthesis by graph neural networks for de novo drug design. CoRR. https:\/\/doi.org\/10.48550\/arXiv.2011.13042","journal-title":"CoRR"},{"key":"742_CR13","doi-asserted-by":"publisher","first-page":"5714","DOI":"10.1021\/acs.jcim.0c00174","volume":"60","author":"W Gao","year":"2020","unstructured":"Gao W, Coley CW (2020) The synthesizability of molecules proposed by generative models. J Chem Inf Model 60:5714\u20135723","journal-title":"J Chem Inf Model"},{"key":"742_CR14","doi-asserted-by":"publisher","first-page":"948","DOI":"10.1038\/nrd4128","volume":"12","author":"J Cumming","year":"2013","unstructured":"Cumming J, Davis A, Muresan S, Haeberlein M, Chen H (2013) Chemical predictive modelling to improve compound quality. 
Nat Rev Drug Discov 12:948\u201362","journal-title":"Nat Rev Drug Discov"},{"key":"742_CR15","doi-asserted-by":"publisher","first-page":"252","DOI":"10.1021\/acs.jcim.7b00622","volume":"58","author":"CW Coley","year":"2018","unstructured":"Coley CW, Rogers L, Green WH, Jensen KF (2018) SCScore: synthetic complexity learned from a reaction corpus. J Chem Inf Model 58:252\u2013261","journal-title":"J Chem Inf Model"},{"key":"742_CR16","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/1758-2946-1-8","volume":"1","author":"P Ertl","year":"2009","unstructured":"Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminf 1:8","journal-title":"J Cheminf"},{"key":"742_CR17","doi-asserted-by":"publisher","first-page":"3339","DOI":"10.1039\/D0SC05401A","volume":"12","author":"A Thakkar","year":"2021","unstructured":"Thakkar A, Chadimov\u00e1 V, Bjerrum EJ, Engkvist O, Reymond J-L (2021) Retrosynthetic accessibility score (RAscore)\u2014rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 12:3339\u20133349","journal-title":"Chem Sci"},{"key":"742_CR18","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1186\/s13321-020-00472-1","volume":"12","author":"S Genheden","year":"2020","unstructured":"Genheden S, Thakkar A, Chadimov\u00e1 V, Reymond JL, Engkvist O, Bjerrum E (2020) AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. J Cheminf 12:70","journal-title":"J Cheminf"},{"key":"742_CR19","unstructured":"IKTOS Website Spaya (2023) https:\/\/spaya.ai\/. Accessed 21 Feb 2023"},{"key":"742_CR20","doi-asserted-by":"publisher","first-page":"D930","DOI":"10.1093\/nar\/gky1075","volume":"47","author":"D Mendez","year":"2018","unstructured":"Mendez D et al (2018) ChEMBL: towards direct deposition of bioassay data. 
Nucleic Acids Res 47:D930-40","journal-title":"Nucleic Acids Res"},{"key":"742_CR21","unstructured":"Post-processed ChEMBL datasets. https:\/\/figshare.com\/projects\/GuacaMol\/56639. Accessed 20 Nov 2018"},{"key":"742_CR22","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1038\/nrc2664","volume":"9","author":"JA Engelman","year":"2009","unstructured":"Engelman JA (2009) Targeting PI3K signalling in cancer: opportunities, challenges and limitations. Nat Rev Cancer 9:550\u2013562","journal-title":"Nat Rev Cancer"},{"key":"742_CR23","doi-asserted-by":"publisher","first-page":"1265","DOI":"10.1517\/13543780903066798","volume":"18","author":"A Carnero","year":"2009","unstructured":"Carnero A (2009) Novel inhibitors of the PI3K family. Expert Opin Investig Drugs 18:1265\u20131277","journal-title":"Expert Opin Investig Drugs"},{"key":"742_CR24","doi-asserted-by":"publisher","first-page":"627","DOI":"10.1038\/nrd2926","volume":"8","author":"P Liu","year":"2009","unstructured":"Liu P et al (2009) Targeting the phosphoinositide 3-kinase pathway in cancer. Nat Rev Drug Discov 8:627\u201364","journal-title":"Nat Rev Drug Discov"},{"key":"742_CR25","unstructured":"Iktos GitHub containing the code reproducing the paper. (2023) https:\/\/github.com\/iktos\/generation-under-synthetic-constraint\/. Accessed 28 Feb 2023"},{"key":"742_CR26","unstructured":"RA score repository (2023) https:\/\/github.com\/reymond-group\/RAscore. Accessed 28 Feb 2023"},{"key":"742_CR27","unstructured":"SC score repository (2023) https:\/\/github.com\/connorcoley\/scscore. Accessed 28 Feb 2023"},{"key":"742_CR28","unstructured":"SA score repository (2023) https:\/\/github.com\/EricTing\/SAscore. Accessed 28 Feb 2023"},{"key":"742_CR29","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. 
https:\/\/arxiv.org\/abs\/1502.03167"},{"key":"742_CR30","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929\u20131958","journal-title":"J Mach Learn Res"},{"key":"742_CR31","unstructured":"Kingma D, Ba J (2014) Adam: a method for stochastic optimization. In: International Conference on Learning Representations"},{"key":"742_CR32","unstructured":"BenevolentAI GuacaMol GitHub. (2023) https:\/\/github.com\/BenevolentAI\/guacamol\/. Accessed 3 Mar 2023"},{"key":"742_CR33","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1016\/S0022-2496(02)00028-7","volume":"47","author":"IJ Myung","year":"2003","unstructured":"Myung IJ (2003) Tutorial on maximum likelihood estimation. J Math Psychol 47:90\u2013100","journal-title":"J Math Psychol"},{"key":"742_CR34","unstructured":"Lamb A, Goyal A, Zhang Y, Zhang S, Courville A, Bengio Y (2016) Professor forcing: a new algorithm for training recurrent networks. https:\/\/arxiv.org\/abs\/1610.09038 [stat.ML]"},{"key":"742_CR35","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1038\/nchem.1243","volume":"4","author":"R Bickerton","year":"2012","unstructured":"Bickerton R, Paolini G, Besnard J, Muresan S, Hopkins A (2012) Quantifying the chemical beauty of drugs. Nat Chem 4:90\u20138","journal-title":"Nat Chem"},{"key":"742_CR36","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1007\/978-1-4419-9863-7_209","volume-title":"Encyclopedia of systems biology","author":"F Melo","year":"2013","unstructured":"Melo F (2013) Encyclopedia of systems biology. Springer, New York, pp 38\u201339"},{"key":"742_CR37","doi-asserted-by":"publisher","first-page":"2887","DOI":"10.1021\/jm9602928","volume":"39","author":"GW Bemis","year":"1996","unstructured":"Bemis GW, Murcko MA (1996) The properties of known drugs. 1. 
Molecular frameworks. J Med Chem 39:2887\u20132893","journal-title":"J Med Chem"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00742-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00742-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00742-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,20]],"date-time":"2023-11-20T18:19:56Z","timestamp":1700504396000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00742-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,19]]},"references-count":37,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["742"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00742-8","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-2521462\/v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,19]]},"assertion":[{"value":"27 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 August 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors are employees at IKTOS. 
The authors declare no competing interests in relationship with this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"83"}}