{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,9]],"date-time":"2024-09-09T15:57:39Z","timestamp":1725897459322},"reference-count":60,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T00:00:00Z","timestamp":1700524800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T00:00:00Z","timestamp":1700524800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"While a multitude of deep generative models have recently emerged there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand prospective validation is expensive and then often biased by the human selection process. In this case study, we frame retrospective validation as the ability to mimic human drug design, by answering the following question: Can a generative model trained on early-stage project compounds generate middle\/late-stage compounds de novo? To this end, we used experimental data that contains the elapsed time of a synthetic expansion following hit identification from five public (where the time series was pre-processed to better reflect realistic synthetic expansions) and six in-house project datasets, and used REINVENT as a widely adopted RNN-based generative model.
After splitting the dataset and training REINVENT on early-stage compounds, we found that rediscovery of middle\/late-stage compounds was much higher in public projects (at 1.60%, 0.64%, and 0.21% of the top 100, 500, and 5000 scored generated compounds) than in in-house projects (where the values were 0.00%, 0.03%, and 0.04%, respectively). Similarly, average single nearest neighbour similarity between early- and middle\/late-stage compounds in public projects was higher between active compounds than inactive compounds; however, for in-house projects the converse was true, which makes rediscovery (if so desired) more difficult. We hence show that the generative model recovers very few middle\/late-stage compounds from real-world drug discovery projects, highlighting the fundamental difference between purely algorithmic design and drug discovery as a real-world process. Evaluating de novo compound design approaches appears, based on the current study, difficult or even impossible to do retrospectively. Scientific Contribution: This contribution hence illustrates aspects of evaluating the performance of generative models in a real-world setting which have not been extensively described previously and which hopefully contribute to their further future development.","DOI":"10.1186\/s13321-023-00781-1","type":"journal-article","created":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T19:09:05Z","timestamp":1700593745000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["On the difficulty of validating molecular generative models realistically: a case study on public and proprietary data"],"prefix":"10.1186","volume":"15","author":[{"given":"Koichi","family":"Handa","sequence":"first","affiliation":[]},{"given":"Morgan 
C.","family":"Thomas","sequence":"additional","affiliation":[]},{"given":"Michiharu","family":"Kageyama","sequence":"additional","affiliation":[]},{"given":"Takeshi","family":"Iijima","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Bender","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,11,21]]},"reference":[{"key":"781_CR1","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","volume":"4","author":"R G\u00f3mez-Bombarelli","year":"2018","unstructured":"G\u00f3mez-Bombarelli R et al (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4:268\u2013276","journal-title":"ACS Cent Sci"},{"key":"781_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-1-0716-1787-8_1","volume":"2390","author":"M Thomas","year":"2022","unstructured":"Thomas M et al (2022) Applications of artificial intelligence in drug design: opportunities and challenges. Methods Mol Bio 2390:1\u201359","journal-title":"Methods Mol Bio"},{"key":"781_CR3","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0147215","volume":"11","author":"JW Scannell","year":"2016","unstructured":"Scannell JW, Bosley J (2016) When quality beats quantity: decision theory, drug discovery, and the reproducibility crisis. PLoS ONE 11:e0147215","journal-title":"PLoS ONE"},{"key":"781_CR4","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1016\/j.drudis.2011.09.012","volume":"17","author":"AT Plowright","year":"2012","unstructured":"Plowright AT et al (2012) Hypothesis driven drug design: improving quality and effectiveness of the design-make-test-analyse cycle. 
Drug Discovery Today 17:56\u201362","journal-title":"Drug Discovery Today"},{"key":"781_CR5","first-page":"101","volume":"236","author":"DJ Danziger","year":"1989","unstructured":"Danziger DJ, Dean PM (1989) Automated site-directed drug design: a general algorithm for knowledge acquisition about hydrogen-bonding regions at protein surfaces. Proceed Royal Soc London Series B Bio Sci 236:101\u2013113","journal-title":"Proceed Royal Soc London Series B Bio Sci"},{"key":"781_CR6","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1023\/A:1008108423895","volume":"14","author":"D Douguet","year":"2000","unstructured":"Douguet D, Thoreau E, Grassy G (2000) A genetic algorithm for the automated generation of small organic molecules: drug design using an evolutionary algorithm. J Comput Aided Mol Des 14:449\u2013466","journal-title":"J Comput Aided Mol Des"},{"key":"781_CR7","doi-asserted-by":"publisher","first-page":"487","DOI":"10.1023\/A:1008184403558","volume":"14","author":"G Schneider","year":"2000","unstructured":"Schneider G, Lee ML, Stahl M, Schneider P (2000) De novo design of molecular architectures by evolutionary assembly of drug-derived building blocks. J Comput Aided Mol Des 14:487\u2013494","journal-title":"J Comput Aided Mol Des"},{"key":"781_CR8","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1038\/s42256-022-00463-x","volume":"4","author":"M Pandey","year":"2022","unstructured":"Pandey M et al (2022) The transformational role of GPU computing and deep learning in drug discovery. Nature Machine Intelligence 4:211\u2013221","journal-title":"Nature Machine Intelligence"},{"key":"781_CR9","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1080\/17460441.2018.1465407","volume":"13","author":"E Gawehn","year":"2018","unstructured":"Gawehn E, Hiss JA, Brown JB, Schneider G (2018) Advancing drug discovery via GPU-based deep learning. 
Expert Opin Drug Discov 13:579\u2013582","journal-title":"Expert Opin Drug Discov"},{"key":"781_CR10","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1038\/s41573-019-0024-5","volume":"18","author":"J Vamathevan","year":"2019","unstructured":"Vamathevan J et al (2019) Applications of machine learning in drug discovery and development. Nat Rev Drug Discovery 18:463\u2013477","journal-title":"Nat Rev Drug Discovery"},{"key":"781_CR11","volume":"3","author":"M Vogt","year":"2023","unstructured":"Vogt M (2023) Exploring chemical space\u2014Generative models and their evaluation. Artifi Int Life Sci 3:100064","journal-title":"Artifi Int Life Sci"},{"key":"781_CR12","doi-asserted-by":"publisher","DOI":"10.3389\/fphar.2020.565644","volume":"11","author":"D Polykovskiy","year":"2020","unstructured":"Polykovskiy D et al (2020) Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front Pharmacol 11:565644","journal-title":"Front Pharmacol"},{"key":"781_CR13","doi-asserted-by":"publisher","first-page":"1736","DOI":"10.1021\/acs.jcim.8b00234","volume":"58","author":"K Preuer","year":"2018","unstructured":"Preuer K, Renz P, Unterthiner T, Hochreiter S, Klambauer G (2018) Fr\u00e9chet ChemNet distance: a metric for generative models for molecules in drug discovery. J Chem Inf Model 58:1736\u20131741","journal-title":"J Chem Inf Model"},{"key":"781_CR14","doi-asserted-by":"publisher","first-page":"428","DOI":"10.1038\/s41570-022-00391-9","volume":"6","author":"A Bender","year":"2022","unstructured":"Bender A et al (2022) Evaluation guidelines for machine learning tools in the chemical sciences. 
Nat Rev Chem 6:428\u2013442","journal-title":"Nat Rev Chem"},{"key":"781_CR15","unstructured":"https:\/\/cache-challenge.org\/ (access date: December 2nd, 2022)"},{"key":"781_CR16","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.1021\/acs.jcim.8b00839","volume":"59","author":"N Brown","year":"2019","unstructured":"Brown N, Fiscato M, Segler MHS, Vaucher AC (2019) GuacaMol: benchmarking models for de novo molecular design. J Chem Inf Model 59:1096\u20131108","journal-title":"J Chem Inf Model"},{"key":"781_CR17","doi-asserted-by":"publisher","first-page":"D945","DOI":"10.1093\/nar\/gkw1074","volume":"45","author":"A Gaulton","year":"2017","unstructured":"Gaulton A et al (2017) The ChEMBL database in 2017. Nucleic Acids Res 45:D945\u2013D954","journal-title":"Nucleic Acids Res"},{"key":"781_CR18","unstructured":"Thomas M, O\u2019Boyle NM, Bender A, De Graaf C (2022) Re-evaluating sample efficiency in de novo molecule generation. https:\/\/arxiv.org\/abs\/2212.01385."},{"key":"781_CR19","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1021\/ci400084k","volume":"53","author":"RP Sheridan","year":"2013","unstructured":"Sheridan RP (2013) Time-split cross-validation as a method for estimating the goodness of prospective prediction. J Chem Inf Model 53:783\u2013790","journal-title":"J Chem Inf Model"},{"key":"781_CR20","doi-asserted-by":"publisher","first-page":"1040","DOI":"10.1016\/j.drudis.2020.11.037","volume":"26","author":"A Bender","year":"2021","unstructured":"Bender A, Cortes-Ciriano I (2021) Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 2: a discussion of chemical and biological data. 
Drug Discovery Today 26:1040\u20131052","journal-title":"Drug Discovery Today"},{"key":"781_CR21","doi-asserted-by":"publisher","DOI":"10.1021\/acs.jcim.2c00785","author":"M Beckers","year":"2022","unstructured":"Beckers M, Fechner N, Stiefl N (2022) 25 years of small-molecule optimization at novartis: a retrospective analysis of chemical series evolution. J Chem Inf Model. https:\/\/doi.org\/10.1021\/acs.jcim.2c00785","journal-title":"J Chem Inf Model"},{"key":"781_CR22","doi-asserted-by":"publisher","first-page":"3166","DOI":"10.1021\/acs.jcim.9b00325","volume":"59","author":"N St\u00e5hl","year":"2019","unstructured":"St\u00e5hl N, Falkman G, Karlsson A, Mathiason G, Bostr\u00f6m J (2019) Deep reinforcement learning for multiparameter optimization in de novo drug design. J Chem Inf Model 59:3166\u20133176","journal-title":"J Chem Inf Model"},{"key":"781_CR23","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1186\/s13321-021-00497-0","volume":"13","author":"J He","year":"2021","unstructured":"He J et al (2021) Molecular optimization by capturing chemist\u2019s intuition using deep neural networks. J Cheminformat 13:26","journal-title":"J Cheminformat"},{"key":"781_CR24","doi-asserted-by":"publisher","first-page":"198","DOI":"10.1016\/j.drudis.2008.10.007","volume":"14","author":"J Delaney","year":"2009","unstructured":"Delaney J (2009) Modelling iterative compound optimisation using a self-avoiding walk. Drug Discov Today 14:198\u2013207","journal-title":"Drug Discov Today"},{"key":"781_CR25","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1186\/s13321-017-0235-x","volume":"9","author":"M Olivecrona","year":"2017","unstructured":"Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. 
J Cheminformat 9:48","journal-title":"J Cheminformat"},{"key":"781_CR26","doi-asserted-by":"publisher","first-page":"5918","DOI":"10.1021\/acs.jcim.0c00915","volume":"60","author":"T Blaschke","year":"2020","unstructured":"Blaschke T et al (2020) REINVENT 2.0: an ai tool for de novo drug design. J Chem Inf Model 60:5918\u20135922","journal-title":"J Chem Inf Model"},{"key":"781_CR27","doi-asserted-by":"publisher","first-page":"7885","DOI":"10.1126\/sciadv.aap7885","volume":"4","author":"M Popova","year":"2018","unstructured":"Popova M, Isayev O, Tropsha A (2018) Deep reinforcement learning for de novo drug design. Sci Adv 4:7885","journal-title":"Sci Adv"},{"key":"781_CR28","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1166\/jctn.2020.8648","volume":"17","author":"M Sewak","year":"2020","unstructured":"Sewak M, Sahay SK, Rathore H (2020) An overview of deep learning architecture of deep neural networks and autoencoders. J Comput Theor Nanosci 17:182\u2013188","journal-title":"J Comput Theor Nanosci"},{"key":"781_CR29","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1016\/j.neunet.2019.04.024","volume":"117","author":"T Bouwmans","year":"2019","unstructured":"Bouwmans T, Javed S, Sultana M, Jung SK (2019) Deep neural network concepts for background subtraction: a systematic review and comparative evaluation. Neural Networ 117:8\u201366","journal-title":"Neural Networ"},{"key":"781_CR30","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1007\/s10822-016-9938-8","volume":"30","author":"S Kearnes","year":"2016","unstructured":"Kearnes S, McCloskey K, Berndl M, Pande V, Riley P (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595\u2013608","journal-title":"J Comput Aided Mol Des"},{"key":"781_CR31","first-page":"11973","volume":"1805","author":"T De Cao","year":"2018","unstructured":"De Cao T, Kipf T (2018) MolGAN: an implicit generative model for small molecular graphs. 
arXiv 1805:11973","journal-title":"arXiv"},{"key":"781_CR32","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735\u20131780","journal-title":"Neural Comput"},{"key":"781_CR33","unstructured":"Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. 1\u20139."},{"key":"781_CR34","first-page":"03762","volume":"1706","author":"A Vaswani","year":"2017","unstructured":"Vaswani A et al (2017) Attention is all you need. arXiv 1706:03762","journal-title":"arXiv"},{"key":"781_CR35","first-page":"07449","volume":"1712","author":"P Ertl","year":"2017","unstructured":"Ertl P, Lewis R, Martin E, Polyakov V (2017) In silico generation of novel, drug-like chemical matter using the LSTM neural network. arXiv 1712:07449","journal-title":"arXiv"},{"key":"781_CR36","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1186\/s13321-022-00599-3","volume":"14","author":"J He","year":"2022","unstructured":"He J et al (2022) Transformer-based molecular optimization beyond matched molecular pairs. J Cheminformat 14:18","journal-title":"J Cheminformat"},{"key":"781_CR37","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1186\/s13321-021-00563-7","volume":"13","author":"J Guo","year":"2021","unstructured":"Guo J et al (2021) DockStream: a docking wrapper to enhance de novo molecular design. J Cheminformat 13:89","journal-title":"J Cheminformat"},{"key":"781_CR38","doi-asserted-by":"publisher","first-page":"7331","DOI":"10.1021\/acs.jpca.1c04587","volume":"125","author":"G Marques","year":"2021","unstructured":"Marques G et al (2021) De Novo design of molecules with low hole reorganization energy based on a quarter-million molecule DFT screen. 
J Phys Chem A 125:7331\u20137343","journal-title":"J Phys Chem A"},{"key":"781_CR39","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1186\/s13321-021-00516-0","volume":"13","author":"M Thomas","year":"2021","unstructured":"Thomas M, Smith RT, O\u2019Boyle NM, de Graaf C, Bender A (2021) Comparison of structure- and ligand-based scoring functions for deep generative models: a GPCR case study. J Cheminformat 13:39","journal-title":"J Cheminformat"},{"key":"781_CR40","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1186\/s13321-022-00646-z","volume":"14","author":"M Thomas","year":"2022","unstructured":"Thomas M, O\u2019Boyle NM, Bender A, de Graaf C (2022) Augmented Hill-Climb increases reinforcement learning efficiency for language-based de novo molecule generation. J Cheminformat 14:68","journal-title":"J Cheminformat"},{"key":"781_CR41","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1007\/s10822-021-00392-8","volume":"36","author":"T Blaschke","year":"2022","unstructured":"Blaschke T, Bajorath J (2022) Fine-tuning of a generative neural network for designing multi-target compounds. J Comput Aided Mol Des 36:363\u2013371","journal-title":"J Comput Aided Mol Des"},{"key":"781_CR42","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1186\/s13321-020-00473-0","volume":"12","author":"T Blaschke","year":"2020","unstructured":"Blaschke T, Engkvist O, Bajorath J, Chen H (2020) Memory-assisted reinforcement learning for diverse molecular de novo design. J Cheminformat 12:68","journal-title":"J Cheminformat"},{"key":"781_CR43","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1248\/cpb.c19-00625","volume":"68","author":"A Yoshimori","year":"2020","unstructured":"Yoshimori A, Kawasaki E, Kanai C, Tasaka T (2020) Strategies for design of molecular structures with a desired pharmacophore using deep reinforcement learning. 
Chem Pharm Bull 68:227\u2013233","journal-title":"Chem Pharm Bull"},{"key":"781_CR44","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/s13321-017-0203-5","volume":"9","author":"J Sun","year":"2017","unstructured":"Sun J et al (2017) ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J Cheminformat 9:17","journal-title":"J Cheminformat"},{"key":"781_CR45","doi-asserted-by":"publisher","first-page":"D10","DOI":"10.1093\/nar\/gkaa892","volume":"49","author":"EW Sayers","year":"2021","unstructured":"Sayers EW et al (2021) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 49:D10\u2013D17. https:\/\/doi.org\/10.1093\/nar\/gkaa892","journal-title":"Nucleic Acids Res"},{"key":"781_CR46","doi-asserted-by":"publisher","first-page":"460","DOI":"10.1021\/ci500588j","volume":"55","author":"T Sander","year":"2015","unstructured":"Sander T, Freyss J, von Korff M, Rufener C (2015) DataWarrior: an open-source program for chemistry aware data visualization and analysis. J Chem Inf Model 55:460\u2013473","journal-title":"J Chem Inf Model"},{"key":"781_CR47","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/s13321-015-0061-y","volume":"7","author":"P Ertl","year":"2015","unstructured":"Ertl P, Patiny L, Sander T, Rufener C, Zasso M (2015) Wikipedia chemical structure explorer: substructure and similarity searching of molecules from Wikipedia. J Cheminformat 7:10","journal-title":"J Cheminformat"},{"key":"781_CR48","unstructured":"RD-kit: https:\/\/www.rdkit.org\/docs\/index.html# Access 5 June 2023"},{"key":"781_CR49","doi-asserted-by":"publisher","first-page":"5343","DOI":"10.1021\/acs.jcim.0c01496","volume":"61","author":"T Sousa","year":"2021","unstructured":"Sousa T, Correia J, Pereira V, Rocha M (2021) Generative deep learning for targeted compound design. 
J Chem Inf Model 61:5343\u20135361","journal-title":"J Chem Inf Model"},{"key":"781_CR50","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742\u2013754","journal-title":"J Chem Inf Model"},{"key":"781_CR51","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45:5\u201332","journal-title":"Mach Learn"},{"key":"781_CR52","first-page":"14500","volume":"2203","author":"Y Du","year":"2022","unstructured":"Du Y, Fu T, Sun J, Liu S (2022) MolGenSurvey: a systematic survey in machine learning models for molecule design. arXiv 2203:14500","journal-title":"arXiv"},{"key":"781_CR53","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1007\/s10822-023-00512-6","volume":"37","author":"EJ Bjerrum","year":"2023","unstructured":"Bjerrum EJ, Margreitter C, Blaschke T, de Castro RL-R (2023) Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES. J Comput Aided Mol Des 37:373\u2013394","journal-title":"J Comput Aided Mol Des"},{"key":"781_CR54","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1021\/acscentsci.7b00512","volume":"4","author":"MHS Segler","year":"2018","unstructured":"Segler MHS, Kogej T, Tyrchan C, Waller MP (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent Sci 4:120\u2013131","journal-title":"ACS Cent Sci"},{"key":"781_CR55","doi-asserted-by":"publisher","first-page":"4863","DOI":"10.1021\/acs.jcim.2c00838","volume":"62","author":"SR Atance","year":"2022","unstructured":"Atance SR, Diez JV, Engkvist O, Olsson S, Mercado R (2022) De Novo drug design using reinforcement learning with graph-based deep generative models. 
J Chem Inf Model 62:4863\u20134872","journal-title":"J Chem Inf Model"},{"key":"781_CR56","doi-asserted-by":"publisher","DOI":"10.12688\/f1000research.8357.2","author":"S Jasial","year":"2016","unstructured":"Jasial S, Hu Y, Vogt M, Bajorath J (2016) Activity-relevant similarity values for fingerprints and implications for similarity searching. F1000Research. https:\/\/doi.org\/10.12688\/f1000research.8357.2","journal-title":"F1000Research"},{"key":"781_CR57","doi-asserted-by":"publisher","first-page":"3256","DOI":"10.1039\/b409865j","volume":"2","author":"J Hert","year":"2004","unstructured":"Hert J et al (2004) Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org Biomol Chem 2:3256\u20133266","journal-title":"Org Biomol Chem"},{"key":"781_CR58","doi-asserted-by":"publisher","first-page":"2623","DOI":"10.1021\/acs.jcim.1c00160","volume":"28","author":"C Esposito","year":"2021","unstructured":"Esposito C, Landrum GA, Schneider N, Stiefl N, Riniker S (2021) GHOST: adjusting the decision threshold to handle imbalanced data in machine learning. J Chem Inf Model 28:2623\u20132640","journal-title":"J Chem Inf Model"},{"key":"781_CR59","doi-asserted-by":"publisher","first-page":"1194","DOI":"10.1021\/acs.jcim.7b00690","volume":"58","author":"E Putin","year":"2018","unstructured":"Putin E et al (2018) Reinforced adversarial neural computer for de novo molecular design. J Chem Inf Model 58:1194\u20131204","journal-title":"J Chem Inf Model"},{"key":"781_CR60","doi-asserted-by":"publisher","first-page":"5107","DOI":"10.1021\/acs.jcim.3c00963","volume":"63","author":"G Lamanna","year":"2023","unstructured":"Lamanna G et al (2023) GENERA: a combined genetic\/deep-learning algorithm for multiobjective target-oriented de novo design. 
J Chem Inf Model 63:5107\u20135119","journal-title":"J Chem Inf Model"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00781-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00781-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00781-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T09:27:04Z","timestamp":1700645224000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00781-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,21]]},"references-count":60,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["781"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00781-1","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,21]]},"assertion":[{"value":"6 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 November 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 November 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing 
interests"}}],"article-number":"112"}}