{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,12,4]],"date-time":"2024-12-04T18:46:38Z","timestamp":1733337998718,"version":"3.30.1"},"update-to":[{"updated":{"date-parts":[[2021,12,20]],"date-time":"2021-12-20T00:00:00Z","timestamp":1639958400000},"DOI":"10.1371\/journal.pcbi.1009682","type":"new_version","label":"New version"}],"reference-count":24,"publisher":"Public Library of Science (PLoS)","issue":"12","license":[{"start":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T00:00:00Z","timestamp":1638921600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100014718","name":"Innovative Research Group Project of the National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31770821; 32071430"],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"Many computational classifiers have been developed to predict different types of post-translational modification sites. Their performances are measured using cross-validation or independent test, in which experimental data from different sources are mixed and randomly split into training and test sets. However, the self-reported performances of most classifiers based on this measure are generally higher than their performances in the application of new experimental data. It suggests that the cross-validation method overestimates the generalization ability of a classifier. Here, we proposed a generalization estimate method, dubbed experiment-split test, where the experimental sources for the training set are different from those for the test set that simulate the data derived from a new experiment. 
We took the prediction of lysine methylome (Kme) as an example and developed a deep learning-based Kme site predictor (called DeepKme) with outstanding performance. We assessed the experiment-split test by comparing it with the cross-validation method. We found that the performance measured using the experiment-split test is lower than that measured in terms of cross-validation. As the test data of the experiment-split method were derived from an independent experimental source, this method could reflect the generalization of the predictor. Therefore, we believe that the experiment-split method can be applied to benchmark the practical performance of a given PTM model. DeepKme is freely accessible via https:\/\/github.com\/guoyangzou\/DeepKme.","DOI":"10.1371\/journal.pcbi.1009682","type":"journal-article","created":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T18:44:29Z","timestamp":1638989069000},"page":"e1009682","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":4,"title":["Development of an experiment-split method for benchmarking the generalization of a PTM site predictor: Lysine methylome as an 
example"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0736-8759","authenticated-orcid":true,"given":"Guoyang","family":"Zou","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8569-8780","authenticated-orcid":true,"given":"Yang","family":"Zou","sequence":"additional","affiliation":[]},{"given":"Chenglong","family":"Ma","sequence":"additional","affiliation":[]},{"given":"Jiaojiao","family":"Zhao","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0266-8939","authenticated-orcid":true,"given":"Lei","family":"Li","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2021,12,8]]},"reference":[{"key":"pcbi.1009682.ref001","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1038\/nrm.2017.35","article-title":"The winding path of protein methylation research: milestones and new frontiers","volume":"18","author":"J Murn","year":"2017","journal-title":"Nature Reviews Molecular Cell Biology"},{"key":"pcbi.1009682.ref002","first-page":"1","article-title":"Intrinsic Disorder and Protein Modifications: Building an SVM Predictor for Methylation.","author":"KM Daily","year":"2005","journal-title":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology"},{"key":"pcbi.1009682.ref003","doi-asserted-by":"crossref","first-page":"2525","DOI":"10.1093\/bioinformatics\/bti333","article-title":"AutoMotif server: prediction of single residue post-translational modifications in proteins","volume":"21","author":"D Plewczynski","year":"2005","journal-title":"Bioinformatics"},{"key":"pcbi.1009682.ref004","doi-asserted-by":"crossref","first-page":"2267","DOI":"10.1093\/bib\/bby089","article-title":"Large-scale comparative assessment of computational predictors for lysine post-translational modification sites","volume":"20","author":"Z Chen","year":"2019","journal-title":"Brief 
Bioinformatics"},{"key":"pcbi.1009682.ref005","doi-asserted-by":"crossref","first-page":"W140","DOI":"10.1093\/nar\/gkaa275","article-title":"MusiteDeep: a deep-learning based webserver for protein post-translational modification site prediction and visualization","volume":"48","author":"D Wang","year":"2020","journal-title":"Nucleic Acids Research"},{"key":"pcbi.1009682.ref006","doi-asserted-by":"crossref","first-page":"e1006494","DOI":"10.1371\/journal.pcbi.1006494","article-title":"Putting benchmarks in their rightful place: The heart of computational biology","volume":"14","author":"B Peters","year":"2018","journal-title":"PLOS Computational Biology"},{"key":"pcbi.1009682.ref007","doi-asserted-by":"crossref","first-page":"e1007967","DOI":"10.1371\/journal.pcbi.1007967","article-title":"Assessing predictors for new post translational modification sites: A case study on hydroxylation.","volume":"16","author":"D Piovesan","year":"2020","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1009682.ref008","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1042\/bse0520165","article-title":"Prediction of lysine post-translational modifications using bioinformatic tools","volume":"52","author":"D. 
Schwartz","year":"2012","journal-title":"Essays Biochem"},{"key":"pcbi.1009682.ref009","first-page":"647","article-title":"Computational prediction of methylation types of covalently modified lysine and arginine residues in proteins","volume":"18","author":"W Deng","year":"2017","journal-title":"Brief Bioinformatics"},{"key":"pcbi.1009682.ref010","doi-asserted-by":"crossref","first-page":"D542","DOI":"10.1093\/nar\/gkx1104","article-title":"iPTMnet: an integrated resource for protein post-translational modification network discovery","volume":"46","author":"H Huang","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009682.ref011","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1016\/j.jgg.2017.03.007","article-title":"PLMD: An updated data resource of protein lysine modifications","volume":"44","author":"H Xu","year":"2017","journal-title":"Journal of Genetics and Genomics"},{"key":"pcbi.1009682.ref012","doi-asserted-by":"crossref","first-page":"D512","DOI":"10.1093\/nar\/gku1267","article-title":"PhosphoSitePlus, 2014: mutations, PTMs and recalibrations","volume":"43","author":"PV Hornbeck","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009682.ref013","doi-asserted-by":"crossref","first-page":"D298","DOI":"10.1093\/nar\/gky1074","article-title":"dbPTM in 2019: exploring disease association and cross-talk of post-translational modifications","volume":"47","author":"K-Y Huang","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009682.ref014","doi-asserted-by":"crossref","first-page":"D43","DOI":"10.1093\/nar\/gks1068","article-title":"Update on activities at the Universal Protein Resource (UniProt) in 2013","volume":"41","author":"UniProt Consortium","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009682.ref015","doi-asserted-by":"crossref","first-page":"13876","DOI":"10.1021\/acs.analchem.8b02796","article-title":"Affinity Purification of Methyllysine Proteome by Site-Specific Covalent 
Conjugation","volume":"90","author":"R Wang","year":"2018","journal-title":"Anal Chem"},{"key":"pcbi.1009682.ref016","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"SF Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009682.ref017","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1093\/bioinformatics\/bty140","article-title":"iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences","volume":"34","author":"Z Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1009682.ref018","doi-asserted-by":"crossref","first-page":"16175","DOI":"10.1038\/s41598-019-52552-4","article-title":"Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method","volume":"9","author":"K-Y Huang","year":"2019","journal-title":"Sci Rep"},{"key":"pcbi.1009682.ref019","first-page":"8","article-title":"DeepCSO: A Deep-Learning Network Approach to Predicting Cysteine S-Sulphenylation Sites.","author":"X Lyu","year":"2020","journal-title":"Front Cell Dev Biol"},{"key":"pcbi.1009682.ref020","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/j.gpb.2018.08.004","article-title":"Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites","volume":"16","author":"Z Chen","year":"2018","journal-title":"Genomics, Proteomics & Bioinformatics"},{"key":"pcbi.1009682.ref021","doi-asserted-by":"crossref","first-page":"1669","DOI":"10.7150\/ijbs.27819","article-title":"BERMP: a cross-species classifier for predicting m6A sites by integrating a deep learning algorithm and a random forest approach","volume":"14","author":"Y Huang","year":"2018","journal-title":"Int J Biol 
Sci"},{"key":"pcbi.1009682.ref022","doi-asserted-by":"crossref","first-page":"49504","DOI":"10.1109\/ACCESS.2021.3068413","article-title":"DeepKcrot: A Deep-Learning Architecture for General and Species-Specific Lysine Crotonylation Site Prediction","volume":"9","author":"X Wei","year":"2021","journal-title":"IEEE Access."},{"key":"pcbi.1009682.ref023","first-page":"8","article-title":"DeepKhib: A Deep-Learning Framework for Lysine 2-Hydroxyisobutyrylation Sites Prediction.","author":"L Zhang","year":"2020","journal-title":"Front Cell Dev Biol"},{"key":"pcbi.1009682.ref024","doi-asserted-by":"crossref","first-page":"14244","DOI":"10.1109\/ACCESS.2020.2966592","article-title":"Identification of Protein Lysine Crotonylation Sites by a Deep Learning Framework With Convolutional Neural Networks","volume":"8","author":"Y Zhao","year":"2020","journal-title":"IEEE Access"}],"updated-by":[{"updated":{"date-parts":[[2021,12,20]],"date-time":"2021-12-20T00:00:00Z","timestamp":1639958400000},"DOI":"10.1371\/journal.pcbi.1009682","type":"new_version","label":"New version"}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009682","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,20]],"date-time":"2021-12-20T19:11:08Z","timestamp":1640027468000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009682"}},"subtitle":[],"editor":[{"given":"Edwin","family":"Wang","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,12,8]]},"references-count":24,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12,8]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009682","relation":{},"ISSN":["1553-7358"],"issn-type":[{"type":"electronic","value":"1553-7358"}],"subject":[],"published":{"date-parts":[[2021,12,8]]}}}