{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,11]],"date-time":"2025-04-11T06:08:39Z","timestamp":1744351719620},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,7,1]]},"abstract":"Motivation: Ion mobility spectrometry (IMS) has gained significant traction over the past few years for rapid, high-resolution separations of analytes based upon gas-phase ion structure, with significant potential impacts in the field of proteomic analysis. IMS coupled with mass spectrometry (MS) affords multiple improvements over traditional proteomics techniques, such as in the elucidation of secondary structure information, identification of post-translational modifications, as well as higher identification rates with reduced experiment times. The high throughput nature of this technique benefits from accurate calculation of cross sections, mobilities and associated drift times of peptides, thereby enhancing downstream data analysis. Here, we present a model that uses physicochemical properties of peptides to accurately predict a peptide's drift time directly from its amino acid sequence. 
This model is used in conjunction with two mathematical techniques, a partial least squares regression and a support vector regression setting.\n Results: When tested on an experimentally created high confidence database of 8675 peptide sequences with measured drift times, both techniques statistically significantly outperform the intrinsic size parameters-based calculations, the currently held practice in the field, on all charge states (+2, +3 and +4).\n Availability: The software executable, imPredict, is available for download from http:\/\/omics.pnl.gov\/software\/imPredict.php\n Contact: rds@pnl.gov\n Supplementary information: Supplementary data are available at Bioinformatics online.","DOI":"10.1093\/bioinformatics\/btq245","type":"journal-article","created":{"date-parts":[[2010,5,22]],"date-time":"2010-05-22T00:55:20Z","timestamp":1274489720000},"page":"1601-1607","source":"Crossref","is-referenced-by-count":42,"title":["Machine learning based prediction for peptide drift times in ion mobility spectrometry"],"prefix":"10.1093","volume":"26","author":[{"given":"Anuj R.","family":"Shah","sequence":"first","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Khushbu","family":"Agarwal","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Erin 
S.","family":"Baker","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Mudita","family":"Singhal","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Anoop M.","family":"Mayampurath","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Yehia M.","family":"Ibrahim","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Lars J.","family":"Kangas","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific 
Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Matthew E.","family":"Monroe","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Rui","family":"Zhao","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Mikhail E.","family":"Belov","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Gordon A.","family":"Anderson","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]},{"given":"Richard D.","family":"Smith","sequence":"additional","affiliation":[{"name":"1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, 2 School of 
Informatics, Indiana University, Bloomington, IN 47408 and 3 National Security Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,5,21]]},"reference":[{"key":"2023012507554624700_B1","first-page":"93","article-title":"Predict protein-protein interaction using heuristic approaches","volume-title":"3rd International Conference on Intelligent Sensing and Information Processing.","author":"Agrawal","year":"2005"},{"key":"2023012507554624700_B2","doi-asserted-by":"crossref","first-page":"1176","DOI":"10.1016\/j.jasms.2007.03.031","article-title":"Ion mobility spectrometry-mass spectrometry performance using electrodynamic ion funnels and elevated drift gas pressures","volume":"18","author":"Baker","year":"2007","journal-title":"J. Amer. Soc. Mass Spectrom."},{"key":"2023012507554624700_B3","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1021\/pr900888b","article-title":"An LC-IMS-MS platform providing increased dynamic range for high-throughput proteomic studies","volume":"9","author":"Baker","year":"2009","journal-title":"J. Proteome Res."},{"key":"2023012507554624700_B4","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1093\/bioinformatics\/17.5.455","article-title":"Predicting protein-protein interactions from primary structure","volume":"17","author":"Bock","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012507554624700_B5","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1006\/jmbi.1998.1943","article-title":"Prediction of local structure in proteins using a library of sequence-structure motifs","volume":"281","author":"Bystroff","year":"1998","journal-title":"J. Mol. 
Biol."},{"key":"2023012507554624700_B6","author":"Chang","year":"2001","journal-title":"LIBSVM: a library for support vector machines."},{"key":"2023012507554624700_B7","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/S0893-6080(03)00169-2","article-title":"Practical selection of SVM parameters and noise estimation for SVM regression","volume":"17","author":"Cherkassky","year":"2004","journal-title":"Neural Netw."},{"key":"2023012507554624700_B8","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1073\/pnas.81.1.140","article-title":"The hydrophobic moment detects periodicity in protein hydrophobicity","volume":"81","author":"Eisenberg","year":"1984","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507554624700_B9","doi-asserted-by":"crossref","first-page":"35","DOI":"10.3233\/ISB-2009-0384","article-title":"SubCellProt: predicting subcellular localization using machine learning approaches","volume":"9","author":"Garg","year":"2009","journal-title":"In Silico Biol."},{"key":"2023012507554624700_B10","volume-title":"Combining SVMs with Various Feature Selection Strategies.","author":"Guyon","year":"2005"},{"key":"2023012507554624700_B11","doi-asserted-by":"crossref","first-page":"2046","DOI":"10.1093\/bioinformatics\/btm302","article-title":"POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions","volume":"23","author":"Hirose","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507554624700_B12","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1021\/ac9809175","article-title":"ESI\/Ion Trap\/Ion Mobility\/Time-of-Flight mass spectrometry for rapid and sensitive analysis of biomolecular mixtures","volume":"71","author":"Henderson","year":"1998","journal-title":"Anal. Chem."},{"key":"2023012507554624700_B13","first-page":"54","article-title":"Application of ridge analysis to regression problems","volume":"58","author":"Hoerl","year":"1962","journal-title":"Chem. Eng. 
Prog."},{"key":"2023012507554624700_B14","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1016\/0161-5890(83)90029-9","article-title":"A computer program for predicting protein antigenic determinants","volume":"20","author":"Hopp","year":"1983","journal-title":"Mol. Immunol."},{"key":"2023012507554624700_B15","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1186\/1471-2105-10-87","article-title":"Decon2LS: an open source software package for automated processing and visualization of high resolution mass spectrometry data","volume":"10","author":"Jaitly","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012507554624700_B16","doi-asserted-by":"crossref","first-page":"7397","DOI":"10.1021\/ac052197p","article-title":"Robust algorithm for alignment of liquid chromatography-mass spectrometry analyses in an accurate mass and time tag data analysis pipeline","volume":"78","author":"Jaitly","year":"2006","journal-title":"Anal. Chem."},{"key":"2023012507554624700_B17","first-page":"152","article-title":"Profile-based string kernels for remote homology detection and motif extraction","volume-title":"Computational Systems Bioinformatics Conference (CSB'04).","author":"Kuang","year":"2004"},{"key":"2023012507554624700_B18","article-title":"Simple probabilistic predictions for support vector regression","volume-title":"Technical Report","author":"Lin","year":"2004"},{"key":"2023012507554624700_B19","doi-asserted-by":"crossref","first-page":"1386","DOI":"10.1002\/qsar.200910075","article-title":"Prediction of ion drift times for a proteome-wide peptide set using partial least squares regression, least-squares support vector machine and Gaussian process","volume":"28","author":"Liu","year":"2009","journal-title":"QSAR Comb. 
Sci."},{"key":"2023012507554624700_B20","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1002\/3527602852","volume-title":"Transport Properties of Ions in Gases.","author":"Mason","year":"1988"},{"key":"2023012507554624700_B21","volume-title":"The Mobility and Diffusion of Ions in Gases.","author":"McDaniel","year":"1973"},{"key":"2023012507554624700_B22","doi-asserted-by":"crossref","first-page":"2021","DOI":"10.1093\/bioinformatics\/btm281","article-title":"VIPER: an advanced software package to support high-throughput LC-MS peptide identification","volume":"23","author":"Monroe","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507554624700_B23","doi-asserted-by":"crossref","first-page":"1207","DOI":"10.1093\/bioinformatics\/btl055","article-title":"An ensemble of K-local hyperplanes for predicting protein-protein interactions","volume":"22","author":"Nanni","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012507554624700_B24","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/j.aca.2005.11.038","article-title":"Sensitivity and specificity of PLS-class modelling for five sensory characteristics of dry-cured ham using visible and near infrared spectroscopy","volume":"558","author":"Ortiz","year":"2006","journal-title":"Anal. Chim. Acta"},{"key":"2023012507554624700_B25","doi-asserted-by":"crossref","first-page":"621","DOI":"10.2144\/04374RV01","article-title":"Proteomic analyses using an accurate mass and time tag strategy","volume":"37","author":"Pasa-Tolic","year":"2004","journal-title":"Biotechniques"},{"key":"2023012507554624700_B26","doi-asserted-by":"crossref","first-page":"1039","DOI":"10.1021\/ac0205154","article-title":"Use of artificial neural networks for the accurate prediction of peptide liquid chromatography elution times in proteome analyses","volume":"75","author":"Petritis","year":"2003","journal-title":"Anal. 
Chem."},{"key":"2023012507554624700_B27","doi-asserted-by":"crossref","first-page":"5026","DOI":"10.1021\/ac060143p","article-title":"Improved peptide elution time prediction for reversed-phase liquid chromatography-MS by incorporating peptide sequence information","volume":"78","author":"Petritis","year":"2006","journal-title":"Anal. Chem."},{"key":"2023012507554624700_B28","doi-asserted-by":"crossref","first-page":"59","DOI":"10.2307\/2685263","article-title":"Thirteen ways to look at the correlation coefficient","volume":"42","author":"Rodgers","year":"1988","journal-title":"Am. Statist."},{"key":"2023012507554624700_B29","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1016\/0022-2836(88)90642-0","article-title":"Hydrophobicity of the peptide C = O\u2026H-N hydrogen-bonded group","volume":"201","author":"Roseman","year":"1988","journal-title":"J. Mol. Biol."},{"key":"2023012507554624700_B30","doi-asserted-by":"crossref","first-page":"W321","DOI":"10.1093\/nar\/gkh377","article-title":"The PredictProtein server","volume":"32","author":"Rost","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012507554624700_B31","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1016\/S1387-3806(01)00583-8","article-title":"Analysis of protein mixtures by matrix-assisted laser desorption ionization-ion mobility-orthogonal-time-of-flight mass spectrometry","volume":"219","author":"Ruotolo","year":"2002","journal-title":"Int. J. Mass Spectrom."},{"key":"2023012507554624700_B32","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1023\/A:1009752403260","article-title":"On comparing classifiers: pitfalls to avoid and a recommended approach","volume":"1","author":"Salzberg","year":"1997","journal-title":"Data Min Knowl. 
Discov."},{"key":"2023012507554624700_B33","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1016\/S1044-0305(01)00269-0","article-title":"Prediction of peptide ion mobilities via a priori calculations from intrinsic size parameters of amino acid residues","volume":"12","author":"Shvartsburg","year":"2001","journal-title":"J. Am. Soc. Mass Spectrom."},{"key":"2023012507554624700_B34","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1023\/B:STCO.0000035301.49549.88","article-title":"A tutorial on support vector regression","volume":"14","author":"Smola","year":"2004","journal-title":"Stat. Comput."},{"key":"2023012507554624700_B35","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1023\/A:1018628609742","article-title":"Least squares support vector machine classifiers","volume":"9","author":"Suykens","year":"1999","journal-title":"Neural Proc. Lett."},{"key":"2023012507554624700_B36","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1007\/s00726-007-0616-y","article-title":"AAIndexLoc: predicting subcellular localization of proteins based on a new representation of sequences using amino acid indices","volume":"35","author":"Tantoso","year":"2008","journal-title":"Amino Acids"},{"key":"2023012507554624700_B37","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/11875741_3","article-title":"Promoter prediction using physico-chemical properties of DNA","volume-title":"Computational Life Sciences II","author":"Uren","year":"2006"},{"key":"2023012507554624700_B38","doi-asserted-by":"crossref","first-page":"1213","DOI":"10.1016\/S1044-0305(98)00101-9","article-title":"Gas-phase separations of protease digests","volume":"9","author":"Valentine","year":"1998","journal-title":"J. Amer. Soc. 
Mass Spectrom."},{"key":"2023012507554624700_B39","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1016\/S1044-0305(99)00079-3","article-title":"A database of 660 peptide ion cross sections: use of intrinsic size parameters for bona fide predictions of cross sections","volume":"10","author":"Valentine","year":"1999","journal-title":"J. Am. Soc. Mass Spectrom."},{"key":"2023012507554624700_B40","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1021\/jp983906o","article-title":"Intrinsic amino acid size parameters from a series of 113 lysine-terminated tryptic digest peptide ions","volume":"103","author":"Valentine","year":"1999","journal-title":"J. Phys. Chem. B"},{"key":"2023012507554624700_B41","volume-title":"The Nature of Statistical Learning.","author":"Vapnik","year":"1998"},{"key":"2023012507554624700_B42","doi-asserted-by":"crossref","first-page":"A1","DOI":"10.1186\/1471-2105-10-S7-A1","article-title":"Prediction of peptide drift time in ion mobility-mass spectrometry","volume":"10","author":"Wang","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012507554624700_B43","doi-asserted-by":"crossref","first-page":"1503","DOI":"10.1093\/bioinformatics\/btn218","article-title":"A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics","volume":"24","author":"Webb-Robertson","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012507554624700_B44","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S0169-7439(01)00155-1","article-title":"PLS-regression: a basic tool of chemometrics","volume":"58","author":"Wold","year":"2001","journal-title":"Chemometrics Intell. Lab. 
Syst."},{"key":"2023012507554624700_B45","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/j.jtbi.2008.01.028","article-title":"Remote protein homology detection using recurrence quantification analysis and amino acid physicochemical properties","volume":"252","author":"Yang","year":"2008","journal-title":"J. Theor. Biol."},{"key":"2023012507554624700_B46","doi-asserted-by":"crossref","first-page":"3908","DOI":"10.1021\/ac049951b","article-title":"Prediction of low-energy collision-induced dissociation spectra of peptides","volume":"76","author":"Zhang","year":"2004","journal-title":"Anal. Chem."},{"key":"2023012507554624700_B47","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/0022-5193(68)90069-6","article-title":"The characterization of amino acid sequences in proteins by statistical methods","volume":"21","author":"Zimmerman","year":"1968","journal-title":"J. Theor. Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/13\/1601\/48852071\/bioinformatics_26_13_1601.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/13\/1601\/48852071\/bioinformatics_26_13_1601.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:56:24Z","timestamp":1674633384000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/13\/1601\/201032"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,5,21]]},"references-count":47,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2010,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq245","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367
-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,7,1]]},"published":{"date-parts":[[2010,5,21]]}}}