Abstract
The literature describes a variety of methods for assigning peptide sequences to observed tandem MS data. Typically, the identified peptides are reported only with an arbitrary score that reflects the quality of the peptide-spectrum match, not with a statistically meaningful significance measure. In this chapter, we discuss why statistical significance measures can simplify and unify the interpretation of MS-based proteomic experiments. We also present available software solutions that convert scores into sound statistical measures.
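To make the idea of converting an arbitrary score into a statistical measure concrete, here is a minimal sketch of one widely used approach, target-decoy q-value estimation. It is an illustration under stated assumptions, not necessarily the procedure the chapter or any particular software package uses: the function name `decoy_q_values` and the input layout are hypothetical, higher scores are assumed to be better, targets and decoys are assumed to be searched together in equal-sized databases, and tied scores are not treated specially.

```python
from typing import List, Tuple


def decoy_q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Attach a q-value to each peptide-spectrum match (PSM).

    psms: (score, is_decoy) pairs; a higher score is assumed to mean a better match.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    # Rank all matches from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Walk down the ranking and estimate the FDR at each score threshold
    # as (decoys accepted) / (targets accepted), capped at 1.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # The q-value of a match is the smallest FDR at which it would still be
    # accepted, i.e. the running minimum taken from the worst score upwards.
    q_values = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, q_values)]


if __name__ == "__main__":
    # Made-up scores for illustration only.
    example = [(10.0, False), (9.0, False), (8.0, True),
               (7.0, False), (6.0, False), (5.0, True)]
    for score, is_decoy, q in decoy_q_values(example):
        print(f"score={score:5.1f} decoy={is_decoy!s:5} q={q:.2f}")
```

Unlike the raw score, the resulting q-value has the same meaning whichever search engine produced the match, which is what allows results from different experiments to be interpreted on a common scale.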
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Brosch, M., Choudhary, J. (2010). Scoring and Validation of Tandem MS Peptide Identification Methods. In: Hubbard, S., Jones, A. (eds) Proteome Bioinformatics. Methods in Molecular Biology™, vol 604. Humana Press. https://doi.org/10.1007/978-1-60761-444-9_4
DOI: https://doi.org/10.1007/978-1-60761-444-9_4
Publisher Name: Humana Press
Print ISBN: 978-1-60761-443-2
Online ISBN: 978-1-60761-444-9
eBook Packages: Springer Protocols