{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,23]],"date-time":"2024-07-23T08:41:22Z","timestamp":1721724082783},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T00:00:00Z","timestamp":1644969600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T00:00:00Z","timestamp":1644969600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000066","name":"National Institute of Environmental Health Sciences","doi-asserted-by":"publisher","award":["U2CES030859","P30ES023515","R21ES030882","R01ES031117"],"id":[{"id":"10.13039\/100000066","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Cancer Institute","award":["UH2CA248974"]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2022,12]]},"abstract":"Unknown features in untargeted metabolomics and non-targeted analysis (NTA) are identified using fragment ions from MS\/MS spectra to predict the structures of the unknown compounds. Precursor ion selection for fragmentation is commonly performed using data dependent acquisition (DDA) strategies or following statistical analysis using targeted MS\/MS approaches. However, the selected precursor ions from DDA only cover a biased subset of the peaks or features found in full scan data. In addition, different statistical analyses can select different precursor ions for MS\/MS analysis, which makes the post-hoc validation of ions selected following a secondary analysis impossible for precursor ions selected by the original statistical method. 
Here we propose an automated, exhaustive, statistical model-free workflow, paired mass distance-dependent analysis (PMDDA), for reproducible untargeted mass spectrometry MS2 fragment ion collection of unknown compounds found in MS1 full scan. Our workflow first removes redundant peaks from MS1 data and then exports a list of precursor ions for pseudo-targeted MS\/MS analysis on independent peaks. This workflow provides comprehensive coverage of MS2 collection on unknown compounds found in full scan analysis using a \u201cone peak for one compound\u201d workflow without a priori redundant peak information. We compared pseudo-spectra formation and the number of MS2 spectra linked to MS1 data using the PMDDA workflow with those obtained using the CAMERA and RAMClustR algorithms. More annotated compounds, molecular networks, and unique MS\/MS spectra were found using PMDDA compared with CAMERA and RAMClustR. In addition, PMDDA can generate a preferred ion list for iterative DDA to enhance coverage of compounds when instruments support such functions. Finally, compounds with signals in both positive and negative modes can be identified by the PMDDA workflow, to further reduce redundancies. 
The whole workflow is fully reproducible as a docker image xcmsrocker with both the original data and the data processing template.","DOI":"10.1186\/s13321-022-00586-8","type":"journal-article","created":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T08:06:26Z","timestamp":1644998786000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Reproducible untargeted metabolomics workflow for exhaustive MS2 data acquisition of MS1 features"],"prefix":"10.1186","volume":"14","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-2804-6014","authenticated-orcid":false,"given":"Miao","family":"Yu","sequence":"first","affiliation":[]},{"given":"Georgia","family":"Dolios","sequence":"additional","affiliation":[]},{"given":"Lauren","family":"Petrick","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,2,16]]},"reference":[{"key":"586_CR1","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1038\/540153a","volume":"540","author":"M Fessenden","year":"2016","unstructured":"Fessenden M (2016) Metabolomics: small molecules, single cells. Nature 540:153\u2013155. https:\/\/doi.org\/10.1038\/540153a","journal-title":"Nature"},{"key":"586_CR2","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1038\/s41370-017-0012-y","volume":"28","author":"JR Sobus","year":"2018","unstructured":"Sobus JR, Wambaugh JF, Isaacs KK et al (2018) Integrating tools for non-targeted analysis research and chemical safety evaluations at the US EPA. J Expo Sci Environ Epidemiol 28:411\u2013426. 
https:\/\/doi.org\/10.1038\/s41370-017-0012-y","journal-title":"J Expo Sci Environ Epidemiol"},{"key":"586_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s42004-020-00403-z","volume":"3","author":"M Yu","year":"2020","unstructured":"Yu M, Petrick L (2020) Untargeted high-resolution paired mass distance data mining for retrieving general chemical relationships. Commun Chem 3:1\u20136. https:\/\/doi.org\/10.1038\/s42004-020-00403-z","journal-title":"Commun Chem"},{"key":"586_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.trac.2020.115918","volume":"128","author":"Y Tang","year":"2020","unstructured":"Tang Y, Craven CB, Wawryk NJP et al (2020) Advances in mass spectrometry-based omics analysis of trace organics in water. TrAC Trends Anal Chem 128:115918. https:\/\/doi.org\/10.1016\/j.trac.2020.115918","journal-title":"TrAC Trends Anal Chem"},{"key":"586_CR5","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1002\/jms.3782","volume":"51","author":"S Barnes","year":"2016","unstructured":"Barnes S, Benton HP, Casazza K et al (2016) Training in metabolomics research. I. Designing the experiment, collecting and extracting samples and generating metabolomics data. J Mass Spectrom 51:461\u2013475. https:\/\/doi.org\/10.1002\/jms.3782","journal-title":"J Mass Spectrom"},{"key":"586_CR6","doi-asserted-by":"publisher","first-page":"150","DOI":"10.1007\/s11306-019-1612-4","volume":"15","author":"KM Mendez","year":"2019","unstructured":"Mendez KM, Reinke SN, Broadhurst DI (2019) A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification. Metabolomics 15:150. 
https:\/\/doi.org\/10.1007\/s11306-019-1612-4","journal-title":"Metabolomics"},{"key":"586_CR7","doi-asserted-by":"publisher","first-page":"480","DOI":"10.1021\/acs.analchem.7b03929","volume":"90","author":"X Domingo-Almenara","year":"2018","unstructured":"Domingo-Almenara X, Montenegro-Burke JR, Benton HP, Siuzdak G (2018) Annotation: a computational solution for streamlining metabolomics analysis. Anal Chem 90:480\u2013489. https:\/\/doi.org\/10.1021\/acs.analchem.7b03929","journal-title":"Anal Chem"},{"key":"586_CR8","doi-asserted-by":"publisher","first-page":"e86","DOI":"10.1002\/cpbi.86","volume":"68","author":"J Chong","year":"2019","unstructured":"Chong J, Wishart DS, Xia J (2019) Using MetaboAnalyst 4.0 for comprehensive and integrative metabolomics data analysis. Curr Protoc Bioinformatics 68:e86. https:\/\/doi.org\/10.1002\/cpbi.86","journal-title":"Curr Protoc Bioinformatics"},{"key":"586_CR9","doi-asserted-by":"publisher","DOI":"10.1016\/j.teac.2020.e00099","volume":"28","author":"M Ljoncheva","year":"2020","unstructured":"Ljoncheva M, Stepi\u0161nik T, D\u017eeroski S, Kosjek T (2020) Cheminformatics in MS-based environmental exposomics: current achievements and future directions. Trends Environ Anal Chem 28:e00099. https:\/\/doi.org\/10.1016\/j.teac.2020.e00099","journal-title":"Trends Environ Anal Chem"},{"key":"586_CR10","doi-asserted-by":"publisher","first-page":"1202","DOI":"10.1021\/ac403385y","volume":"86","author":"X Zhu","year":"2014","unstructured":"Zhu X, Chen Y, Subramanian R (2014) Comparison of information-dependent acquisition, SWATH, and MSAll techniques in metabolite identification study employing ultrahigh-performance liquid chromatography-quadrupole time-of-flight mass spectrometry. Anal Chem 86:1202\u20131209. 
https:\/\/doi.org\/10.1021\/ac403385y","journal-title":"Anal Chem"},{"key":"586_CR11","doi-asserted-by":"publisher","first-page":"8072","DOI":"10.1021\/acs.analchem.9b05135","volume":"92","author":"J Guo","year":"2020","unstructured":"Guo J, Huan T (2020) Comparison of full-scan, data-dependent, and data-independent acquisition modes in liquid chromatography-mass spectrometry based untargeted metabolomics. Anal Chem 92:8072\u20138080. https:\/\/doi.org\/10.1021\/acs.analchem.9b05135","journal-title":"Anal Chem"},{"key":"586_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.trac.2018.11.022","volume":"120","author":"WJ Nash","year":"2019","unstructured":"Nash WJ, Dunn WB (2019) From mass to metabolite in human untargeted metabolomics: recent advances in annotation of metabolites applying liquid chromatography-mass spectrometry data. TrAC Trends Anal Chem 120:115324. https:\/\/doi.org\/10.1016\/j.trac.2018.11.022","journal-title":"TrAC Trends Anal Chem"},{"key":"586_CR13","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1016\/j.aca.2017.08.044","volume":"992","author":"Y Wang","year":"2017","unstructured":"Wang Y, Feng R, Wang R et al (2017) Enhanced MS\/MS coverage for metabolite identification in LC-MS-based untargeted metabolomics by target-directed data dependent acquisition with time-staggered precursor ion list. Anal Chim Acta 992:67\u201375. https:\/\/doi.org\/10.1016\/j.aca.2017.08.044","journal-title":"Anal Chim Acta"},{"key":"586_CR14","doi-asserted-by":"publisher","first-page":"908","DOI":"10.1007\/s13361-017-1608-0","volume":"28","author":"JP Koelmel","year":"2017","unstructured":"Koelmel JP, Kroeger NM, Gill EL et al (2017) Expanding lipidome coverage using LC-MS\/MS data-dependent acquisition with automated exclusion list generation. J Am Soc Mass Spectrom 28:908\u2013917. 
https:\/\/doi.org\/10.1007\/s13361-017-1608-0","journal-title":"J Am Soc Mass Spectrom"},{"key":"586_CR15","doi-asserted-by":"publisher","first-page":"126","DOI":"10.3390\/metabo10040126","volume":"10","author":"I Ten-Dom\u00e9nech","year":"2020","unstructured":"Ten-Dom\u00e9nech I, Mart\u00ednez-Sena T, Moreno-Torres M et al (2020) Comparing targeted vs. untargeted ms2 data-dependent acquisition for peak annotation in LC\u2013MS metabolomics. Metabolites 10:126. https:\/\/doi.org\/10.3390\/metabo10040126","journal-title":"Metabolites"},{"key":"586_CR16","doi-asserted-by":"publisher","first-page":"10397","DOI":"10.1021\/acs.analchem.7b02380","volume":"89","author":"NG Mahieu","year":"2017","unstructured":"Mahieu NG, Patti GJ (2017) Systems-level annotation of a metabolomics data set reduces 25,000 features to fewer than 1000 unique metabolites. Anal Chem 89:10397\u201310406. https:\/\/doi.org\/10.1021\/acs.analchem.7b02380","journal-title":"Anal Chem"},{"key":"586_CR17","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1016\/j.aca.2018.10.062","volume":"1050","author":"M Yu","year":"2019","unstructured":"Yu M, Olkowicz M, Pawliszyn J (2019) Structure\/reaction directed analysis for LC-MS based untargeted analysis. Anal Chim Acta 1050:16\u201324. https:\/\/doi.org\/10.1016\/j.aca.2018.10.062","journal-title":"Anal Chim Acta"},{"key":"586_CR18","doi-asserted-by":"publisher","first-page":"5050","DOI":"10.1021\/acs.analchem.5b00615","volume":"87","author":"P Luo","year":"2015","unstructured":"Luo P, Dai W, Yin P et al (2015) Multiple reaction monitoring-ion pair finder: a systematic approach to transform nontargeted mode to pseudotargeted mode for metabolomics study based on liquid chromatography\u2013mass spectrometry. Anal Chem 87:5050\u20135055. 
https:\/\/doi.org\/10.1021\/acs.analchem.5b00615","journal-title":"Anal Chem"},{"key":"586_CR19","doi-asserted-by":"publisher","first-page":"3793","DOI":"10.1021\/ac500878x","volume":"86","author":"Z Zeng","year":"2014","unstructured":"Zeng Z, Liu X, Dai W et al (2014) Ion fusion of high-resolution LC-MS-based metabolomics data to discover more reliable biomarkers. Anal Chem 86:3793\u20133800. https:\/\/doi.org\/10.1021\/ac500878x","journal-title":"Anal Chem"},{"key":"586_CR20","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1021\/ac202450g","volume":"84","author":"C Kuhl","year":"2012","unstructured":"Kuhl C, Tautenhahn R, B\u00f6ttcher C et al (2012) CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography\/mass spectrometry data sets. Anal Chem 84:283\u2013289. https:\/\/doi.org\/10.1021\/ac202450g","journal-title":"Anal Chem"},{"key":"586_CR21","doi-asserted-by":"publisher","first-page":"6812","DOI":"10.1021\/ac501530d","volume":"86","author":"CD Broeckling","year":"2014","unstructured":"Broeckling CD, Afsar FA, Neumann S et al (2014) RAMClust: a novel feature clustering method enables spectral-matching-based annotation for metabolomics data. Anal Chem 86:6812\u20136817. https:\/\/doi.org\/10.1021\/ac501530d","journal-title":"Anal Chem"},{"key":"586_CR22","doi-asserted-by":"publisher","DOI":"10.1021\/jacs.9b13198","author":"M Sindelar","year":"2020","unstructured":"Sindelar M, Patti GJ (2020) Chemical discovery in the era of metabolomics. J Am Chem Soc. https:\/\/doi.org\/10.1021\/jacs.9b13198","journal-title":"J Am Chem Soc"},{"key":"586_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.aca.2020.11.049","author":"P Liigand","year":"2020","unstructured":"Liigand P, Liigand J, Kaupmees K, Kruve A (2020) 30 years of research on ESI\/MS response: trends, contradictions and applications. Anal Chim Acta. 
https:\/\/doi.org\/10.1016\/j.aca.2020.11.049","journal-title":"Anal Chim Acta"},{"key":"586_CR24","doi-asserted-by":"publisher","first-page":"D440","DOI":"10.1093\/nar\/gkz1019","volume":"48","author":"K Haug","year":"2020","unstructured":"Haug K, Cochrane K, Nainala VC et al (2020) MetaboLights: a resource evolving in response to the needs of its scientific community. Nucleic Acids Res 48:D440\u2013D444. https:\/\/doi.org\/10.1093\/nar\/gkz1019","journal-title":"Nucleic Acids Res"},{"key":"586_CR25","unstructured":"The Metabolomics Workbench. https:\/\/www.metabolomicsworkbench.org\/. Accessed 10 Jan 2022"},{"issue":"362","key":"586_CR26","doi-asserted-by":"publisher","first-page":"362ps","DOI":"10.1126\/scitranslmed.aaf5027","volume":"8","author":"SN Goodman","year":"2016","unstructured":"Goodman SN, Fanelli D, Ioannidis JPA (2016) What does research reproducibility mean? Sci Transl Med 8(362):362ps. https:\/\/doi.org\/10.1126\/scitranslmed.aaf5027","journal-title":"Sci Transl Med"},{"key":"586_CR27","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0152686","volume":"11","author":"L-H Hung","year":"2016","unstructured":"Hung L-H, Kristiyanto D, Lee SB, Yeung KY (2016) GUIdock: using docker containers with a common graphics user interface to address the reproducibility of research. PLoS ONE 11:e0152686. https:\/\/doi.org\/10.1371\/journal.pone.0152686","journal-title":"PLoS ONE"},{"key":"586_CR28","doi-asserted-by":"publisher","DOI":"10.1201\/b15100","volume-title":"Reproducible research with R and R studio","author":"C Gandrud","year":"2013","unstructured":"Gandrud C (2013) Reproducible research with R and R studio. CRC Press, Boca Raton"},{"key":"586_CR29","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1145\/2723872.2723882","volume":"49","author":"C Boettiger","year":"2015","unstructured":"Boettiger C (2015) An introduction to Docker for reproducible research, with examples from the R environment. ACM SIGOPS Oper Syst Rev 49:71\u201379. 
https:\/\/doi.org\/10.1145\/2723872.2723882","journal-title":"ACM SIGOPS Oper Syst Rev"},{"key":"586_CR30","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1007\/978-3-642-04898-2_248","volume-title":"International Encyclopedia of Statistical Science","author":"JD Storey","year":"2011","unstructured":"Storey JD (2011) False discovery rate. In: Lovric M (ed) International Encyclopedia of Statistical Science. Springer, Berlin, pp 504\u2013508"},{"key":"586_CR31","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1007\/s11306-019-1564-8","volume":"15","author":"H-J Lee","year":"2019","unstructured":"Lee H-J, Kremer DM, Sajjakulnukit P et al (2019) A large-scale analysis of targeted metabolomics data from heterogeneous biological samples provides insights into metabolite dynamics. Metabolomics 15:103. https:\/\/doi.org\/10.1007\/s11306-019-1564-8","journal-title":"Metabolomics"},{"key":"586_CR32","first-page":"527","volume":"9","author":"C Boettiger","year":"2017","unstructured":"Boettiger C, Eddelbuettel D (2017) An introduction to rocker: docker containers for R. arXiv 9:527\u2013536","journal-title":"arXiv"},{"key":"586_CR33","volume-title":"RStudio: integrated development environment for R","author":"RStudio Team","year":"2020","unstructured":"RStudio Team (2020) RStudio: integrated development environment for R. RStudio PBC, Boston"},{"key":"586_CR34","unstructured":"Yu M (2018) Rocker image for metabolomics data analysis. https:\/\/github.com\/yufree\/xcmsrocker. Accessed 10 Jan 2022"},{"key":"586_CR35","volume-title":"R: a language and environment for statistical computing","author":"R Core Team","year":"2020","unstructured":"R Core Team (2020) R: a language and environment for statistical computing. 
R Foundation for Statistical Computing, Vienna"},{"key":"586_CR36","doi-asserted-by":"publisher","first-page":"779","DOI":"10.1021\/ac051437y","volume":"78","author":"CA Smith","year":"2006","unstructured":"Smith CA, Want EJ, O\u2019Maille G et al (2006) XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 78:779\u2013787. https:\/\/doi.org\/10.1021\/ac051437y","journal-title":"Anal Chem"},{"key":"586_CR37","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1186\/s12859-015-0562-8","volume":"16","author":"G Libiseller","year":"2015","unstructured":"Libiseller G, Dvorzak M, Kleb U et al (2015) IPO: a tool for automated optimization of XCMS parameters. BMC Bioinformatics 16:118. https:\/\/doi.org\/10.1186\/s12859-015-0562-8","journal-title":"BMC Bioinformatics"},{"key":"586_CR38","doi-asserted-by":"publisher","first-page":"918","DOI":"10.1038\/nbt.2377","volume":"30","author":"MC Chambers","year":"2012","unstructured":"Chambers MC, Maclean B, Burke R et al (2012) A cross-platform toolkit for mass spectrometry and proteomics. Nat Biotechnol 30:918\u2013920. https:\/\/doi.org\/10.1038\/nbt.2377","journal-title":"Nat Biotechnol"},{"key":"586_CR39","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1038\/nbt.3597","volume":"34","author":"M Wang","year":"2016","unstructured":"Wang M, Carver JJ, Phelan VV et al (2016) Sharing and community curation of mass spectrometry data with global natural products social molecular networking. Nat Biotechnol 34:828\u2013837. https:\/\/doi.org\/10.1038\/nbt.3597","journal-title":"Nat Biotechnol"},{"key":"586_CR40","unstructured":"Reproducilble Metabolomics WorkFlow. https:\/\/figshare.com\/projects\/Reproducilble_Metabolomics_WorkFlow\/59777. 
Accessed 10 Jan 2022"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00586-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-022-00586-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00586-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T08:10:31Z","timestamp":1644999031000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-022-00586-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,16]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["586"],"URL":"https:\/\/doi.org\/10.1186\/s13321-022-00586-8","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,16]]},"assertion":[{"value":"23 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 February 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 February 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"6"}}