Recommendations for utilizing and reporting population genetic analyses: the reproducibility of genetic clustering using the program STRUCTURE
- PMID: 22998190
- DOI: 10.1111/j.1365-294X.2012.05754.x
Abstract
Reproducibility is the benchmark for results and conclusions drawn from scientific studies, but systematic studies on the reproducibility of scientific results are surprisingly rare. Moreover, many modern statistical methods make use of 'random walk' model fitting procedures, and these are inherently stochastic in their output. Does the combination of these statistical procedures and current standards of data archiving and method reporting permit the reproduction of the authors' results? To test this, we reanalysed data sets gathered from papers using the software package STRUCTURE to identify genetically similar clusters of individuals. We find that reproducing STRUCTURE results can be difficult despite the straightforward requirements of the program. Our results indicate that in 30% of analyses we were unable to reproduce the same number of population clusters. To improve this, we make recommendations for future use of the software and for reporting STRUCTURE analyses and results in published works.
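To illustrate why 'random walk' fitting procedures are stochastic in their output, the following is a minimal sketch of a generic random-walk Metropolis sampler. It is not STRUCTURE's algorithm and is not taken from the paper; the target distribution (a standard normal), the function names, and the parameters `seed`, `n_steps` and `step_size` are all assumptions chosen for illustration. It shows only that runs with different random seeds give different estimates, while runs with the same seed repeat exactly.

```python
import math
import random


def random_walk_metropolis(seed, n_steps=2000, step_size=1.0):
    """Sample a standard normal target with a random-walk Metropolis chain.

    Hypothetical illustration only: this is a generic sampler, not the
    model-fitting procedure used by STRUCTURE.
    """
    rng = random.Random(seed)  # private generator, so the seed fixes the whole run
    x = 0.0
    samples = []
    for _ in range(n_steps):
        # Propose a random step away from the current position.
        proposal = x + rng.gauss(0.0, step_size)
        # Log ratio of target densities (standard normal) for proposal vs current.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        # Accept with probability min(1, density ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


def chain_mean(seed):
    """Return the sample mean of one chain, as a simple summary of a run."""
    samples = random_walk_metropolis(seed)
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Same seed twice, then a different seed.
    print(chain_mean(1), chain_mean(1), chain_mean(2))
```

In this sketch, a run can be repeated exactly only if the seed and the other run settings are known, which is one reason incomplete method reporting can hinder reproduction of stochastic analyses.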
© 2012 Blackwell Publishing Ltd.