Abstract
A fundamental step in understanding the microbiome and its functions is detecting which microbes are present in a set of samples; the next is comparing samples and finding similar ones based on their community compositions. A pervasive method for accomplishing these steps is clustering. However, clustering offers many choices of algorithm, parameters, and distance or similarity metric, and these choices produce different outcomes, which makes results hard to interpret. The study presented here examines the stability of clusters across various beta diversity metrics applied to human microbiome samples. We explored the effects of 24 different diversity metrics on clustering outcomes and their impact on the accuracy of clustering microbiome samples. To overcome the ambiguous results of individual clusterings that rely on distinct beta diversity metrics, we employed two ensemble approaches to integrate the results of the individual clusterings. The results obtained on human microbiome data imply that ensemble clustering approaches produce stable results in reconstructing clusters that correspond to different hosts and body habitats.
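The idea of integrating clusterings obtained under different beta diversity metrics can be illustrated with a small sketch. It is not the authors' method: the abstract does not name the 24 metrics, the base clustering algorithm, or the two ensemble approaches, so the code below assumes three common dissimilarities (Bray-Curtis, Jaccard, Euclidean), a simple average-linkage clustering, and a co-association consensus as a stand-in. The toy abundance table and all function names are hypothetical.

```python
from itertools import combinations

def bray_curtis(u, v):
    # Abundance-based dissimilarity: sum |a - b| / sum (a + b).
    den = sum(a + b for a, b in zip(u, v))
    return sum(abs(a - b) for a, b in zip(u, v)) / den if den else 0.0

def jaccard(u, v):
    # Presence/absence dissimilarity: 1 - |shared taxa| / |all observed taxa|.
    a = {i for i, x in enumerate(u) if x > 0}
    b = {i for i, x in enumerate(v) if x > 0}
    return 1.0 - len(a & b) / len(a | b) if (a | b) else 0.0

def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def distance_matrix(samples, metric):
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = metric(samples[i], samples[j])
    return d

def average_linkage(dist, k):
    """Agglomerative clustering on a precomputed distance matrix; returns labels."""
    clusters = [[i] for i in range(len(dist))]

    def mean_dist(pair):
        ca, cb = clusters[pair[0]], clusters[pair[1]]
        return sum(dist[a][b] for a in ca for b in cb) / (len(ca) * len(cb))

    while len(clusters) > k:
        # Merge the two clusters with the smallest mean pairwise distance.
        i, j = min(combinations(range(len(clusters)), 2), key=mean_dist)
        clusters[i] += clusters.pop(j)  # i < j, so index i stays valid
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels

def co_association(labelings):
    # Fraction of base clusterings that place samples i and j together.
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
             for j in range(n)] for i in range(n)]

def ensemble_cluster(samples, metrics, k):
    # One base clustering per metric, then cluster the consensus distances.
    labelings = [average_linkage(distance_matrix(samples, m), k) for m in metrics]
    co = co_association(labelings)
    consensus_dist = [[1.0 - x for x in row] for row in co]
    return average_linkage(consensus_dist, k), labelings

# Hypothetical toy table: 6 samples x 4 taxa, two habitats of 3 samples each.
samples = [
    [30, 10, 0, 0], [28, 12, 1, 0], [25, 15, 0, 0],
    [0, 0, 20, 22], [0, 2, 18, 25], [1, 0, 24, 20],
]
consensus, base_labelings = ensemble_cluster(
    samples, [bray_curtis, jaccard, euclidean], k=2
)
print(consensus)
```

On real data the base clusterings typically disagree, and the co-association matrix then records how often each pair of samples is grouped together, so the consensus partition depends less on any single metric.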
Acknowledgements
This work was partly supported by the Serbian Ministry of Education and Science (Project III 44006).
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Brdar, S., Crnojević, V. (2017). Ensemble Approaches for Stable Assessment of Clusters in Microbiome Samples. In: Bracciali, A., Caravagna, G., Gilbert, D., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2016. Lecture Notes in Computer Science, vol 10477. Springer, Cham. https://doi.org/10.1007/978-3-319-67834-4_16
DOI: https://doi.org/10.1007/978-3-319-67834-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67833-7
Online ISBN: 978-3-319-67834-4
eBook Packages: Computer Science, Computer Science (R0)