Computational cluster validation in post-genomic data analysis

doi:10.1093/bioinformatics/bti517

Review

Bioinformatics. 2005 Aug 1;21(15):3201-12.

doi: 10.1093/bioinformatics/bti517. Epub 2005 May 24.

Julia Handl¹, Joshua Knowles, Douglas B Kell

PMID: 15914541
DOI: 10.1093/bioinformatics/bti517

Julia Handl et al. Bioinformatics. 2005.

Affiliation

¹ School of Chemistry, University of Manchester, Faraday Building, Sackville Street, PO Box 88, Manchester M60 1QD, UK. J.Handl@postgrad.manchester.ac.uk

Abstract

Motivation: The discovery of novel biological knowledge from the ab initio analysis of post-genomic data relies upon the use of unsupervised processing methods, in particular clustering techniques. Much recent research in bioinformatics has therefore focused on the transfer of clustering methods introduced in other scientific fields and on the development of novel algorithms specifically designed to tackle the challenges posed by post-genomic data. The partitions returned by a clustering algorithm are commonly validated using visual inspection and concordance with prior biological knowledge; whether the clusters actually correspond to the real structure in the data is somewhat less frequently considered. Suitable computational cluster validation techniques are available in the general data-mining literature, but have been given only a fraction of the same attention in bioinformatics.

Results: This review paper aims to familiarize the reader with the battery of techniques available for the validation of clustering results, with a particular focus on their application to post-genomic data analysis. Synthetic and real biological datasets are used to demonstrate the benefits, and also some of the perils, of analytical cluster validation.

Availability: The software used in the experiments is available at http://dbkweb.ch.umist.ac.uk/handl/clustervalidation/.

Supplementary information: Enlarged colour plots are provided in the Supplementary Material, which is available at http://dbkweb.ch.umist.ac.uk/handl/clustervalidation/.
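
Illustrative sketch (not part of the original abstract, and not the authors' software): the abstract contrasts checking clusters against prior knowledge with asking whether a partition reflects real structure in the data. One widely used internal measure of the latter kind is the mean silhouette width. The choice of this measure, the function name `silhouette_width`, the Euclidean distance and the toy data are all assumptions made for this example.

```python
import math


def silhouette_width(points, labels):
    """Mean silhouette width of a partition, using Euclidean distance.

    Hypothetical helper for illustration only. Values near 1 suggest
    compact, well-separated clusters; values near 0 or below suggest
    the partition does not match the structure in the data.
    """
    # Group point indices by cluster label.
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the members of a cluster, excluding i.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i in range(len(points)):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton cluster: silhouette taken as 0 by convention
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(mean_dist(i, m)  # separation: nearest other cluster
                for lab, m in clusters.items() if lab != labels[i])
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (assumed): two tight, well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]  # partition matching the visible groups
bad = [0, 1, 0, 1, 0, 1]   # partition cutting across the groups

print(silhouette_width(points, good))
print(silhouette_width(points, bad))
```

On this toy data the partition that follows the two groups scores much higher than the one that cuts across them. The score uses no external labels at all, which is the sense in which such a measure is "internal".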

Cited by

VAE-Sim: A Novel Molecular Similarity Measure Based on a Variational Autoencoder.
Samanta S, O'Hagan S, Swainston N, Roberts TJ, Kell DB. Molecules. 2020 Jul 29;25(15):3446. doi: 10.3390/molecules25153446. PMID: 32751155. Free PMC article.
A highly efficient multi-core algorithm for clustering extremely large datasets.
Kraus JM, Kestler HA. BMC Bioinformatics. 2010 Apr 6;11:169. doi: 10.1186/1471-2105-11-169. PMID: 20370922. Free PMC article.
Creating functional groups of marine fish from categorical traits.
Ladds MA, Sibanda N, Arnold R, Dunn MR. PeerJ. 2018 Oct 23;6:e5795. doi: 10.7717/peerj.5795. eCollection 2018. PMID: 30370185. Free PMC article.
Statistical power for cluster analysis.
Dalmaijer ES, Nord CL, Astle DE. BMC Bioinformatics. 2022 May 31;23(1):205. doi: 10.1186/s12859-022-04675-1. PMID: 35641905. Free PMC article.
Face detection in untrained deep neural networks.
Baek S, Song M, Jang J, Kim G, Paik SB. Nat Commun. 2021 Dec 16;12(1):7328. doi: 10.1038/s41467-021-27606-9. PMID: 34916514. Free PMC article.

Grants and funding

E19354/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
