{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,17]],"date-time":"2023-10-17T19:59:48Z","timestamp":1697572788962},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n <jats:sec>\n <jats:title>Background<\/jats:title>\n <jats:p>Microarray technology has made it possible to simultaneously measure the expression levels of large numbers of genes in a short time. Gene expression data is information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. Clustering is an important tool for finding groups of genes with similar expression patterns in microarray data analysis. However, hard clustering methods, which assign each gene exactly to one cluster, are poorly suited to the analysis of microarray datasets because in such datasets the clusters of genes frequently overlap.<\/jats:p>\n <\/jats:sec>\n <jats:sec>\n <jats:title>Results<\/jats:title>\n <jats:p>In this study we applied the fuzzy partitional clustering method known as Fuzzy C-Means (FCM) to overcome the limitations of hard clustering. To identify the effect of data normalization, we used three normalization methods, the two common scale and location transformations and Lowess normalization methods, to normalize three microarray datasets and three simulated datasets. First we determined the optimal parameters for FCM clustering. We found that the optimal fuzzification parameter in the FCM analysis of a microarray dataset depended on the normalization method applied to the dataset during preprocessing. We additionally evaluated the effect of normalization of noisy datasets on the results obtained when hard clustering or FCM clustering was applied to those datasets. 
The effects of normalization were evaluated using both simulated datasets and microarray datasets. A comparative analysis showed that the clustering results depended on the normalization method used and the noisiness of the data. In particular, the selection of the fuzzification parameter value for the FCM method was sensitive to the normalization method used for datasets with large variations across samples.<\/jats:p>\n <\/jats:sec>\n <jats:sec>\n <jats:title>Conclusion<\/jats:title>\n <jats:p>Lowess normalization is more robust for clustering of genes from general microarray data than the two common scale and location adjustment methods when samples have varying expression patterns or are noisy. In particular, the FCM method slightly outperformed the hard clustering methods when the expression patterns of genes overlapped and was advantageous in finding co-regulated genes. Thus, the FCM approach offers a convenient method for finding subsets of genes that are strongly associated to a given cluster.<\/jats:p>\n <\/jats:sec>","DOI":"10.1186\/1471-2105-7-134","type":"journal-article","created":{"date-parts":[[2006,4,20]],"date-time":"2006-04-20T14:23:53Z","timestamp":1145543033000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":39,"title":["Effect of data normalization on fuzzy clustering of DNA microarray data"],"prefix":"10.1186","volume":"7","author":[{"given":"Seo Young","family":"Kim","sequence":"first","affiliation":[]},{"given":"Jae Won","family":"Lee","sequence":"additional","affiliation":[]},{"given":"Jong Sung","family":"Bae","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2006,3,14]]},"reference":[{"key":"873_CR1","doi-asserted-by":"publisher","first-page":"2907","DOI":"10.1073\/pnas.96.6.2907","volume":"96","author":"P Tamayo","year":"1999","unstructured":"Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, Lander E, Golub T: Interpreting patterns of gene 
expression with self-organizing maps: Methods and application to hematopoietic differentiation. Proc Natl Acad Sci 1999, 96: 2907\u20132912. 10.1073\/pnas.96.6.2907","journal-title":"Proc Natl Acad Sci"},{"key":"873_CR2","doi-asserted-by":"publisher","first-page":"3273","DOI":"10.1091\/mbc.9.12.3273","volume":"9","author":"PT Spellman","year":"1998","unstructured":"Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998, 9: 3273\u20133279.","journal-title":"Mol Biol Cell"},{"key":"873_CR3","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1023\/A:1012801612483","volume":"17","author":"M Halkidi","year":"2001","unstructured":"Halkidi M, Batistakis Y, Vazirgiannis M: On clustering validation techniques. Journal of Intelligent Information Systems 2001, 17: 107\u2013145. 10.1023\/A:1012801612483","journal-title":"Journal of Intelligent Information Systems"},{"key":"873_CR4","doi-asserted-by":"publisher","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","volume":"95","author":"MB Eisen","year":"1998","unstructured":"Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences 1998, 95: 14863\u201314868. 10.1073\/pnas.95.25.14863","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"873_CR5","first-page":"1690","volume-title":"Bioinformatics Advance Access published February 26","author":"N Belacel","year":"2004","unstructured":"Belacel N, Cuperlovic-Culf M, Laflamme M, Ouellette R: Fuzzy J-Means and VNS methods for clustering genes from microarray data. 
Bioinformatics Advance Access published February 26 2004, 1690\u20131701."},{"key":"873_CR6","doi-asserted-by":"publisher","first-page":"117","DOI":"10.2307\/2986199","volume":"44","author":"BJT Morgan","year":"1985","unstructured":"Morgan BJT, Ray APG: Non-uniqueness and inversions in cluster analysis. Applied Statistics 1985, 44: 117\u2013134.","journal-title":"Applied Statistics"},{"key":"873_CR7","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316801","volume-title":"Finding groups in data: An introduction to cluster analysis","author":"L Kaufman","year":"1990","unstructured":"Kaufman L, Rousseeuw PJ: Finding groups in data: An introduction to cluster analysis. New York: John Wiley; 1990."},{"key":"873_CR8","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1038\/10343","volume":"22","author":"S Tavazoie","year":"1999","unstructured":"Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nat Genet 1999, 22: 281\u2013285. 10.1038\/10343","journal-title":"Nat Genet"},{"key":"873_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-0450-1","volume-title":"Pattern recognition with fuzzy objective function algorithms","author":"JC Bezdek","year":"1981","unstructured":"Bezdek JC: Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press; 1981."},{"key":"873_CR10","first-page":"170","volume-title":"Proceedings of European Symposium on Intelligent Techniques (EIST 2000)","author":"R Guthke","year":"2000","unstructured":"Guthke R, Schmidt-Heck W, Hahn D, Pfaff M: Gene expression data mining for functional genomics. In Proceedings of European Symposium on Intelligent Techniques (EIST 2000). 
Aachen, Germany; 2000:170\u2013177."},{"key":"873_CR11","doi-asserted-by":"publisher","first-page":"973","DOI":"10.1093\/bioinformatics\/btg119","volume":"19","author":"D Dembele","year":"2003","unstructured":"Dembele D, Kastner P: Fuzzy C-means method for clustering microarray data. Bioinformatics 2003, 19: 973\u2013980. 10.1093\/bioinformatics\/btg119","journal-title":"Bioinformatics"},{"key":"873_CR12","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1126\/science.283.5398.83","volume":"283","author":"VR Iyer","year":"1999","unstructured":"Iyer VR, Eisen MB, Ross DT, Schuler G, Moore T, Lee JCF, Trent JM, Staudt LM, Hudson JJ, Boguski MS, et al.: The transcriptional program in the response of human fibroblasts to serum. Science 1999, 283: 83\u201387. 10.1126\/science.283.5398.83","journal-title":"Science"},{"key":"873_CR13","unstructured":"Supplementary Webpage (Serum)[http:\/\/www-igbmc.u-strasbg.fr\/fcm\/]"},{"key":"873_CR14","doi-asserted-by":"publisher","first-page":"699","DOI":"10.1126\/science.282.5389.699","volume":"282","author":"S Chu","year":"1998","unstructured":"Chu S, DeRisi J, et al.: The transcriptional program of sporulation in budding yeast. Science 1998, 282: 699\u2013705. 10.1126\/science.282.5389.699","journal-title":"Science"},{"key":"873_CR15","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1016\/S1097-2765(00)80114-8","volume":"2","author":"RJ Cho","year":"1998","unstructured":"Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW: A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell 1998, 2: 65\u201373. 10.1016\/S1097-2765(00)80114-8","journal-title":"Mol Cell"},{"key":"873_CR16","first-page":"12","volume":"2","author":"SY Kim","year":"2005","unstructured":"Kim SY, Choi TM, Bae JS: Fuzzy types clustering for microarray data. 
International Journal of Computational Intelligence 2005, 2: 12\u201315.","journal-title":"International Journal of Computational Intelligence"},{"issue":"1","key":"873_CR17","doi-asserted-by":"publisher","first-page":"194","DOI":"10.1186\/1471-2105-5-194","volume":"5","author":"JA Berger","year":"2004","unstructured":"Berger JA, Hautaniemi S, Jarvinen AK, Edgren H, Mitra SK, Astola J: Optimized lowess normalization parameter selection for DNA microarray data. BMC Bioinformatics 2004, 5(1):194. 10.1186\/1471-2105-5-194","journal-title":"BMC Bioinformatics"},{"key":"873_CR18","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1186\/1471-2105-6-28","volume":"6","author":"Y Zhao","year":"2005","unstructured":"Zhao Y, Li MC, Simon R: An adaptive method for cDNA microarray normalization. BMC Bioinformatics 2005, 6: 28. 10.1186\/1471-2105-6-28","journal-title":"BMC Bioinformatics"},{"key":"873_CR19","doi-asserted-by":"publisher","first-page":"418","DOI":"10.1038\/35076576","volume":"2","author":"J Quackenbush","year":"2001","unstructured":"Quackenbush J: Computational analysis of microarray data. Nat Rev Genet 2001, 2: 418\u2013427. 10.1038\/35076576","journal-title":"Nat Rev Genet"},{"key":"873_CR20","doi-asserted-by":"publisher","first-page":"459","DOI":"10.1093\/bioinformatics\/btg025","volume":"19","author":"S Datta","year":"2003","unstructured":"Datta S, Datta S: Comparisons and validation of statistical clustering techniques for microarray gene expression. Bioinformatics 2003, 19: 459\u2013466. 
10.1093\/bioinformatics\/btg025","journal-title":"Bioinformatics"},{"key":"873_CR21","unstructured":"Supplementary Webpage (Sporulation)[http:\/\/cmgm.stanford.edu\/pbrown\/sporulation\/]"},{"key":"873_CR22","unstructured":"Supplementary Webpage (Yeast)[http:\/\/genome-www.stanford.edu\/cellcycle\/data\/rawdata\/]"},{"key":"873_CR23","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1117\/12.427982","volume-title":"Optical technologies and informatics","author":"YH Yang","year":"2001","unstructured":"Yang YH, Dudoit S, Luu P, Speed TP: Normalization for cDNA microarray data. In microarrays. In Optical technologies and informatics. Volume 42. Edited by: San Jose, CA, USA:SPIE. Bittner M, Chen Y, Dorsel A, Dougherty ER; 2001:141\u2013152."},{"key":"873_CR24","unstructured":"Yeung KY, Ruzzo WL: An empirical study on principal component analysis for clustering gene expression data. In Technical Report 2000 UW-CSE-00\u201311\u201301. Department of Computer Science and Engineering, University of Washington;"},{"key":"873_CR25","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert L, Arabie P: Comparing partitions. Journal of Classification 1985, 2: 193\u2013218. 
10.1007\/BF01908075","journal-title":"Journal of Classification"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-134.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:18:39Z","timestamp":1630466319000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-134"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,3,14]]},"references-count":25,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["873"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-134","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,3,14]]},"assertion":[{"value":"23 August 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 March 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 March 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"134"}}