{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,12]],"date-time":"2025-04-12T00:44:22Z","timestamp":1744418662058,"version":"3.37.3"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2017,3,24]],"date-time":"2017-03-24T00:00:00Z","timestamp":1490313600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000925","name":"John Templeton Foundation","doi-asserted-by":"publisher","award":["51977"],"id":[{"id":"10.13039\/100000925","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,9,1]]},"abstract":"Abstract\n\nMotivation\nPatient stratification or disease subtyping is crucial for precision medicine and personalized treatment of complex diseases. The increasing availability of high-throughput molecular data provides a great opportunity for patient stratification. Many clustering methods have been employed to tackle this problem in a purely data-driven manner. Yet, existing methods leveraging high-throughput molecular data often suffers from various limitations, e.g. noise, data heterogeneity, high dimensionality or poor interpretability.\n\nResults\nHere we introduced an Entropy-based Consensus Clustering (ECC) method that overcomes those limitations all together. Our ECC method employs an entropy-based utility function to fuse many basic partitions to a consensus one that agrees with the basic ones as much as possible. Maximizing the utility function in ECC has a much more meaningful interpretation than any other consensus clustering methods. 
Moreover, we exactly map the complex utility maximization problem to the classic K-means clustering problem, which can then be efficiently solved with linear time and space complexity. Our ECC method can also naturally integrate multiple molecular data types measured from the same set of subjects, and easily handle missing values without any imputation. We applied ECC to 110 synthetic and 48 real datasets, including 35 cancer gene expression benchmark datasets and 13 cancer types with four molecular data types from The Cancer Genome Atlas. We found that ECC shows superior performance against existing clustering methods. Our results clearly demonstrate the power of ECC in clinically relevant patient stratification.\n\nAvailability and implementation\nThe Matlab package is available at http:\/\/scholar.harvard.edu\/yyl\/ecc.\n\nSupplementary information\nSupplementary data are available at Bioinformatics online.","DOI":"10.1093\/bioinformatics\/btx167","type":"journal-article","created":{"date-parts":[[2017,3,23]],"date-time":"2017-03-23T04:47:43Z","timestamp":1490244463000},"page":"2691-2698","source":"Crossref","is-referenced-by-count":78,"title":["Entropy-based consensus clustering for patient stratification"],"prefix":"10.1093","volume":"33","author":[{"given":"Hongfu","family":"Liu","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA"}]},{"given":"Rui","family":"Zhao","sequence":"additional","affiliation":[{"name":"Channing Division of Network Medicine, Brigham and Women\u2019s Hospital, Harvard Medical School, Boston, MA, USA"},{"name":"School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA"}]},{"given":"Hongsheng","family":"Fang","sequence":"additional","affiliation":[{"name":"Channing Division of Network Medicine, Brigham and Women\u2019s 
Hospital, Harvard Medical School, Boston, MA, USA"},{"name":"Department of Statistics, Stanford University, Stanford, CA, USA"}]},{"given":"Feixiong","family":"Cheng","sequence":"additional","affiliation":[{"name":"Center for Complex Network Research and Department of Physics, Northeastern University, Boston, MA, USA"},{"name":"Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, MA, USA"}]},{"given":"Yun","family":"Fu","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA"},{"name":"College of Computer and Information Science, Northeastern University, Boston, MA, USA"}]},{"given":"Yang-Yu","family":"Liu","sequence":"additional","affiliation":[{"name":"Channing Division of Network Medicine, Brigham and Women\u2019s Hospital, Harvard Medical School, Boston, MA, USA"},{"name":"Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, MA, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,3,24]]},"reference":[{"key":"2023020206261892900_btx167-B1","doi-asserted-by":"crossref","first-page":"4006","DOI":"10.1038\/ncomms5006","article-title":"Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach","volume":"4","author":"Aerts","year":"2014","journal-title":"Nat. Commun"},{"key":"2023020206261892900_btx167-B2","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1038\/nm.3984","article-title":"Pan-cancer analysis of the extent and consequences of intratumor heterogeneity","volume":"22","author":"Andor","year":"2016","journal-title":"Nat. Med"},{"key":"2023020206261892900_btx167-B3","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1038\/nrclinonc.2015.123","article-title":"Precision medicine for metastatic breast cancer\u2014limitations and solutions","volume":"12","author":"Arnedos","year":"2015","journal-title":"Nat. Rev. Clin. 
Oncol"},{"key":"2023020206261892900_btx167-B4","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1038\/nature15819","article-title":"Patient-centric trials for therapeutic development in precision oncology","volume":"526","author":"Biankin","year":"2015","journal-title":"Nature"},{"key":"2023020206261892900_btx167-B5","doi-asserted-by":"crossref","first-page":"5394","DOI":"10.1073\/pnas.1601591113","article-title":"Big data visualization identifies the multidimensional molecular landscape of human gliomas","volume":"113","author":"Bolouri","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020206261892900_btx167-B6","doi-asserted-by":"crossref","first-page":"3738","DOI":"10.1073\/pnas.0409462102","article-title":"Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival","volume":"102","author":"Chang","year":"2005","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020206261892900_btx167-B7","doi-asserted-by":"crossref","first-page":"12253","DOI":"10.1073\/pnas.1304376110","article-title":"Biclustering with heterogeneous variance","volume":"110","author":"Chen","year":"2013","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023020206261892900_btx167-B8","doi-asserted-by":"crossref","first-page":"819","DOI":"10.1126\/science.1231143","article-title":"Multiplex genome engineering using crispr\/cas systems","volume":"339","author":"Cong","year":"2013","journal-title":"Science"},{"key":"2023020206261892900_btx167-B9","first-page":"497","article-title":"Clustering cancer gene expression data: a comparative study","volume":"9","author":"de Souto","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020206261892900_btx167-B10","doi-asserted-by":"crossref","first-page":"1102","DOI":"10.1038\/nbt.2749","article-title":"Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data","volume":"31","author":"Denny","year":"2013","journal-title":"Nat. Biotechnol"},{"key":"2023020206261892900_btx167-B11","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1109\/TPAMI.2005.113","article-title":"Combining multiple clusterings using evidence accumulation","volume":"27","author":"Fred","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell"},{"key":"2023020206261892900_btx167-B12","first-page":"57","volume-title":"International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics","author":"Galdi","year":"2014"},{"key":"2023020206261892900_btx167-B13","doi-asserted-by":"crossref","first-page":"938","DOI":"10.1038\/nm.3909","article-title":"The prognostic landscape of genes and infiltrating immune cells across human cancers","volume":"21","author":"Gentles","year":"2015","journal-title":"Nat. 
Med"},{"key":"2023020206261892900_btx167-B14","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1093\/bioinformatics\/btq226","article-title":"Lce: a link-based cluster ensemble method for improved gene expression data analysis","volume":"26","author":"Iam-On","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020206261892900_btx167-B15","first-page":"E5486\u2013E54","article-title":"Comprehensive assessment of cancer missense mutation clustering in protein structures","volume":"12","author":"Kamburov","year":"2015","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020206261892900_btx167-B16","doi-asserted-by":"crossref","first-page":"811\u201381","DOI":"10.1073\/pnas.0304146101","article-title":"Gene expression profiling identifies clinically relevant subtypes of prostate cancer","volume":"101","author":"Lapointe","year":"2004","journal-title":"Proc. Natl. Acad. Sci. USA"},{"year":"2015","author":"Liu","key":"2023020206261892900_btx167-B17"},{"year":"2015","author":"Liu","key":"2023020206261892900_btx167-B18"},{"year":"2016","author":"Liu","key":"2023020206261892900_btx167-B27"},{"key":"2023020206261892900_btx167-B19","doi-asserted-by":"crossref","first-page":"2263","DOI":"10.1093\/bioinformatics\/btr373","article-title":"Genenetweaver: in silico benchmark generation and performance profiling of network inference methods","volume":"27","author":"Schaffter","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020206261892900_btx167-B20","first-page":"583","article-title":"Cluster ensembles\u2014a knowledge reuse framework for combining partitions","volume":"3","author":"Strehl","year":"2002","journal-title":"J. Mach. Learn. 
Res"},{"year":"2003","author":"Topchy","key":"2023020206261892900_btx167-B21"},{"key":"2023020206261892900_btx167-B22","doi-asserted-by":"crossref","first-page":"862","DOI":"10.15252\/msb.20155865","article-title":"Transcriptomics resources of human tissues and organs","volume":"12","author":"Uhlen","year":"2016","journal-title":"Mol. Syst. Biol"},{"year":"2009","author":"Wu","key":"2023020206261892900_btx167-B23"},{"key":"2023020206261892900_btx167-B24","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1109\/TKDE.2014.2316512","article-title":"K-means-based consensus clustering: a unified view","volume":"27","author":"Wu","year":"2015","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023020206261892900_btx167-B25","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1038\/nature13438","article-title":"Proteogenomic characterization of human colon and rectal cancer","volume":"513","author":"Zhang","year":"2014","journal-title":"Nature"},{"key":"2023020206261892900_btx167-B26","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1038\/nmeth.3249","article-title":"Targeted exploration and analysis of large cross-platform human transcriptomic compendia","volume":"12","author":"Zhu","year":"2015","journal-title":"Nat. 
Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/17\/2691\/49040637\/bioinformatics_33_17_2691.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/17\/2691\/49040637\/bioinformatics_33_17_2691.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T06:28:37Z","timestamp":1675319317000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/17\/2691\/3089941"}},"subtitle":[],"editor":[{"given":"Ziv","family":"Bar-Joseph","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,3,24]]},"references-count":27,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2017,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx167","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/073189","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,9,1]]},"published":{"date-parts":[[2017,3,24]]}}}