{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,16]],"date-time":"2025-01-16T05:14:20Z","timestamp":1737004460868,"version":"3.33.0"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"8","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,4,15]]},"abstract":"Motivation: The nearest shrunken centroid (NSC) method has been successfully applied in many DNA-microarray classification problems. The NSC uses \u2018shrunken\u2019 centroids as prototypes for each class and identifies subsets of genes that best characterize each class. Classification is then made to the nearest (shrunken) centroid. The NSC is very easy to implement and very easy to interpret; however, it has drawbacks. Results: We show that the NSC method can be interpreted in the framework of LASSO regression. Based on that, we consider two new methods, adaptive L\u221e-norm penalized NSC (ALP-NSC) and adaptive hierarchically penalized NSC (AHP-NSC), with two different penalty functions for microarray classification, which improve over the NSC. Unlike the L1-norm penalty used in LASSO, the penalty terms that we consider make use of the fact that parameters belonging to one gene should be treated as a natural group. 
Numerical results indicate that the two new methods tend to remove irrelevant genes more effectively and provide better classification results than the L1-norm approach. Availability: R code for the ALP-NSC and the AHP-NSC algorithms is available from the authors upon request. Contact: jizhu@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.","DOI":"10.1093\/bioinformatics\/btm046","type":"journal-article","created":{"date-parts":[[2007,3,24]],"date-time":"2007-03-24T23:57:42Z","timestamp":1174780662000},"page":"972-979","source":"Crossref","is-referenced-by-count":36,"title":["Improved centroids estimation for the nearest shrunken centroid classifier"],"prefix":"10.1093","volume":"23","author":[{"given":"Sijian","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Biostatistics and Department of Statistics, University of Michigan, Ann Arbor, MI, 48109, USA"}]},{"given":"Ji","family":"Zhu","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Department of Statistics, University of Michigan, Ann Arbor, MI, 48109, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,3,24]]},"reference":[{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/35000501","article-title":"Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling","volume":"403","author":"Alizadeh","year":"2000","journal-title":"Nature"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"989","DOI":"10.3150\/bj\/1106314847","article-title":"Some theory for Fisher's linear discriminant function, \u201cnaive Bayes\u201d, and some alternatives when there are many more variables than 
observations","volume":"10","author":"Bickel","year":"2004","journal-title":"Bernoulli"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1080\/00401706.1995.10484371","article-title":"Better subset regression using the non-negative garrote","volume":"37","author":"Breiman","year":"1995","journal-title":"Technometrics"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"4148","DOI":"10.1093\/bioinformatics\/bti681","article-title":"Classification of microarrays to nearest centroids","volume":"21","author":"Dabney","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2001-2-1-research0003","article-title":"Supervised harvesting of expression trees","volume":"2","author":"Hastie","year":"2001","journal-title":"Genome Biol"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/89044","article-title":"Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks","volume":"7","author":"Khan","year":"2001","journal-title":"Nat. Med"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1198\/016214505000000781","article-title":"Multicategory psi-learning","volume":"101","author":"Liu","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2023041107552217100_","article-title":"Distance weighted discrimination","volume-title":"Technical Report.","author":"Marron","year":"2002"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1093\/bioinformatics\/18.4.546","article-title":"A comparative review of statistical methods for discovering differently expressed genes in replicated microarray experiments","volume":"18","author":"Pan","year":"2002","journal-title":"Bioinformatics"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1198\/016214502753479356","article-title":"Adaptive model selection","volume":"97","author":"Shen","year":"2002","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"2635","DOI":"10.1093\/bioinformatics\/btl442","article-title":"Eigengene-based linear discriminant model for tumor classification using gene expression microarray data","volume":"22","author":"Shen","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. B"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1214\/ss\/1056397488","article-title":"Class prediction by nearest shrunken centroids, with application to DNA microarrays","volume":"18","author":"Tibshirani","year":"2003","journal-title":"Stat. 
Sci"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1093\/bioinformatics\/bti827","article-title":"Differential gene expression detection and sample classification using penalized linear regression models","volume":"22","author":"Wu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1093\/bioinformatics\/bti736","article-title":"Gene selection using support vector machines with non-convex penalty","volume":"22","author":"Zhang","year":"2006","journal-title":"Bioinformatics"},{"article-title":"Variable selection for multicategory SVM via sup-norm regularization","year":"2006","author":"Zhang","key":"2023041107552217100_"},{"key":"2023041107552217100_","article-title":"Grouped and hierarchical model selection through composite absolute penalties","volume-title":"Technical Report.","author":"Zhao","year":"2006"},{"key":"2023041107552217100_","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1198\/016214506000000735","article-title":"The adaptive lasso and its oracle properties","volume":"101","author":"Zou","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2023041107552217100_","article-title":"The F\u221e-norm support vector machine","author":"Zou","year":"2007","journal-title":"Stat. 
Sin"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/8\/972\/49823404\/bioinformatics_23_8_972.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/8\/972\/49823404\/bioinformatics_23_8_972.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,15]],"date-time":"2025-01-15T06:12:47Z","timestamp":1736921567000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/8\/972\/198185"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,3,24]]},"references-count":23,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2007,4,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm046","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2007,4,15]]},"published":{"date-parts":[[2007,3,24]]}}}