Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images
- PMID: 22166797
- DOI: 10.1016/j.neuroimage.2011.11.066
Abstract
A growing number of studies use machine learning to characterize patterns of anatomical difference discernible from neuroimaging data. The high dimensionality of image data often raises the concern that feature selection is needed to obtain optimal accuracy. Among previous studies, most of which used fixed sample sizes, some show greater predictive accuracy with feature selection, whereas others do not. In this study, we compared four common feature selection methods: 1) pre-selected regions of interest (ROIs) based on prior knowledge; 2) univariate t-test filtering; 3) recursive feature elimination (RFE); and 4) t-test filtering constrained by ROIs. The predictive accuracies achieved at different sample sizes, with and without feature selection, were compared statistically. To demonstrate the effect, we used grey matter segmented from the T1-weighted anatomical scans collected by the Alzheimer's Disease Neuroimaging Initiative (ADNI) as the input features to a linear support vector machine classifier. The objective was to characterize the patterns of difference between Alzheimer's disease (AD) patients and cognitively normal subjects, and between mild cognitive impairment (MCI) patients and normal subjects. We also compared classification accuracies between MCI patients who converted to AD and MCI patients who did not convert within a period of 12 months. Predictive accuracies from the two data-driven feature selection methods (t-test filtering and RFE) were no better than those achieved using whole-brain data. The most accurate characterizations were achieved by using prior knowledge of where to expect neurodegeneration (hippocampus and parahippocampal gyrus). Feature selection can therefore improve classification accuracy, but the benefit depends on the method adopted. In general, larger sample sizes yielded higher accuracies, and the advantage gained from using knowledge from the existing literature became smaller.
Copyright © 2011 Elsevier Inc. All rights reserved.
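As an illustration of two of the methods named in the abstract (univariate t-test filtering, and t-test filtering constrained by ROIs), the following is a minimal sketch on synthetic data. It is not the authors' implementation. The pooled-variance form of the t statistic, the function names `t_statistic` and `t_test_filter`, the synthetic "training fold", and the hypothetical ROI index set are all assumptions made for this example. The sketch also assumes the common practice of ranking features on training data only, so that held-out subjects do not influence which features are kept.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two lists of values."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return 0.0
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def t_test_filter(X, y, k, roi=None):
    """Return the indices of the k features with the largest |t|.

    X   : list of samples, each a list of feature values
    y   : list of 0/1 class labels
    roi : optional set of allowed feature indices (hypothetical ROI mask);
          if given, only those features are ranked
    """
    candidates = range(len(X[0])) if roi is None else sorted(roi)
    scores = {}
    for j in candidates:
        group0 = [x[j] for x, label in zip(X, y) if label == 0]
        group1 = [x[j] for x, label in zip(X, y) if label == 1]
        scores[j] = abs(t_statistic(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic training fold: 40 subjects, 50 features.
# Only features 0-4 differ between the two groups.
rng = random.Random(0)
y_train = [0] * 20 + [1] * 20
X_train = [
    [rng.gauss(3.0 if (label == 1 and j < 5) else 0.0, 1.0) for j in range(50)]
    for label in y_train
]

# Data-driven filtering over all features.
whole_brain = t_test_filter(X_train, y_train, k=5)

# The same filter, restricted to a hypothetical ROI index set.
roi_only = t_test_filter(X_train, y_train, k=3, roi={0, 1, 2, 10, 11, 12})

print(sorted(whole_brain), sorted(roi_only))
```

In a full comparison, the selected indices would be used to subset both the training and the test features before fitting the classifier, and the selection step would be repeated inside every cross-validation fold.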
Comment in
- The utility of data-driven feature selection: re: Chu et al. 2012. Neuroimage. 2014 Jan 1;84:1107-10. doi: 10.1016/j.neuroimage.2013.07.050. Epub 2013 Jul 25. PMID: 23891886. Free PMC article.