Abstract
In recent decades, data have grown exponentially in both the number of samples and the number of features, which makes feature selection (FS) more challenging. This paper employs a multimodal optimization (MMO) technique, whose main contribution is to provide multiple optimal solutions instead of a single one. By exploiting the hidden information in the data and building an ensemble of classifiers, the information carried by the multiple solutions that MMO provides is used to address FS from microarray data. After the data are pre-processed, optimal feature subsets are obtained by a firefly-based MMO algorithm, with mutual information serving as the fitness function for evaluating each candidate subset. Each feature subset returned by the MMO algorithm is then used to train a classifier, and these classifiers form an ensemble. A particle swarm optimization algorithm is used to select a proper combination for the ensemble. Finally, the method is evaluated on 11 microarray datasets in terms of cancer diagnosis. The results indicate that the multimodal FS method performs well and outperforms the other methods compared.
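As a rough illustration of two ingredients named above, the following sketch scores a feature subset by mutual information with the class labels and combines classifier outputs by voting. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact fitness formula, so `subset_fitness` (mean per-feature mutual information) is a hypothetical stand-in, and `majority_vote` is a simple plurality rule used in place of the particle swarm optimization step that the paper uses to choose the combination. The toy data and all function names are invented for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def subset_fitness(columns, labels, subset):
    """Hypothetical fitness: mean MI between each selected feature and the labels."""
    if not subset:
        return 0.0
    return sum(mutual_information(columns[j], labels) for j in subset) / len(subset)


def majority_vote(predictions):
    """Combine predictions (one list per classifier) by plurality vote per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy discretized expression data: one list per feature (gene), four samples.
columns = [
    [0, 0, 1, 1],  # feature 0 matches the labels exactly
    [0, 1, 0, 1],  # feature 1 is independent of the labels
]
labels = [0, 0, 1, 1]

fitness_informative = subset_fitness(columns, labels, [0])
fitness_uninformative = subset_fitness(columns, labels, [1])

# Three hypothetical classifiers, each trained on a different feature subset,
# predicting labels for three test samples.
ensemble_labels = majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
```

In a multimodal setting, several distinct subsets with comparably high fitness would each be kept and used to train one ensemble member, rather than retaining only the single best subset.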
Change history
21 June 2024
A Correction to this paper has been published: https://doi.org/10.1007/s11063-024-11549-5
Author information
Contributions
All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Nekouie, N., Romoozi, M. & Esmaeili, M. A New Evolutionary Ensemble Learning of Multimodal Feature Selection from Microarray Data. Neural Process Lett 55, 6753–6780 (2023). https://doi.org/10.1007/s11063-023-11159-7
DOI: https://doi.org/10.1007/s11063-023-11159-7