Abstract
To understand the role genes play in complex diseases such as cancer, they need to be investigated either in a gene interaction network or in DNA microarray gene expression data. Prioritized genes help reveal the underlying molecular mechanism and identify promising candidate genes for cancer. Several gene ranking algorithms have already been proposed that produce the top-ranked genes according to their importance with respect to a particular disease. In this work, we have developed a Genetic Algorithm (GA) based method, MicroarrayGA, to rank the genes responsible for the occurrence of a particular cancer. The study uses six datasets, namely Colorectal Cancer, Diffuse Large B-Cell Lymphoma, Pediatric Immune Thrombocytopenia (ITP), Small Cell Lung Cancer (SCLC), Breast Cancer and Prostate Cancer, all publicly available from the NCBI (National Center for Biotechnology Information) online repository. We have validated the output of the proposed algorithm through a classification step using a Support Vector Machine (SVM) classifier, and we have compared the results of MicroarrayGA with three existing methods on the basis of percentage accuracy, precision, recall, F1-Score and G-Mean.
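The five evaluation metrics named above can be illustrated with a short sketch. It assumes a binary task (for example, tumor versus normal samples) and the standard confusion-matrix definitions, with G-Mean taken as the geometric mean of recall and specificity. The abstract does not give the chapter's exact formulas, so the function name `binary_metrics` and these definitions are assumptions made for illustration, not the authors' implementation.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, F1 and G-Mean for binary labels.

    Hypothetical helper: uses the standard textbook definitions, which may
    differ in detail from those used in the chapter.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)        # sensitivity
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * recall, precision + recall)
    g_mean = math.sqrt(recall * specificity)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


# Toy labels (made up for illustration; not data from the chapter).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

The values are returned as fractions in [0, 1]; multiplying accuracy by 100 gives the percentage form mentioned in the abstract. G-Mean is useful alongside accuracy because microarray datasets are often class-imbalanced, and it drops to zero when either class is entirely misclassified.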
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Saha, S., Das, P., Ghosh, A., Dey, K.N. (2018). Ranking of Cancer Mediating Genes: A Novel Approach Using Genetic Algorithm in DNA Microarray Gene Expression Dataset. In: Singh, M., Gupta, P., Tyagi, V., Flusser, J., Ören, T. (eds) Advances in Computing and Data Sciences. ICACDS 2018. Communications in Computer and Information Science, vol 906. Springer, Singapore. https://doi.org/10.1007/978-981-13-1813-9_13
DOI: https://doi.org/10.1007/978-981-13-1813-9_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1812-2
Online ISBN: 978-981-13-1813-9
eBook Packages: Computer Science, Computer Science (R0)