Abstract
Microarray technology makes it possible to measure the expression levels of thousands of genes simultaneously. One of the challenges in classifying cancer from high-dimensional gene expression data is selecting a minimal set of relevant genes that maximizes classification accuracy. Because cancerous gene expression profiles have distinct characteristics, selecting the most informative cancer-related genes from high-volume microarray data is an important and challenging research topic in bioinformatics. In this paper, a set of important genes is first identified from a statistically computed rank, and rough set theory is then applied to the reduced gene set to select genes with high class-discrimination capability. The method constructs a relative discernibility matrix to find the core genes, which are essential for distinguishing normal samples from tumor samples, and then iteratively adds high-ranked noncore genes to the core genes, one at a time, to maximize classification accuracy. The method is applied to several well-known cancer datasets to demonstrate its effectiveness.
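The three steps described above (statistical ranking, core extraction from a relative discernibility matrix, and greedy addition of noncore genes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the ranking statistic or the classifier, so a signal-to-noise score and a rough-set consistency measure are used here as stand-ins, and all function names and the toy data are hypothetical. The toy matrix `D` is assumed to be already discretized and, for brevity, is also used for ranking.

```python
from itertools import combinations
from statistics import mean, pstdev


def rank_genes(X, y):
    """Order gene indices by a signal-to-noise score, best first.

    Assumption: |mean0 - mean1| / (sd0 + sd1) stands in for the
    unspecified statistical rank; y holds the class labels 0 and 1.
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, c in zip(X, y) if c == 0]
        b = [row[g] for row, c in zip(X, y) if c == 1]
        spread = pstdev(a) + pstdev(b) + 1e-9  # avoid division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def core_genes(D, y, genes):
    """Genes that alone discern some pair of samples from different classes."""
    core = set()
    for i, j in combinations(range(len(D)), 2):
        if y[i] == y[j]:
            continue  # the relative matrix only compares across classes
        entry = [g for g in genes if D[i][g] != D[j][g]]
        if len(entry) == 1:  # a singleton entry marks a core gene
            core.add(entry[0])
    return core


def consistency(D, y, genes):
    """Fraction of samples whose value pattern on `genes` maps to one class.

    Assumption: this proxy replaces the classification accuracy of a
    real classifier, which the abstract does not specify.
    """
    classes = {}
    for row, c in zip(D, y):
        classes.setdefault(tuple(row[g] for g in genes), set()).add(c)
    pure = sum(len(classes[tuple(row[g] for g in genes)]) == 1 for row in D)
    return pure / len(D)


def select_genes(D, y, ranked, score=consistency):
    """Start from the core, then add ranked noncore genes that raise the score."""
    core = core_genes(D, y, ranked)
    selected = [g for g in ranked if g in core]
    best = score(D, y, selected)
    for g in ranked:
        if g in core:
            continue
        trial = score(D, y, selected + [g])
        if trial > best:  # keep the gene only if it helps
            selected.append(g)
            best = trial
    return selected, best


# Toy discretized data: 6 samples x 4 genes; 0 = normal, 1 = tumor.
D = [[0, 0, 1, 0], [0, 1, 1, 1], [1, 1, 0, 0],
     [1, 0, 1, 1], [1, 1, 0, 1], [1, 0, 0, 0]]
y = [0, 0, 0, 1, 1, 1]

ranked = rank_genes(D, y)
selected, quality = select_genes(D, y, ranked)
print(selected, quality)
```

The `score` parameter is left pluggable so that a cross-validated classifier accuracy could be substituted for the consistency proxy; which classifier the paper uses is not stated in the abstract.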
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Das, A.K., Pati, S.K. (2012). Gene Subset Selection for Cancer Classification Using Statsitical and Rough Set Approach. In: Panigrahi, B.K., Das, S., Suganthan, P.N., Nanda, P.K. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2012. Lecture Notes in Computer Science, vol 7677. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35380-2_35
DOI: https://doi.org/10.1007/978-3-642-35380-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35379-6
Online ISBN: 978-3-642-35380-2
eBook Packages: Computer Science, Computer Science (R0)