A Hybrid Approach to Feature Ranking for Microarray Data Classification

  • Conference paper
Engineering Applications of Neural Networks (EANN 2013)

Abstract

We present a novel approach to multivariate feature ranking in the context of microarray data classification that combines a simple genetic algorithm with random forest feature importance measures. We demonstrate the performance of the algorithm by comparing it against three popular feature ranking and selection methods on a colon cancer recurrence prediction problem. In addition, we investigate the biological relevance of the selected features and find functional associations between the corresponding genes and cancer.
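
The abstract does not spell out how the genetic algorithm and the importance measure interact, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It evolves small feature subsets, scores each subset by summing per-feature importances, and ranks features by the importance they accumulate over all evaluations. All names (`ga_feature_ranking`, `mean_gap_importance`) and all parameter values are assumptions made for this example. To stay dependency-free, the importance function is a simple class-mean gap used as a stand-in; in practice it would be replaced by random forest importances computed on the candidate subset.

```python
import random
from statistics import mean


def mean_gap_importance(X, y, features):
    """Stand-in importance (NOT random forest): absolute gap between the
    class-1 and class-0 means of each feature in `features`."""
    scores = {}
    for j in features:
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores[j] = abs(mean(pos) - mean(neg))
    return scores


def ga_feature_ranking(X, y, subset_size=3, pop_size=20, generations=15,
                       mutation_rate=0.2, importance=mean_gap_importance,
                       seed=0):
    """Rank features by importance accumulated while a simple GA searches
    over fixed-size feature subsets (hypothetical design, see lead-in)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    population = [rng.sample(range(n_features), subset_size)
                  for _ in range(pop_size)]
    totals = [0.0] * n_features

    for _ in range(generations):
        scored = []
        for subset in population:
            imp = importance(X, y, subset)
            scored.append((sum(imp.values()), subset))  # subset fitness
            for j, s in imp.items():
                totals[j] += s  # accumulate per-feature evidence

        # Selection: keep the fitter half of the population as parents.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [subset for _, subset in scored[:pop_size // 2]]

        # Crossover: draw a child from the union of two parents' features.
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = rng.sample(list(set(a) | set(b)), subset_size)
            # Mutation: swap one gene for a feature not yet in the child.
            if rng.random() < mutation_rate:
                unused = [j for j in range(n_features) if j not in child]
                child[rng.randrange(subset_size)] = rng.choice(unused)
            children.append(child)
        population = parents + children

    return sorted(range(n_features), key=lambda j: totals[j], reverse=True)


# Toy data (synthetic, for illustration only): 40 samples, 10 features,
# where features 0 and 1 carry the class signal and the rest are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[5 * label + data_rng.uniform(-1, 1) if j < 2
      else data_rng.uniform(-1, 1) for j in range(10)]
     for label in y]

ranking = ga_feature_ranking(X, y)
print(ranking[:2])  # indices of the two top-ranked features
```

Passing the importance function as a parameter keeps the search loop independent of the scoring model, so a random forest based scorer with the same signature could be dropped in without changing the genetic algorithm itself.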

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Popovic, D., Sifrim, A., Moschopoulos, C., Moreau, Y., De Moor, B. (2013). A Hybrid Approach to Feature Ranking for Microarray Data Classification. In: Iliadis, L., Papadopoulos, H., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2013. Communications in Computer and Information Science, vol 384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41016-1_26

  • DOI: https://doi.org/10.1007/978-3-642-41016-1_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41015-4

  • Online ISBN: 978-3-642-41016-1

  • eBook Packages: Computer Science, Computer Science (R0)
