Abstract
Evolutionary algorithms such as genetic programming and grammatical evolution have been used for simultaneously optimizing network architecture, variable selection, and weights for artificial neural networks. Using an evolutionary algorithm to perform variable selection while searching for non-linear interactions is akin to searching for a needle in a haystack. There is, however, a considerable amount of correlation among variables in biological datasets, such as in microarray or genetic studies. Using the XOR problem, we show that correlation between non-functional and functional variables alters the variable selection fitness landscape by broadening the fitness peak over a wider range of potential input variables. Furthermore, when sub-optimal weights are used, local optima in the variable selection fitness landscape appear centered on each of the two functional variables. These attributes of the fitness landscape may supply building blocks for evolutionary search procedures, and may provide a rationale for conducting a local search for variable selection.
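The effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' actual experimental setup: the neural network is replaced by an ideal XOR function (standing in for a network with optimal weights), and correlation is simulated with a simple neighbour-copying scheme in which each variable copies the variable to its left with probability `rho`. The names `simulate`, `fitness`, `landscape`, `rho`, and `block_size` are all hypothetical and chosen for illustration only.

```python
import random


def simulate(n_samples, block_size, rho, rng):
    """Simulate binary data in two independent blocks of variables.

    Assumed correlation model (illustrative only): within a block, each
    variable copies its left neighbour with probability rho and is
    otherwise a fresh random bit, so correlation decays with distance.
    """
    rows = []
    for _ in range(n_samples):
        row = []
        for _block in range(2):
            bit = rng.randint(0, 1)
            row.append(bit)
            for _ in range(block_size - 1):
                if rng.random() >= rho:
                    bit = rng.randint(0, 1)
                row.append(bit)
        rows.append(row)
    return rows


def fitness(rows, labels, i, j):
    """Classification accuracy of an ideal XOR model fed variables i and j."""
    hits = sum((row[i] ^ row[j]) == y for row, y in zip(rows, labels))
    return hits / len(rows)


def landscape(rho, seed=0, n_samples=2000, block_size=7):
    """One slice of the variable selection fitness landscape.

    The second input is held at the second functional variable while the
    first input is varied across every variable in the first block.
    """
    rng = random.Random(seed)
    f1 = block_size // 2               # functional variable in block 1
    f2 = block_size + block_size // 2  # functional variable in block 2
    rows = simulate(n_samples, block_size, rho, rng)
    labels = [row[f1] ^ row[f2] for row in rows]
    return [fitness(rows, labels, i, f2) for i in range(block_size)]


if __name__ == "__main__":
    for rho in (0.0, 0.9):
        print(f"rho={rho}:", [round(v, 2) for v in landscape(rho)])
```

With `rho = 0.0` the slice should show a single sharp peak at the functional variable, with every other variable near chance accuracy; with a high `rho` the variables near the functional one should also score well above chance, which is the broadened peak the abstract refers to. Holding one input fixed keeps the example short; a full landscape would evaluate every pair of variables.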
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Turner, S.D., Ritchie, M.D., Bush, W.S. (2009). Conquering the Needle-in-a-Haystack: How Correlated Input Variables Beneficially Alter the Fitness Landscape for Neural Networks. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2009. Lecture Notes in Computer Science, vol 5483. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01184-9_8
DOI: https://doi.org/10.1007/978-3-642-01184-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01183-2
Online ISBN: 978-3-642-01184-9
eBook Packages: Computer Science, Computer Science (R0)