Abstract
Taking inspiration from biological evolution, we explore the question, "Can deep neural networks evolve naturally over successive generations into highly efficient deep neural networks?" We introduce the notion of synthesizing new, highly efficient yet powerful deep neural networks over successive generations via an evolutionary process from ancestor deep neural networks. The architectural traits of ancestor networks are encoded using synaptic probability models, which can be viewed as the 'DNA' of these networks. New descendant networks with differing architectures are synthesized stochastically from the ancestors' synaptic probability models together with computational environmental factor models, mimicking heredity, natural selection, and random mutation. These offspring networks are then trained into fully functional networks, much as one would train a newborn, and possess more efficient and more diverse architectures than their ancestors while retaining powerful modeling capabilities. Experimental results on the task of visual saliency demonstrate that the synthesized 'evolved' offspring networks achieve state-of-the-art performance with significantly more efficient architectures (a staggering approximately 48-fold decrease in synapses by the fourth generation) compared to the original ancestor network.
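The synthesis step described above can be sketched in code. The following is a minimal, illustrative Python sketch, not the authors' implementation: it assumes a synaptic probability model in which the chance of a synapse being inherited is proportional to the magnitude of the ancestor's weight, scaled by a single scalar environmental factor that pressures offspring toward sparsity. The function name, the magnitude-based probability model, and the `env_factor` parameter are all assumptions made for illustration.

```python
import numpy as np

def synthesize_offspring(ancestor_weights, env_factor=0.5, rng=None):
    """Sample an offspring synapse mask from a synaptic probability model.

    Each synapse's inheritance probability is proportional to the
    magnitude of the ancestor's weight (strong synapses are more likely
    to survive), scaled by an environmental factor that encourages the
    offspring to have fewer synapses.
    """
    rng = np.random.default_rng(rng)
    magnitude = np.abs(ancestor_weights)
    # Synaptic probability model: normalize magnitudes to [0, 1].
    prob = magnitude / magnitude.max()
    # Environmental factor < 1 exerts selective pressure toward sparsity.
    prob = prob * env_factor
    # Random sampling mimics heredity with random mutation.
    mask = rng.random(prob.shape) < prob
    # The offspring inherits only the sampled synapses; in the paper's
    # scheme it would then be trained into a fully functional network.
    return ancestor_weights * mask, mask

# Toy 4x4 ancestor weight matrix.
ancestor = np.array([[0.90, -0.10, 0.00, 0.70],
                     [0.05, 0.80, -0.60, 0.00],
                     [0.00, 0.30, 0.95, -0.20],
                     [0.40, 0.00, -0.02, 0.85]])
offspring, mask = synthesize_offspring(ancestor, env_factor=0.5, rng=0)
print(mask.sum(), "synapses inherited of", (ancestor != 0).sum())
```

Repeating this sampling over successive generations, with each offspring retrained and then serving as the next ancestor, is what yields the compounding reduction in synapses reported in the abstract.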




Shafiee, M.J., Mishra, A. & Wong, A. Deep Learning with Darwin: Evolutionary Synthesis of Deep Neural Networks. Neural Process Lett 48, 603–613 (2018). https://doi.org/10.1007/s11063-017-9733-0