Abstract
Deep evolutionary network structured representation (DENSER) is a novel evolutionary approach for the automatic generation of deep neural networks (DNNs) that combines the principles of genetic algorithms (GAs) with those of dynamic structured grammatical evolution (DSGE). The GA level encodes the macro structure of the networks under evolution, i.e., the layers and, among others, the learning and data augmentation methods; the DSGE level specifies the parameters of each GA evolutionary unit and their valid ranges. The use of a grammar makes DENSER a general-purpose framework for generating DNNs: to deal with different network and layer types, different problems, or different parameter ranges, one only needs to adapt the grammar. DENSER is tested on the automatic generation of convolutional neural networks (CNNs) for the CIFAR-10 dataset, with the best-performing networks reaching accuracies of up to 95.22%. Furthermore, we take the fittest networks evolved on CIFAR-10 and apply them to classify MNIST, Fashion-MNIST, SVHN, Rectangles, and CIFAR-100. The results show that the DNNs discovered by DENSER during evolution generalise, are robust, and scale. The most impressive result is a 78.75% classification accuracy on CIFAR-100, which, to the best of our knowledge, sets a new state of the art among methods that seek to automatically design CNNs.
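To make the two-level representation concrete, the following is a minimal, self-contained sketch of a DENSER-style genotype sampler. The grammar fragment, unit names, and parameter ranges are illustrative assumptions for exposition, not the grammar used in the paper.

import random

# Illustrative two-level genotype (names and ranges are assumptions, not
# the paper's grammar): the GA level is an ordered sequence of evolutionary
# units (the macro structure); the DSGE level stores, per unit, parameter
# values drawn from the ranges the grammar declares.
GRAMMAR = {
    "convolution": {"num-filters": ("int", 32, 256),
                    "filter-shape": ("int", 2, 5),
                    "activation": ("cat", ["relu", "sigmoid"])},
    "pooling": {"kernel": ("int", 2, 5),
                "pool-type": ("cat", ["max", "avg"])},
    "learning": {"algorithm": ("cat", ["adam", "rmsprop"]),
                 "lr": ("float", 1e-4, 1e-1)},
}

def sample_unit(unit):
    """DSGE level: sample every parameter of a unit within its valid range."""
    params = {}
    for name, (kind, *domain) in GRAMMAR[unit].items():
        if kind == "int":
            params[name] = random.randint(domain[0], domain[1])
        elif kind == "float":
            params[name] = random.uniform(domain[0], domain[1])
        else:  # categorical choice
            params[name] = random.choice(domain[0])
    return {"unit": unit, "params": params}

def random_genotype(min_layers=2, max_layers=6):
    """GA level: an ordered list of layer units followed by the learning setup."""
    layers = [sample_unit(random.choice(["convolution", "pooling"]))
              for _ in range(random.randint(min_layers, max_layers))]
    return layers + [sample_unit("learning")]

if __name__ == "__main__":
    for unit in random_genotype():
        print(unit)

Under a representation of this kind, editing the grammar alone changes the search space: adding a unit type or widening a range requires no change to the evolutionary engine, which is the generality the abstract claims.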









Notes
Grid search methods that adapt the resolution of the grid at run-time.
To compute the maximum number of expansions of each non-terminal symbol, the grammar must be pre-processed; for further details see Section 3.1 of [37]. A minimal sketch of this pre-processing step is given after these notes.
Recall that, in the context of this work, an evolutionary unit can be either a layer or any set of arguments related to, for example, learning or data augmentation.
For more information about the MNIST variants see http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Public/MnistVariations.
More information regarding the Rectangles dataset can be found at http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Public/RectanglesData.
An invalid shape in this scenario occurs when the downsampling generates a negative size for the output of the layer; a sketch of this check is also given after these notes.
It is later shown that this network is among the highest performing in terms of test accuracy, and thus it generalises well.
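Regarding the pre-processing mentioned in the second note, the following is a minimal sketch, under the assumption of a non-recursive grammar, of how an upper bound on the number of expansions of each non-terminal can be computed; the toy grammar and function are ours for illustration, and the full procedure is the one described in Section 3.1 of [37].

# Toy non-recursive grammar (illustrative): non-terminal -> list of
# productions, each production a list of symbols; non-terminals are the
# keys of the dict, everything else is treated as a terminal.
TOY_GRAMMAR = {
    "<start>": [["<conv>", "<conv>", "<dense>"], ["<dense>"]],
    "<conv>": [["<activation>"]],
    "<dense>": [["<activation>"], []],
    "<activation>": [[]],
}

def max_expansions(grammar, symbol="<start>"):
    """Upper bound on how often each non-terminal can be expanded when
    deriving from `symbol`: one expansion for the symbol itself, plus the
    worst case over the symbol's productions for every other non-terminal.
    Assumes the grammar is non-recursive."""
    counts = {nt: 0 for nt in grammar}
    counts[symbol] = 1
    per_production = []
    for production in grammar[symbol]:
        prod_counts = {nt: 0 for nt in grammar}
        for sym in production:
            if sym in grammar:  # ignore terminals
                sub = max_expansions(grammar, sym)
                for nt in grammar:
                    prod_counts[nt] += sub[nt]
        per_production.append(prod_counts)
    for nt in grammar:
        counts[nt] += max((p[nt] for p in per_production), default=0)
    return counts

print(max_expansions(TOY_GRAMMAR))
# {'<start>': 1, '<conv>': 2, '<dense>': 1, '<activation>': 3}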
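As for the invalid shapes referred to in the penultimate note, the check amounts to propagating the spatial size through the downsampling layers. A minimal sketch follows; the helper names are ours, and it assumes 'valid' (no) padding.

def output_size(in_size, kernel, stride):
    """Spatial output size of a convolution/pooling layer with no padding."""
    return (in_size - kernel) // stride + 1

def is_valid_stack(in_size, layers):
    """Reject a stack whose downsampling drives the output size to zero or
    below, i.e., the invalid shapes described in the note above."""
    size = in_size
    for kernel, stride in layers:
        size = output_size(size, kernel, stride)
        if size <= 0:
            return False
    return True

# 32x32 CIFAR-10 input: three 2x2, stride-2 poolings are fine
# (32 -> 16 -> 8 -> 4), but six in a row are not.
assert is_valid_stack(32, [(2, 2)] * 3)
assert not is_valid_stack(32, [(2, 2)] * 6)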
References
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, TensorFlow: a system for large-scale machine learning. OSDI 16, 265–283 (2016)
F. Ahmadizar, K. Soltanian, F. AkhlaghianTab, I. Tsoulos, Artificial neural network development by means of a novel combination of grammatical evolution and genetic algorithm. Eng. Appl. Artif. Intell. 39, 1–13 (2015)
F. Assunção, N. Lourenço, P. Machado, B. Ribeiro, Evolving the topology of large scale deep neural networks, in European Conference on Genetic Programming. Springer, pp. 19–34 (2018)
F. Assunção, N. Lourenço, P. Machado, B. Ribeiro, Towards the evolution of multi-layered neural networks: A dynamic structured grammatical evolution approach, in Proceedings of the Genetic and Evolutionary Computation Conference, GECCO’17. ACM, New York, NY, USA, pp. 393–400 (2017). https://doi.org/10.1145/3071178.3071286
T. Bäck, H.P. Schwefel, An overview of evolutionary algorithms for parameter optimization. Evol. Comput. 1(1), 1–23 (1993)
B. Baker, O. Gupta, N. Naik, R. Raskar, Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167 (2016)
A. Baldominos, Y. Saez, P. Isasi, Evolutionary design of convolutional neural networks for human activity recognition in sensor-rich environments. Sensors 18(4) (2018)
J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
J.S. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyper-parameter optimization, in Advances in Neural Information Processing Systems, pp. 2546–2554 (2011)
C.M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics) (Springer, Berlin, Heidelberg, 2006)
F. Chollet, et al., Keras. https://keras.io (2015)
Z. Chunhong, J. Licheng, Automatic parameters selection for SVM based on GA, in Fifth World Congress on Intelligent Control and Automation (WCICA 2004), vol. 2. IEEE, pp. 1869–1872 (2004)
K.B. Duan, S.S. Keerthi, Which is the best multiclass SVM method? An empirical study, in International Workshop on Multiple Classifier Systems. Springer, pp. 278–285 (2005)
C. Fernando, D. Banarse, C. Blundell, Y. Zwols, D. Ha, A.A. Rusu, A. Pritzel, D. Wierstra, Pathnet: Evolution channels gradient descent in super neural networks. arXiv preprint arXiv:1701.08734 (2017)
M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, F. Hutter, Efficient and robust automated machine learning, in Advances in Neural Information Processing Systems, pp. 2962–2970 (2015)
D. Floreano, P. Dürr, C. Mattiussi, Neuroevolution: from architectures to learning. Evol. Intell. 1(1), 47–62 (2008)
F. Gomez, J. Schmidhuber, R. Miikkulainen, Accelerated neural evolution through cooperatively coevolved synapses. J. Mach. Learn. Res. 9(May), 937–965 (2008)
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, vol. 1 (MIT Press, Cambridge, 2016)
I.J. Goodfellow, Y. Bulatov, J. Ibarz, S. Arnoud, V. Shet, Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv preprint arXiv:1312.6082 (2013)
B. Graham, Fractional max-pooling. arXiv preprint arXiv:1412.6071 (2014)
I. Guyon, K. Bennett, G. Cawley, H.J. Escalante, S. Escalera, T.K. Ho, N. Macia, B. Ray, M. Saeed, A. Statnikov, et al., Design of the 2015 ChaLearn AutoML challenge, in 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, pp. 1–8 (2015)
I. Guyon, I. Chaabane, H.J. Escalante, S. Escalera, D. Jajetic, J.R. Lloyd, N. Macià, B. Ray, L. Romaszko, M. Sebag, et al., A brief review of the ChaLearn AutoML challenge: any-time any-dataset learning without human intervention, in Workshop on Automatic Machine Learning, pp. 21–30 (2016)
N. Hansen, A. Ostermeier, Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)
S.A. Harp, T. Samad, A. Guha, Designing application-specific neural networks using the genetic algorithm, in Advances in Neural Information Processing Systems, pp. 447–454 (1990)
K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Á.B. Jiménez, J.L. Lázaro, J.R. Dorronsoro, Finding optimal model parameters by deterministic and annealed focused grid search. Neurocomputing 72(13–15), 2824–2832 (2009)
J.Y. Jung, J.A. Reggia, Evolutionary design of neural network architectures using a descriptive encoding language. IEEE Trans. Evol. Comput. 10(6), 676–688 (2006)
J.D. Kelly Jr., L. Davis, A hybrid genetic algorithm for classification, in Proceedings of the 12th International Joint Conference on Artificial Intelligence, Sydney, Australia, ed. by J. Mylopoulos, R. Reiter (Morgan Kaufmann, San Francisco, 1991), pp. 645–650. http://ijcai.org/Proceedings/91-2/Papers/006.pdf
D. Khritonenko, V. Stanovov, E. Semenkin, Applying an instance selection method to an evolutionary neural classifier design, in IOP Conference Series: Materials Science and Engineering, vol. 173. IOP Publishing, p. 012007 (2017)
H. Kitano, Designing neural networks using genetic algorithms with graph generation system. Complex Syst. 4(4), 461–476 (1990)
B. Komer, J. Bergstra, C. Eliasmith, Hyperopt-sklearn: automatic hyperparameter configuration for scikit-learn, in ICML Workshop on AutoML (2014)
A. Krizhevsky, G. Hinton, Learning multiple layers of features from tiny images (2009)
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
F.H.F. Leung, H.K. Lam, S.H. Ling, P.K.S. Tam, Tuning of the structure and parameters of a neural network using an improved genetic algorithm. IEEE Trans. Neural Netw. 14(1), 79–88 (2003)
I. Loshchilov, F. Hutter, CMA-ES for hyperparameter optimization of deep neural networks. arXiv preprint arXiv:1604.07269 (2016)
N. Lourenço, F. Assunção, F.B. Pereira, E. Costa, P. Machado, Structured grammatical evolution: a dynamic approach, in Handbook of Grammatical Evolution, ed. by C. Ryan, M. O’Neill, J. Collins (Springer, Berlin, 2018). https://doi.org/10.1007/978-3-319-78717-6
N. Lourenço, F.B. Pereira, E. Costa, Unveiling the properties of structured grammatical evolution. Genet. Program. Evol. Mach. 17(3), 251–289 (2016)
R. Miikkulainen, J. Liang, E. Meyerson, A. Rawal, D. Fink, O. Francon, B. Raju, A. Navruzyan, N. Duffy, B. Hodjat, Evolving deep neural networks. arXiv preprint arXiv:1703.00548 (2017)
J.F. Miller, Cartesian genetic programming, in Cartesian Genetic Programming. Springer, pp. 17–34 (2011)
J. Močkus, On Bayesian methods for seeking the extremum, in Optimization Techniques IFIP Technical Conference. Springer, pp. 400–404 (1975)
D.E. Moriarty, R. Miikkulainen, Forming neural networks through efficient and adaptive coevolution. Evol. Comput. 5(4), 373–399 (1997)
G. Morse, K.O. Stanley, Simple evolutionary optimization can rival stochastic gradient descent in neural networks, in Proceedings of the 2016 on Genetic and Evolutionary Computation Conference. ACM, pp. 477–484 (2016)
Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, A.Y. Ng, Reading digits in natural images with unsupervised feature learning, in NIPS Workshop on Deep Learning and Unsupervised Feature Learning, vol. 2011, p. 5 (2011)
M. O’Neill, C. Ryan, Grammatical evolution, in Grammatical Evolution. Springer, pp. 33–47 (2003)
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
A. Radi, R. Poli, Discovering efficient learning rules for feedforward neural networks using genetic programming, in Recent Advances in Intelligent Paradigms and Applications. Springer, pp. 133–159 (2003)
E. Real, S. Moore, A. Selle, S. Saxena, Y.L. Suematsu, Q. Le, A. Kurakin, Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041 (2017)
M. Rocha, P. Cortez, J. Neves, Evolution of neural networks for classification and regression. Neurocomputing 70(16), 2809–2816 (2007)
B. Schuller, S. Reiter, G. Rigoll, Evolutionary feature generation in speech emotion recognition, in 2006 IEEE International Conference on Multimedia and Expo. IEEE, pp. 5–8 (2006)
P. Sermanet, S. Chintala, Y. LeCun, Convolutional neural networks applied to house numbers digit classification, in 2012 21st International Conference on Pattern Recognition (ICPR). IEEE, pp. 3288–3291 (2012)
P.Y. Simard, D. Steinkraus, J.C. Platt, Best practices for convolutional neural networks applied to visual document analysis. ICDAR 3, 958–962 (2003)
K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
J. Snoek, H. Larochelle, R.P. Adams, Practical Bayesian optimization of machine learning algorithms, in Advances in Neural Information Processing Systems, pp. 2951–2959 (2012)
J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, M. Patwary, M. Prabhat, R. Adams, Scalable Bayesian optimization using deep neural networks, in International Conference on Machine Learning, pp. 2171–2180 (2015)
K. Soltanian, F.A. Tab, F.A. Zar, I. Tsoulos, Artificial neural networks generation using grammatical evolution, in 2013 21st Iranian Conference on Electrical Engineering (ICEE). IEEE, pp. 1–5 (2013)
K.O. Stanley, D.B. D’Ambrosio, J. Gauci, A hypercube-based encoding for evolving large-scale neural networks. Artif. Life 15(2), 185–212 (2009)
K.O. Stanley, R. Miikkulainen, Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
V. Stanovov, E. Semenkin, O. Semenkina, Instance selection approach for self-configuring hybrid fuzzy evolutionary algorithm for imbalanced datasets, in International Conference in Swarm Intelligence. Springer, pp. 451–459 (2015)
M. Suganuma, S. Shirakawa, T. Nagao, A genetic programming approach to designing convolutional neural network architectures, in Proceedings of the Genetic and Evolutionary Computation Conference, GECCO’17. ACM, New York, NY, USA, pp. 497–504 (2017). https://doi.org/10.1145/3071178.3071229
C. Thornton, F. Hutter, H.H. Hoos, K. Leyton-Brown, Auto-weka: Combined selection and hyperparameter optimization of classification algorithms, in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp. 847–855 (2013)
A.J. Turner, J.F. Miller, Cartesian genetic programming encoded artificial neural networks: a comparison using three benchmarks, in Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation. ACM, pp. 1005–1012 (2013)
P. Verbancsics, J. Harguess, Image classification using generative neuro evolution for deep learning, in 2015 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, pp. 488–493 (2015)
D. Whitley, T. Starkweather, C. Bogart, Genetic algorithms and neural networks: optimizing connections and connectivity. Parallel Comput. 14(3), 347–361 (1990)
I.H. Witten, E. Frank, M.A. Hall, C.J. Pal, Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann, San Francisco, 2016)
H. Xiao, K. Rasul, R. Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
B. Xue, M. Zhang, W.N. Browne, X. Yao, A survey on evolutionary computation approaches to feature selection. IEEE Trans. Evol. Comput. 20(4), 606–626 (2016)
X. Yao, Evolving artificial neural networks. Proc. IEEE 87(9), 1423–1447 (1999)
J. Yu, B. Bhanu, Evolutionary feature synthesis for facial expression recognition. Pattern Recognit. Lett. 27(11), 1289–1298 (2006)
Acknowledgements
This work is partially funded by Fundação para a Ciência e Tecnologia (FCT), Portugal, under grant SFRH/BD/114865/2016. We would also like to thank NVIDIA for providing us with Titan X GPUs.
Cite this article
Assunção, F., Lourenço, N., Machado, P. et al. DENSER: deep evolutionary network structured representation. Genet Program Evolvable Mach 20, 5–35 (2019). https://doi.org/10.1007/s10710-018-9339-y