Abstract
A greater demand for accuracy and performance in neural networks has led to deeper networks with large numbers of parameters, for which overfitting is a major problem. Dropout is a popular regularization strategy used in deep neural networks to mitigate overfitting. However, dropout requires a hyperparameter to be chosen for every dropout layer, which becomes tedious when the network has several such layers. In this paper, we introduce a method that samples the dropout rate from an automatically determined distribution. We further build on this automatic selection by clustering the activations and adaptively applying a different rate to each cluster. We evaluate both approaches on the CIFAR-10, CIFAR-100, and Fashion-MNIST datasets with two state-of-the-art Wide ResNet variants as well as a simpler network, and show that our methods outperform standard dropout across all datasets and networks.
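The abstract describes two ideas: sampling the dropout rate from a distribution instead of fixing it per layer, and clustering activations so that each cluster receives its own sampled rate. The following is a minimal PyTorch sketch of both ideas, not the authors' implementation; the Beta distribution and its parameters, the number of clusters, and the channel-mean grouping criterion are illustrative assumptions.

```python
# Minimal sketch, assuming PyTorch. SampledRateDropout draws a drop probability
# from Beta(alpha, beta) at each training step; ClusteredDropout groups channels
# by mean activation and samples a separate rate per group. All hyperparameters
# here are placeholders, not values from the paper.
import torch
import torch.nn as nn


class SampledRateDropout(nn.Module):
    """Dropout whose rate is sampled from Beta(alpha, beta) every forward pass."""

    def __init__(self, alpha=2.0, beta=5.0):
        super().__init__()
        self.rate_dist = torch.distributions.Beta(alpha, beta)

    def forward(self, x):
        if not self.training:
            return x
        p = self.rate_dist.sample().item()            # sampled drop probability
        keep = (torch.rand_like(x) > p).float()       # Bernoulli keep mask
        return x * keep / (1.0 - p)                   # inverted-dropout scaling


class ClusteredDropout(nn.Module):
    """Groups channels by mean activation and applies a separately sampled
    dropout rate to each group (an illustrative stand-in for clustering)."""

    def __init__(self, n_clusters=3, alpha=2.0, beta=5.0):
        super().__init__()
        self.n_clusters = n_clusters
        self.rate_dist = torch.distributions.Beta(alpha, beta)

    def forward(self, x):                             # x: (N, C, H, W)
        if not self.training:
            return x
        channel_means = x.mean(dim=(0, 2, 3))         # one statistic per channel
        order = torch.argsort(channel_means)          # sort channels by activity
        keep = torch.ones_like(x)
        scale = torch.ones_like(x)
        for idx in torch.chunk(order, self.n_clusters):
            p = self.rate_dist.sample().item()        # one rate per cluster
            keep[:, idx] = (torch.rand_like(keep[:, idx]) > p).float()
            scale[:, idx] = 1.0 / (1.0 - p)
        return x * keep * scale


if __name__ == "__main__":
    layer = ClusteredDropout()
    layer.train()
    print(layer(torch.randn(8, 16, 32, 32)).shape)    # torch.Size([8, 16, 32, 32])
```

In this sketch the per-layer dropout hyperparameter disappears: each layer draws its own rate (or one rate per activation cluster) at every training step, which is the behaviour the abstract attributes to the proposed methods.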
Notes
1. The “v3” network from https://github.com/jseppanen/cifa_lasagne.
Cite this paper
Dodballapur, V., Calisa, R., Song, Y., Cai, W. (2020). Automatic Dropout for Deep Neural Networks. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds.) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol. 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_16