Abstract
A greater demand for accuracy and performance in neural networks has led to deeper networks with large numbers of parameters, for which overfitting is a major problem. Dropout is a popular regularization strategy used in deep neural networks to mitigate overfitting. However, dropout requires a hyperparameter to be chosen for every dropout layer, which becomes tedious when the network has several such layers. In this paper, we introduce a method that samples the dropout rate from an automatically determined distribution. We further build on this automatic selection by clustering the activations and adaptively applying a different rate to each cluster. We evaluate both approaches on the CIFAR-10, CIFAR-100, and Fashion-MNIST datasets with two state-of-the-art Wide ResNet variants as well as a simpler network, and show that our methods outperform standard dropout across all datasets and networks.
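The abstract describes two ideas: sampling the dropout rate from a distribution instead of fixing it per layer, and clustering activations so that each cluster receives its own sampled rate. The following is a minimal PyTorch sketch of both ideas, not the authors' implementation; the Beta distribution and its parameters, the number of clusters, and the channel-mean grouping criterion are illustrative assumptions.

```python
# Minimal sketch, assuming PyTorch. SampledRateDropout draws a drop probability
# from Beta(alpha, beta) at each training step; ClusteredDropout groups channels
# by mean activation and samples a separate rate per group. All hyperparameters
# here are placeholders, not values from the paper.
import torch
import torch.nn as nn


class SampledRateDropout(nn.Module):
    """Dropout whose rate is sampled from Beta(alpha, beta) every forward pass."""

    def __init__(self, alpha=2.0, beta=5.0):
        super().__init__()
        self.rate_dist = torch.distributions.Beta(alpha, beta)

    def forward(self, x):
        if not self.training:
            return x
        p = self.rate_dist.sample().item()            # sampled drop probability
        keep = (torch.rand_like(x) > p).float()       # Bernoulli keep mask
        return x * keep / (1.0 - p)                   # inverted-dropout scaling


class ClusteredDropout(nn.Module):
    """Groups channels by mean activation and applies a separately sampled
    dropout rate to each group (an illustrative stand-in for clustering)."""

    def __init__(self, n_clusters=3, alpha=2.0, beta=5.0):
        super().__init__()
        self.n_clusters = n_clusters
        self.rate_dist = torch.distributions.Beta(alpha, beta)

    def forward(self, x):                             # x: (N, C, H, W)
        if not self.training:
            return x
        channel_means = x.mean(dim=(0, 2, 3))         # one statistic per channel
        order = torch.argsort(channel_means)          # sort channels by activity
        keep = torch.ones_like(x)
        scale = torch.ones_like(x)
        for idx in torch.chunk(order, self.n_clusters):
            p = self.rate_dist.sample().item()        # one rate per cluster
            keep[:, idx] = (torch.rand_like(keep[:, idx]) > p).float()
            scale[:, idx] = 1.0 / (1.0 - p)
        return x * keep * scale


if __name__ == "__main__":
    layer = ClusteredDropout()
    layer.train()
    print(layer(torch.randn(8, 16, 32, 32)).shape)    # torch.Size([8, 16, 32, 32])
```

In this sketch the per-layer dropout hyperparameter disappears: each layer draws its own rate (or one rate per activation cluster) at every training step, which is the behaviour the abstract attributes to the proposed methods.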
Notes
1. The “v3” network from https://github.com/jseppanen/cifa_lasagne.
Cite this paper
Dodballapur, V., Calisa, R., Song, Y., Cai, W. (2020). Automatic Dropout for Deep Neural Networks. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds.) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol. 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_16