Abstract
Because of their large number of parameters, convolutional neural networks are known to demand long training periods and extended inference time. Training may require so much computational power that it calls for costly hardware and, sometimes, weeks of processing. In this context, there is a trend already in motion to replace pooling layers with a stride operation in the preceding convolutional layer in order to save time. In this work, we evaluate the speedup of such an approach and how it trades off against accuracy loss across multiple computer vision domains, deep neural architectures, and datasets. The results show significant acceleration with an almost negligible loss in accuracy, if any, which further indicates that convolutional pooling in deep learning performs redundant calculations.
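The substitution evaluated here can be written compactly in code. Below is a minimal sketch in TensorFlow/Keras, where the layer names, kernel sizes, and input shape are illustrative assumptions rather than the authors' exact architectures: the first block downsamples with a conventional max-pooling layer, while the second lets the preceding convolution downsample through its stride.

```python
# Illustrative sketch only; not the exact architectures evaluated in the paper.
import tensorflow as tf
from tensorflow.keras import layers

def block_with_pooling(x, filters):
    # Conventional block: 3x3 convolution at full resolution, then 2x2 max pooling.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.MaxPooling2D(pool_size=2)(x)

def block_with_stride(x, filters):
    # Pooling-free block: the convolution itself downsamples with stride 2.
    return layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(x)

inputs = tf.keras.Input(shape=(32, 32, 3))   # e.g., a CIFAR-10-sized input
y_pool = block_with_pooling(inputs, 32)      # 32x32x3 -> 16x16x32
y_stride = block_with_stride(inputs, 32)     # 32x32x3 -> 16x16x32, one layer fewer
```

Both variants halve the spatial resolution, but the strided convolution evaluates its kernel at only one quarter of the positions instead of convolving at full resolution and then reducing the result, which is the source of the savings the abstract refers to.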
Notes
http://rrc.cvc.uab.es/
We used the Python OPF implementation available at https://github.com/marcoscleison/PyOPF.
Acknowledgements
The authors are grateful to Petrobras (grant #2017/00285-6), FAPESP (grants #2013/07375-0, #2014/12236-1, #2017/25908-6, #2018/15597-6, and #2019/07665-4), and CNPq (grants #307066/2017-7 and #427968/2018-6).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Santos, C.F.G.d., Moreira, T.P., Colombo, D. et al. Does Removing Pooling Layers from Convolutional Neural Networks Improve Results?. SN COMPUT. SCI. 1, 275 (2020). https://doi.org/10.1007/s42979-020-00295-9