Abstract
Embedding artificial intelligence on constrained platforms has become a trend with the growth of embedded systems and mobile devices seen in recent years. Although constrained platforms lack the processing power to train a sophisticated deep learning model, such as a convolutional neural network (CNN), they are already capable of performing inference locally using a previously trained embedded model. This approach offers advantages such as privacy, low response latency, and no dependence on a real-time network connection. Still, the use of a local CNN model on constrained platforms is limited by the model's storage size. Most CNN research has focused on increasing network depth to improve accuracy. In the text classification area, deep models have been proposed with excellent performance, but they rely on large architectures with thousands of parameters and, consequently, a high storage footprint. We propose modifying the structure of the Very Deep Convolutional Neural Network (VDCNN) model to reduce its storage size while preserving its performance. In this paper, we evaluate the impact of Temporal Depthwise Separable Convolutions and Global Average Pooling on the number of network parameters, storage size, dependence on dedicated hardware, and accuracy. The proposed squeezed model (SVDCNN) is between 10x and 20x smaller than the original version, depending on the network depth, with a maximum disk size of 6 MB. Regarding accuracy, the network suffers a loss of between 0.4% and 1.3% while achieving lower latency on non-dedicated hardware and a higher inference time ratio than the baseline model.
A. B. Duque and L. L. J. Santos contributed equally.
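For readers who want to see the two compression techniques named in the abstract in concrete form, the PyTorch sketch below contrasts a standard temporal convolution with a temporal depthwise separable convolution and shows global average pooling standing in for fully connected layers. This is a minimal illustration under assumed layer sizes, not the authors' implementation; the class name and hyperparameters are hypothetical.

```python
import torch.nn as nn

class TemporalDepthwiseSeparableConv(nn.Module):
    """Hypothetical sketch of a temporal depthwise separable convolution:
    a per-channel (depthwise) 1D convolution over the temporal axis,
    followed by a pointwise (1x1) convolution that mixes channels."""

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # Depthwise: groups=in_channels gives one filter per input channel.
        self.depthwise = nn.Conv1d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        # Pointwise: kernel size 1, combines channels across the feature axis.
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):  # x: (batch, channels, sequence_length)
        return self.pointwise(self.depthwise(x))

# Parameter comparison against a standard temporal convolution
# (64 -> 128 channels, kernel size 3; sizes are illustrative only):
standard = nn.Conv1d(64, 128, kernel_size=3, padding=1)
separable = TemporalDepthwiseSeparableConv(64, 128, kernel_size=3)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), count(separable))  # 24704 vs. 8576 parameters

# Global average pooling collapses each feature map to a single scalar,
# so parameter-heavy fully connected layers can be dropped:
gap = nn.AdaptiveAvgPool1d(1)  # (batch, channels, length) -> (batch, channels, 1)
```

The saving comes from factorizing the k x C_in x C_out weights of a standard convolution into k x C_in depthwise weights plus C_in x C_out pointwise weights, while global average pooling removes the fully connected layers' parameters altogether.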
References
Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1251–1258 (2017). https://doi.org/10.1109/cvpr.2017.195
Conneau, A., Schwenk, H., Barrault, L., Lecun, Y.: Very deep convolutional networks for text classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. Association for Computational Linguistics (2017). https://doi.org/10.18653/v1/e17-1104
Gong, Y., Liu, L., Yang, M., Bourdev, L.: Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115 (2014)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, June 2016. https://doi.org/10.1109/cvpr.2016.90
Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, July 2017. https://doi.org/10.1109/cvpr.2017.243
Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Kaiser, L., Gomez, A.N., Chollet, F.: Depthwise separable convolutions for neural machine translation. arXiv preprint arXiv:1706.03059 (2017)
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
Le, H.T., Cerisara, C., Denis, A.: Do convolutional networks need to be deep for text classification? In: The Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence (2017)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Lin, M., Chen, Q., Yan, S.: Network in network. arXiv preprint arXiv:1312.4400 (2013)
Santos, A.G., de Souza, C.O., Zanchettin, C., Macedo, D., Oliveira, A.L.I., Ludermir, T.: Reducing SqueezeNet storage size with depthwise separable convolutions. In: 2018 International Joint Conference on Neural Networks (IJCNN). IEEE, July 2018. https://doi.org/10.1109/ijcnn.2018.8489442
Sifre, L., Mallat, S.: Rigid-motion scattering for image classification. Ph.D. thesis, École Polytechnique (2014)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sundermeyer, M., Ney, H., Schlüter, R.: From feedforward to recurrent LSTM neural networks for language modeling. IEEE/ACM Trans. Audio Speech Lang. Process. 23(3), 517–529 (2015). https://doi.org/10.1109/taslp.2015.2400218
Tai, K.S., Socher, R., Manning, C.D.: Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics (2015). https://doi.org/10.3115/v1/p15-1150
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Advances in Neural Information Processing Systems, pp. 649–657 (2015)
Zhou, C., Sun, C., Liu, Z., Lau, F.: A C-LSTM neural network for text classification. arXiv preprint arXiv:1511.08630 (2015)
Acknowledgment
We would like to thank CNPq and FACEPE (Brazilian research agencies) for the financial support.
© 2019 Springer Nature Switzerland AG
Cite this paper
Duque, A.B., Santos, L.L.J., Macêdo, D., Zanchettin, C. (2019). Squeezed Very Deep Convolutional Neural Networks for Text Classification. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation. ICANN 2019. Lecture Notes in Computer Science, vol. 11727. Springer, Cham. https://doi.org/10.1007/978-3-030-30487-4_16
DOI: https://doi.org/10.1007/978-3-030-30487-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30486-7
Online ISBN: 978-3-030-30487-4