Reducing the Number of Multiplications in Convolutional Recurrent Neural Networks (ConvRNNs)

Vazhenina, Daria; Kanemura, Atsunori

doi:10.1007/978-3-030-39878-1_5

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1128)

Included in the following conference series:

Annual Conference of the Japanese Society for Artificial Intelligence

Abstract

Convolutional variants of recurrent neural networks (ConvRNNs) are widely used for spatio-temporal modeling because they are well suited to sequences of two-dimensional inputs. As with conventional RNNs, introducing a gating architecture, as in ConvLSTM, adds parameters and increases computational complexity. This computational load can be an obstacle to training efficient models and to deploying ConvRNNs in real-world applications. However, the relationship between the complexity of a ConvRNN unit and its performance has not been well investigated. We propose to reduce the number of parameters and multiplications by replacing some convolutional operations with the Hadamard product. We evaluate the proposal on the task of next video frame prediction using the Moving MNIST dataset. The proposed method requires 38% fewer multiplications and 21% fewer parameters than its fully convolutional counterpart. At the cost of this reduced computational complexity, performance measured by the structural similarity index measure (SSIM) decreases by about 1.5%. ConvRNNs with reduced computation can be used in a wider range of settings, such as web applications and embedded systems.

This paper is an extension of a selected paper from JSAI2019 [10].
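
To give a rough sense of why swapping a convolution for a Hadamard (element-wise) product saves multiplications, the sketch below counts multiplications per time step for the gate pre-activations of a ConvLSTM-like cell. It is only an illustration under stated assumptions: the abstract does not say which convolutions are replaced, so the choice of replacing state-to-state convolutions, the helper names (`conv_mults`, `hadamard_mults`, `cell_mults`), and the layer sizes are all hypothetical, and the numbers it prints are not the paper's results.

```python
def conv_mults(height, width, c_in, c_out, kernel):
    """Multiplications for one k x k convolution with 'same' padding and stride 1."""
    return height * width * c_in * c_out * kernel * kernel


def hadamard_mults(height, width, channels):
    """Multiplications for one element-wise (Hadamard) product with a state tensor."""
    return height * width * channels


def cell_mults(height, width, c_in, c_hidden, kernel, n_gates=4, n_hadamard=0):
    """Multiplications per time step in the gate pre-activations of a gated cell.

    Assumption (not taken from the paper): each gate has one input-to-state
    convolution and one state-to-state convolution, and `n_hadamard` of the
    state-to-state convolutions are replaced by a Hadamard product.
    """
    if not 0 <= n_hadamard <= n_gates:
        raise ValueError("n_hadamard must be between 0 and n_gates")
    input_part = n_gates * conv_mults(height, width, c_in, c_hidden, kernel)
    conv_state = (n_gates - n_hadamard) * conv_mults(
        height, width, c_hidden, c_hidden, kernel
    )
    hadamard_state = n_hadamard * hadamard_mults(height, width, c_hidden)
    return input_part + conv_state + hadamard_state


# Hypothetical sizes: 64 x 64 frames, 1 input channel, 32 hidden channels, 5 x 5 kernels.
baseline = cell_mults(64, 64, 1, 32, 5)
reduced = cell_mults(64, 64, 1, 32, 5, n_hadamard=2)
print(f"baseline: {baseline}, reduced: {reduced}, ratio: {reduced / baseline:.2f}")
```

The count shows the main effect: a state-to-state convolution costs on the order of `c_hidden * kernel**2` multiplications per output element, while a Hadamard product costs one, so each replaced convolution removes almost all of its share of the total.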

References

1. Ballas, N., Yao, L., Pal, C., Courville, A.: Delving deeper into convolutional networks for learning video representations. In: International Conference on Learning Representations (ICLR) (2016)
2. Donahue, J., Hendricks, L.A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2625–2634 (2015)
3. Elsayed, N., Maida, A.S., Bayoumi, M.: Reduced-gate convolutional LSTM using predictive coding for spatiotemporal prediction. arXiv:1810.07251 (2018)
4. Li, S., Li, W., Cook, C., Zhu, C., Gao, Y.: Independently recurrent neural network (IndRNN): building a longer and deeper RNN. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5457–5466 (2018)
5. Sautermeister, B.: Deep learning approaches to predict future frames in videos. Master’s thesis, Technische Universität München (2016)
6. Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W., Woo, W.: Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems (NIPS), pp. 802–810 (2015)
7. Shi, X., Gao, Z., Lausen, L., Wang, H., Yeung, D.Y., Wong, W., Woo, W.: Deep learning for precipitation nowcasting: a benchmark and a new model. In: Advances in Neural Information Processing Systems (NIPS), pp. 5617–5627 (2017)
8. Srivastava, N., Mansimov, E., Salakhutdinov, R.: Unsupervised learning of video representations using LSTMs. In: International Conference on Machine Learning (ICML), pp. 843–852 (2015)
9. van der Westhuizen, J., Lasenby, J.: The unreasonable effectiveness of the forget gate. arXiv:1804.04849 (2018)
10. Vazhenina, D., Kanemura, A.: Reducing the number of multiplications in convolutional recurrent neural networks (ConvRNNs). In: Annual Conference of the Japanese Society for Artificial Intelligence (JSAI) (2019)
11. Wang, Y., Gao, Z., Long, M., Wang, J., Yu, P.S.: PredRNN++: towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. In: International Conference on Machine Learning (ICML) (2018)
12. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

Author information

Authors and Affiliations

LeapMind Inc., Tokyo, 150-0044, Japan
Daria Vazhenina & Atsunori Kanemura

Corresponding author

Correspondence to Daria Vazhenina.

Editor information

Editors and Affiliations

Department of Systems Innovation, University of Tokyo, Tokyo, Japan
Yukio Ohsawa
Faculty of Business and Commerce, Kansai University, Osaka, Japan
Katsutoshi Yada
Nagoya Institute of Technology, Nagoya, Japan
Takayuki Ito
Graduate School of System Design, Tokyo Metropolitan University, Tokyo, Japan
Yasufumi Takama
Department of Information and Communication, Tokyo Metropolitan University, Tokyo, Japan
Eri Sato-Shimokawara
Faculty of Letters, Chiba University, Chiba, Japan
Akinori Abe
School of Engineering, The University of Tokyo, Tokyo, Japan
Junichiro Mori
Graduate School of Economics, Osaka University, Toyonaka, Osaka, Japan
Naohiro Matsumura

About this paper

Cite this paper

Vazhenina, D., Kanemura, A. (2020). Reducing the Number of Multiplications in Convolutional Recurrent Neural Networks (ConvRNNs). In: Ohsawa, Y., et al. Advances in Artificial Intelligence. JSAI 2019. Advances in Intelligent Systems and Computing, vol 1128. Springer, Cham. https://doi.org/10.1007/978-3-030-39878-1_5

DOI: https://doi.org/10.1007/978-3-030-39878-1_5
Published: 04 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39877-4
Online ISBN: 978-3-030-39878-1
eBook Packages: Intelligent Technologies and Robotics (R0)
