Abstract
With the rapid development of convolutional neural networks (CNNs), their accuracy has improved significantly, but this progress also poses great challenges for deployment on mobile or embedded devices with limited resources. Recently, substantial progress has been made in compressing CNNs through low-rank decomposition. Unlike existing methods, which apply the same decomposition form and strategy to every layer and rely on fine-tuning after singular value decomposition (SVD), our method uses different decomposition forms for different layers and proposes decomposition strategies that require no fine-tuning. We present a simple and effective scheme for compressing an entire CNN, called cosine similarity SVD without fine-tuning. For AlexNet, our cosine-similarity rank-selection algorithm finds the ranks in 84% of the time required by the Bayesian optimization (BayesOpt) algorithm. Experiments on several CNNs (AlexNet, VGG-16, VGG-19, and ResNet-50) over different data sets show that the number of weight parameters can be reduced by more than 50% with an accuracy loss of less than 1%, without fine-tuning; the reduction in floating-point operations (FLOPs) is about 20%, again with less than 1% accuracy loss and no fine-tuning.
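To make the general idea concrete, below is a minimal sketch in PyTorch of truncated-SVD compression of a single fully connected layer, where the rank is chosen as the smallest value whose reconstruction stays above a cosine-similarity threshold against the original weights. The function name, the threshold parameter, and the similarity-on-flattened-weights criterion are illustrative assumptions, not the paper's exact rule; the actual method additionally selects different decomposition forms per layer.

import torch

def svd_compress_linear(weight, target_cos_sim=0.99):
    # Truncated SVD of the weight matrix: W ~ U[:, :k] @ diag(S[:k]) @ Vh[:k, :].
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    w = weight.flatten()
    k = S.numel()  # fall back to full rank if the threshold is never met
    for r in range(1, S.numel() + 1):
        approx = (U[:, :r] * S[:r]) @ Vh[:r, :]
        # Illustrative rank criterion (assumption): cosine similarity between
        # the original weights and the rank-r reconstruction, both flattened.
        sim = torch.nn.functional.cosine_similarity(w, approx.flatten(), dim=0)
        if sim >= target_cos_sim:
            k = r
            break
    # One m-by-n layer becomes two layers of shapes (m, k) and (k, n),
    # which saves parameters whenever k < m * n / (m + n).
    A = U[:, :k] * S[:k]
    B = Vh[:k, :]
    return A, B, k

# A nearly rank-50 matrix should compress to a rank close to 50.
W = torch.randn(1024, 50) @ torch.randn(50, 512) + 0.01 * torch.randn(1024, 512)
A, B, k = svd_compress_linear(W)
print(k, A.shape, B.shape)

The saving comes from replacing the m-by-n weight matrix with an m-by-k and a k-by-n factor; the same idea extends to convolutional layers by reshaping the kernel tensor into a matrix, which is where the choice of per-layer decomposition form matters.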
About this article
Cite this article
Zhang, M., Liu, F. & Weng, D. Speeding-up and compression convolutional neural networks by low-rank decomposition without fine-tuning. J Real-Time Image Proc 20, 64 (2023). https://doi.org/10.1007/s11554-023-01274-y