
Speeding-up and compression convolutional neural networks by low-rank decomposition without fine-tuning

  • Original Research Paper
  • Published:
Journal of Real-Time Image Processing

Abstract

With the rapid development of convolutional neural networks (CNNs), their accuracy has improved significantly, but this progress also poses great challenges for deployment on mobile terminals or embedded devices with limited resources. Recently, significant progress has been made in compressing CNNs through low-rank decomposition. Unlike existing methods, which apply the same decomposition form and decomposition strategy to every layer and rely on fine-tuning after singular value decomposition (SVD), our method uses different decomposition forms for different layers and proposes decomposition strategies that require no fine-tuning. We present a simple and effective scheme for compressing an entire CNN, called cosine similarity SVD without fine-tuning. For AlexNet, our cosine-similarity rank-selection algorithm finds the ranks in 84% of the time required by the Bayesian optimization (BayesOpt) algorithm. Experiments on various CNNs (AlexNet, VGG-16, VGG-19, and ResNet-50) and different data sets show that more than 50% of the weight parameters can be removed with an accuracy loss of less than 1% and no fine-tuning, while floating-point operations (FLOPs) drop by about 20%, again with an accuracy loss of less than 1%.
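The full method is available only to subscribers, but the core idea the abstract describes, truncated SVD per layer with ranks chosen by a cosine-similarity criterion instead of a BayesOpt search, can be illustrated. The sketch below is a minimal illustration under our own assumptions, not the authors' exact algorithm: it handles only a single fully connected weight matrix (the paper applies different decomposition forms to different layers), measures the Frobenius cosine similarity between the original weights and their rank-r reconstruction, and the threshold tau and both function names are introduced here for illustration.

import numpy as np

def choose_rank_by_cosine(singular_values, tau=0.99):
    # For the best rank-r SVD truncation W_r of W, the Frobenius inner
    # product <W_r, W> equals ||W_r||_F^2, so the cosine similarity
    # cos(W_r, W) = sqrt(sum_{i<=r} s_i^2 / sum_i s_i^2) can be read
    # directly off the singular values, with no trial reconstructions.
    energy = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    return int(np.searchsorted(np.sqrt(energy), tau) + 1)

def svd_compress(W, tau=0.99):
    # Factor an m x n weight matrix into A (m x r) @ B (r x n), cutting
    # the layer's parameters and multiply-adds from m*n to r*(m + n).
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    r = choose_rank_by_cosine(S, tau)
    A = U[:, :r] * S[:r]   # singular values absorbed into the left factor
    B = Vt[:r, :]
    return A, B, r

# Toy check: the factored layer A @ (B @ x) approximates W @ x directly,
# i.e. without any fine-tuning of A or B after the decomposition.
W = np.random.randn(512, 1024)
A, B, r = svd_compress(W, tau=0.99)
x = np.random.randn(1024)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(f"rank {r} of {min(W.shape)}, relative output error {err:.4f}")

Because the cosine similarity of the optimal rank-r truncation is a closed-form function of the singular values, rank selection in this sketch costs a single SVD per layer, which is consistent with the abstract's report that cosine-similarity rank selection is faster than BayesOpt; whether the paper computes the similarity this way is our assumption.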




Author information

Corresponding author

Correspondence to Meng Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, M., Liu, F. & Weng, D. Speeding-up and compression convolutional neural networks by low-rank decomposition without fine-tuning. J Real-Time Image Proc 20, 64 (2023). https://doi.org/10.1007/s11554-023-01274-y

