Abstract
With the rapid development of convolutional neural networks (CNNs), their accuracy has improved significantly, but this progress also poses great challenges for deployment on mobile or embedded devices with limited resources. Recently, substantial progress has been made in compressing CNNs through low-rank decomposition. Unlike existing methods, which apply the same decomposition form and strategy to every layer and rely on fine-tuning after singular value decomposition (SVD), our method uses different decomposition forms for different layers and proposes decomposition strategies that require no fine-tuning. We present a simple and effective scheme for compressing an entire CNN, called cosine similarity SVD without fine-tuning. For AlexNet, our cosine-similarity rank-selection algorithm finds the ranks in 84% of the time required by the Bayesian optimization (BayesOpt) algorithm. Experiments on several CNNs (AlexNet, VGG-16, VGG-19, and ResNet-50) over different data sets show that the number of weight parameters can be reduced by more than 50% with an accuracy loss of less than 1%, without fine-tuning; the reduction in floating-point operations (FLOPs) is about 20%, again with less than 1% accuracy loss and no fine-tuning.
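To make the general idea concrete, below is a minimal sketch in PyTorch of truncated-SVD compression of a single fully connected layer, where the rank is chosen as the smallest value whose reconstruction stays above a cosine-similarity threshold against the original weights. The function name, the threshold parameter, and the similarity-on-flattened-weights criterion are illustrative assumptions, not the paper's exact rule; the actual method additionally selects different decomposition forms per layer.

import torch

def svd_compress_linear(weight, target_cos_sim=0.99):
    # Truncated SVD of the weight matrix: W ~ U[:, :k] @ diag(S[:k]) @ Vh[:k, :].
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    w = weight.flatten()
    k = S.numel()  # fall back to full rank if the threshold is never met
    for r in range(1, S.numel() + 1):
        approx = (U[:, :r] * S[:r]) @ Vh[:r, :]
        # Illustrative rank criterion (assumption): cosine similarity between
        # the original weights and the rank-r reconstruction, both flattened.
        sim = torch.nn.functional.cosine_similarity(w, approx.flatten(), dim=0)
        if sim >= target_cos_sim:
            k = r
            break
    # One m-by-n layer becomes two layers of shapes (m, k) and (k, n),
    # which saves parameters whenever k < m * n / (m + n).
    A = U[:, :k] * S[:k]
    B = Vh[:k, :]
    return A, B, k

# A nearly rank-50 matrix should compress to a rank close to 50.
W = torch.randn(1024, 50) @ torch.randn(50, 512) + 0.01 * torch.randn(1024, 512)
A, B, k = svd_compress_linear(W)
print(k, A.shape, B.shape)

The saving comes from replacing the m-by-n weight matrix with an m-by-k and a k-by-n factor; the same idea extends to convolutional layers by reshaping the kernel tensor into a matrix, which is where the choice of per-layer decomposition form matters.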
About this article
Cite this article
Zhang, M., Liu, F. & Weng, D. Speeding-up and compression convolutional neural networks by low-rank decomposition without fine-tuning. J Real-Time Image Proc 20, 64 (2023). https://doi.org/10.1007/s11554-023-01274-y