Abstract
Quantization can significantly accelerate neural network inference, especially on mobile processors. Existing quantization methods either require training the network from scratch or cause a significant accuracy drop in the quantized model. Low-bit quantization (e.g., 4- or 6-bit) is a much more resource-intensive problem than 8-bit quantization and typically demands a large amount of labeled training data. We propose a new low-bit quantization method for mobile neural network architectures that does not require training from scratch or a large labeled training set, and that avoids a significant accuracy drop.
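The paper itself does not include code; as a rough illustration of why lower bit-widths are harder, the sketch below simulates uniform symmetric "fake" quantization of a weight tensor at 8, 6, and 4 bits and measures the resulting quantization noise. The function name `fake_quantize`, the clipping `threshold`, and the bit-widths are illustrative assumptions, not the authors' method.

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int, threshold: float) -> np.ndarray:
    """Simulate uniform symmetric quantization of x to `num_bits` bits.

    Values are clipped to [-threshold, threshold], rounded to the integer
    grid, then mapped back to floats ("fake" quantization), so the
    quantization noise can be observed or adapted to during fine-tuning.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8-bit, 7 for 4-bit
    scale = threshold / qmax                # step size of the uniform grid
    x_clipped = np.clip(x, -threshold, threshold)
    x_int = np.round(x_clipped / scale)     # integer representation
    return x_int * scale                    # dequantized float values

# The coarser the grid, the larger the quantization noise the model must absorb:
weights = np.random.randn(1000).astype(np.float32)
for bits in (8, 6, 4):
    noise = weights - fake_quantize(weights, bits, threshold=3.0)
    print(f"{bits}-bit RMS quantization noise: {np.sqrt(np.mean(noise ** 2)):.4f}")
```

Running this shows the root-mean-square quantization error growing as the bit-width drops, which is the accuracy pressure the proposed adaptation procedure is meant to counteract.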