Abstract
Neural networks are essential components of learning-based software systems. However, deploying neural networks in low-resource domains is challenging because of their high computing, memory, and power requirements. For this reason, neural networks are often quantized before deployment, but existing quantization techniques tend to degrade network accuracy. We propose Counter-Example Guided Neural Network Quantization Refinement (CEG4N). This technique combines search-based quantization and equivalence verification: the former minimizes the computational requirements, while the latter guarantees that the network’s output does not change after quantization. We evaluate CEG4N on a diverse set of benchmarks, including large and small networks. Our technique successfully quantizes the networks in our evaluation while producing models with up to 72% better accuracy than state-of-the-art techniques.
Notes
1. The pre-trained weights for the ACAS Xu benchmarks can be found in the following repository: https://github.com/stanleybak/vnncomp2021.
References
Abate, A., et al.: Sound and automated synthesis of digital stabilizing controllers for continuous plants. In: Frehse, G., Mitra, S. (eds.) Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control, HSCC 2017, Pittsburgh, PA, USA, 18–20 April 2017, pp. 197–206. ACM (2017). https://doi.org/10.1145/3049797.3049802
Abiodun, O.I., Jantan, A., Omolara, A.E., Dada, K.V., Mohamed, N.A., Arshad, H.: State-of-the-art in artificial neural network applications: a survey. Heliyon 4(11), e00938 (2018). https://doi.org/10.1016/j.heliyon.2018.e00938
Albarghouthi, A.: Introduction to neural network verification. arXiv:2109.10317 (2021)
Bai, J., Lu, F., Zhang, K., et al.: ONNX: Open neural network exchange (2019). https://github.com/onnx/onnx
Bak, S., Liu, C., Johnson, T.: The second international verification of neural networks competition (VNN-COMP 2021): Summary and results (2021)
Bojarski, M., et al.: End to end learning for self-driving cars. arXiv:1604.07316 (2016)
Kleine Büning, M., Kern, P., Sinz, C.: Verifying equivalence properties of neural networks with ReLU activation functions. In: Simonis, H. (ed.) CP 2020. LNCS, vol. 12333, pp. 868–884. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58475-7_50
Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P.A., Łukasik, S., Żak, S.: Complete gradient clustering algorithm for features analysis of x-ray images. In: Piȩtka, E., Kawa, J. (eds) Information Technologies in Biomedicine. AINSC, vol. 69, pp. 15–24. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13105-9_2
Cheng, Y., Wang, D., Zhou, P., Zhang, T.: A survey of model compression and acceleration for deep neural networks. arXiv:1710.09282 (2017)
Eleftheriadis, C., Kekatos, N., Katsaros, P., Tripakis, S.: On neural network equivalence checking using SMT solvers. arXiv:2203.11629 (2022)
Esser, S.K., Appuswamy, R., Merolla, P., Arthur, J.V., Modha, D.S.: Backpropagation for energy-efficient neuromorphic computing. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015). https://proceedings.neurips.cc/paper/2015/file/10a5ab2db37feedfdeaab192ead4ac0e-Paper.pdf
Farabet, C., et al.: Large-scale FPGA-based convolutional networks (2011)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)
Gadelha, M.R., Menezes, R.S., Cordeiro, L.C.: ESBMC 6.1: automated test case generation using bounded model checking. Int. J. Softw. Tools Technol. Transf. 23(6), 857–861 (2020). https://doi.org/10.1007/s10009-020-00571-2
Gadelha, M.R., Monteiro, F.R., Morse, J., Cordeiro, L.C., Fischer, B., Nicole, D.A.: ESBMC 5.0: an industrial-strength C model checker. In: 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 888–891 (2018). https://doi.org/10.1145/3238147.3240481
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv:2103.13630 (2022)
Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural network. arXiv:1506.02626 (2015)
Hooker, S., Courville, A.C., Dauphin, Y., Frome, A.: Selective brain damage: measuring the disparate impact of model pruning. arXiv:1911.05248 (2019)
Huang, X., et al.: Safety and trustworthiness of deep neural networks: a survey. arXiv:1812.08342 (2018)
Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Quantized neural networks: training neural networks with low precision weights and activations. arXiv:1609.07061 (2017)
IEEE: IEEE standard for floating-point arithmetic. IEEE Std. 754–2019 (Revision of IEEE 754–2008), pp. 1–84 (2019). https://doi.org/10.1109/IEEESTD.2019.8766229
Jacob, B., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. arXiv:1712.05877 (2017)
Julian, K.D., Lopez, J., Brush, J.S., Owen, M.P., Kochenderfer, M.J.: Policy compression for aircraft collision avoidance systems. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC), pp. 1–10 (2016). https://doi.org/10.1109/DASC.2016.7778091
Kirchhoffer, H., et al.: Overview of the neural network compression and representation (NNR) standard. IEEE Trans. Circ. Syst. Video Technol. 1 (2021). https://doi.org/10.1109/TCSVT.2021.3095970
Krishnamoorthi, R.: Quantizing deep convolutional networks for efficient inference: a whitepaper. arXiv:1806.08342 (2018)
LeCun, Y., Cortes, C.: The MNIST database of handwritten digits (2005)
Liu, C., Arnon, T., Lazarus, C., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4, 244–404 (2021)
Narodytska, N., Kasiviswanathan, S.P., Ryzhyk, L., Sagiv, S., Walsh, T.: Verifying properties of binarized deep neural networks. In: AAAI (2018)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
Teuber, S., Kleine Büning, M., Kern, P., Sinz, C.: Geometric path enumeration for equivalence verification of neural networks. In: 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI), November 2021. https://doi.org/10.1109/ICTAI52525.2021.00035
Zhang, J., Zhou, Y., Saab, R.: Post-training quantization for neural networks with provable guarantees. arXiv:2201.11113 (2022)
Acknowledgment
This work is partially funded by EPSRC grant EP/T026995/1, “EnnCore: End-to-End Conceptual Guarding of Neural Architectures”, under the Security for all in an AI-enabled society call.
A Appendices
A.1 Implementation of NNs in Python
The NNs were built and trained using the PyTorch library [29]. The weights of the trained models were then exported to the ONNX [4] format, which can be interpreted by PyTorch and used to run predictions without any loss in the NNs' performance.
A.2 Implementation of NN Abstract Models in C
In the present work, we use the C language to implement the abstract representation of the NNs. This allows us to explicitly model the NN operations in their original and quantized forms and to apply existing software verification tools (e.g., ESBMC [14]). The operational C-abstraction models perform double-precision arithmetic. Note, however, that the original and quantized models diverge only in the precision of the weight and bias vectors embedded in the abstraction code, as the sketch below illustrates.
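The following is a minimal sketch of such an abstraction, not the code used in our experiments: a single fully connected layer with ReLU, where the layer sizes, the identifiers (fc_relu, W, b), and the constant values are illustrative assumptions. A quantized abstraction would embed reduced-precision constants in W and b while the arithmetic stays identical.

    /* Sketch of a C abstraction of one fully connected ReLU layer.
     * All arithmetic is double precision; the original and quantized
     * abstractions differ only in the embedded constants below. */
    #include <stddef.h>

    #define N_IN  2
    #define N_OUT 2

    /* Illustrative weight and bias constants; a quantized abstraction
     * embeds lower-precision values here. */
    static const double W[N_OUT][N_IN] = {{0.25, -0.50}, {0.75, 0.125}};
    static const double b[N_OUT] = {0.0625, -0.25};

    void fc_relu(const double x[N_IN], double y[N_OUT]) {
      for (size_t i = 0; i < N_OUT; ++i) {
        double acc = b[i];
        for (size_t j = 0; j < N_IN; ++j)
          acc += W[i][j] * x[j];
        y[i] = acc > 0.0 ? acc : 0.0;  /* ReLU activation */
      }
    }

Deeper networks are abstracted by composing such layer functions in sequence.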
A.3 Encoding of Equivalence Properties
Suppose a NN F, for which \(x\in \mathcal {H}\) is a safe input and \(y \in \mathcal {G}\) is the expected output for that input. We now show how one can specify the equivalence properties. For this example, consider that the function f produces the outputs of F in floating-point arithmetic, while fq produces the outputs of F in fixed-point arithmetic (i.e., after quantization). First, the concrete NN input x is replaced by a non-deterministic one, which is achieved using the command nondet_float from ESBMC.
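Below is a minimal sketch of the resulting verification harness, assuming a toy two-output model: the weight values, the input domain \(\mathcal {H} = [-1, 1]^2\), and the top-1 equivalence criterion are illustrative assumptions; nondet_float is the ESBMC command named above, and __ESBMC_assume is ESBMC's standard primitive for constraining non-deterministic values.

    #include <assert.h>

    float nondet_float(void);          /* ESBMC: non-deterministic float */
    void __ESBMC_assume(_Bool cond);   /* ESBMC: constrain the input     */

    /* Illustrative weights: W for f (original), Wq for fq (quantized). */
    static const double W [2][2] = {{0.253, -0.507}, {0.751, 0.128}};
    static const double Wq[2][2] = {{0.25,  -0.5  }, {0.75,  0.125}};

    static void infer(const double M[2][2], const double x[2], double y[2]) {
      for (int i = 0; i < 2; ++i) {
        double acc = M[i][0] * x[0] + M[i][1] * x[1];
        y[i] = acc > 0.0 ? acc : 0.0;  /* ReLU */
      }
    }

    int main(void) {
      double x[2], y[2], yq[2];

      /* Replace the concrete input with a non-deterministic one,
       * restricted to the safe input domain H. */
      for (int i = 0; i < 2; ++i) {
        x[i] = nondet_float();
        __ESBMC_assume(x[i] >= -1.0 && x[i] <= 1.0);
      }

      infer(W, x, y);    /* f : original model  */
      infer(Wq, x, yq);  /* fq: quantized model */

      /* Top-1 equivalence: both models must prefer the same class.
       * When this assertion fails, the verifier returns a
       * counter-example input that guides the quantization refinement. */
      assert((y[0] >= y[1]) == (yq[0] >= yq[1]));
      return 0;
    }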
A.4 Genetic Algorithm Parameter Definition
In Table 3, we report a summary of the experiments conducted to tune the parameters of the Genetic Algorithm (GA), more precisely, the number of generations. For example, a NN with 2 layers would require a brute-force algorithm to search \(52^2\) combinations of bit widths for the quantization; similarly, a NN with 7 layers would require searching \(52^7\) combinations (see the worked calculation below). We conducted a set of experiments in which we ran the GA one hundred times, with the number of generations ranging from 50 to 1000 and the population size fixed at 5. From our findings, the GA needs about 100 to 110 generations per layer to find the optimal bit-width solution in each run.
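For concreteness, the brute-force search-space sizes quoted above follow from a short worked calculation, where \(n\) denotes the number of layers (the notation is ours):

\[ 52^{n} \;\text{bit-width assignments for}\; n \;\text{layers}, \qquad 52^{2} = 2704, \qquad 52^{7} \approx 1.03 \times 10^{12}, \]

which makes exhaustive search infeasible for all but the smallest networks and motivates the GA-based search.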