Abstract
Neural networks are essential components of learning-based software systems. However, deploying neural networks in low-resource domains is challenging because of their high computing, memory, and power requirements. For this reason, neural networks are often quantized before deployment, but existing quantization techniques tend to degrade network accuracy. We propose Counter-Example Guided Neural Network Quantization Refinement (CEG4N). This technique combines search-based quantization and equivalence verification: the former minimizes the computational requirements, while the latter guarantees that the network’s output does not change after quantization. We evaluate CEG4N on a diverse set of benchmarks, including large and small networks. Our technique successfully quantizes the networks in our evaluation while producing models with up to 72% better accuracy than state-of-the-art techniques.
Notes
1. The pre-trained weights for the ACAS Xu benchmarks can be found in the following repository: https://github.com/stanleybak/vnncomp2021.
References
Abate, A., et al.: Sound and automated synthesis of digital stabilizing controllers for continuous plants. In: Frehse, G., Mitra, S. (eds.) Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control, HSCC 2017, Pittsburgh, PA, USA, 18–20 April 2017, pp. 197–206. ACM (2017). https://doi.org/10.1145/3049797.3049802
Abiodun, O.I., Jantan, A., Omolara, A.E., Dada, K.V., Mohamed, N.A., Arshad, H.: State-of-the-art in artificial neural network applications: a survey. Heliyon 4(11), e00938 (2018). https://doi.org/10.1016/j.heliyon.2018.e00938
Albarghouthi, A.: Introduction to neural network verification. arXiv:2109.10317 (2021)
Bai, J., Lu, F., Zhang, K., et al.: ONNX: Open neural network exchange (2019). https://github.com/onnx/onnx
Bak, S., Liu, C., Johnson, T.: The second international verification of neural networks competition (VNN-COMP 2021): Summary and results (2021)
Bojarski, M., et al.: End to end learning for self-driving cars. arXiv:1604.07316 (2016)
Kleine Büning, M., Kern, P., Sinz, C.: Verifying equivalence properties of neural networks with ReLU activation functions. In: Simonis, H. (ed.) CP 2020. LNCS, vol. 12333, pp. 868–884. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58475-7_50
Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P.A., Łukasik, S., Żak, S.: Complete gradient clustering algorithm for features analysis of x-ray images. In: Piȩtka, E., Kawa, J. (eds) Information Technologies in Biomedicine. AINSC, vol. 69, pp. 15–24. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13105-9_2
Cheng, Y., Wang, D., Zhou, P., Zhang, T.: A survey of model compression and acceleration for deep neural networks. arXiv:1710.09282 (2017)
Eleftheriadis, C., Kekatos, N., Katsaros, P., Tripakis, S.: On neural network equivalence checking using SMT solvers. arXiv:2203.11629 (2022)
Esser, S.K., Appuswamy, R., Merolla, P., Arthur, J.V., Modha, D.S.: Backpropagation for energy-efficient neuromorphic computing. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015). https://proceedings.neurips.cc/paper/2015/file/10a5ab2db37feedfdeaab192ead4ac0e-Paper.pdf
Farabet, C., et al.: Large-scale FPGA-based convolutional networks (2011)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)
Gadelha, M.R., Menezes, R.S., Cordeiro, L.C.: ESBMC 6.1: automated test case generation using bounded model checking. Int. J. Softw. Tools Technol. Transf. 23(6), 857–861 (2020). https://doi.org/10.1007/s10009-020-00571-2
Gadelha, M.R., Monteiro, F.R., Morse, J., Cordeiro, L.C., Fischer, B., Nicole, D.A.: ESBMC 5.0: an industrial-strength C model checker. In: 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 888–891 (2018). https://doi.org/10.1145/3238147.3240481
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv:2103.13630 (2022)
Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural network. arXiv:1506.02626 (2015)
Hooker, S., Courville, A.C., Dauphin, Y., Frome, A.: Selective brain damage: measuring the disparate impact of model pruning. arXiv:1911.05248 (2019)
Huang, X., et al.: Safety and trustworthiness of deep neural networks: a survey. arXiv:1812.08342 (2018)
Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Quantized neural networks: training neural networks with low precision weights and activations. arXiv:1609.07061 (2017)
IEEE: IEEE standard for floating-point arithmetic. IEEE Std. 754–2019 (Revision of IEEE 754–2008), pp. 1–84 (2019). https://doi.org/10.1109/IEEESTD.2019.8766229
Jacob, B., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. arXiv:1712.05877 (2017)
Julian, K.D., Lopez, J., Brush, J.S., Owen, M.P., Kochenderfer, M.J.: Policy compression for aircraft collision avoidance systems. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC), pp. 1–10 (2016). https://doi.org/10.1109/DASC.2016.7778091
Kirchhoffer, H., et al.: Overview of the neural network compression and representation (NNR) standard. IEEE Trans. Circ. Syst. Video Technol. 1 (2021). https://doi.org/10.1109/TCSVT.2021.3095970
Krishnamoorthi, R.: Quantizing deep convolutional networks for efficient inference: a whitepaper. arXiv:1806.08342 (2018)
LeCun, Y., Cortes, C.: The MNIST database of handwritten digits (2005)
Liu, C., Arnon, T., Lazarus, C., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4, 244–404 (2021)
Narodytska, N., Kasiviswanathan, S.P., Ryzhyk, L., Sagiv, S., Walsh, T.: Verifying properties of binarized deep neural networks. In: AAAI (2018)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
Teuber, S., Kleine Büning, M., Kern, P., Sinz, C.: Geometric path enumeration for equivalence verification of neural networks. In: 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI), November 2021. https://doi.org/10.1109/ICTAI52525.2021.00035
Zhang, J., Zhou, Y., Saab, R.: Post-training quantization for neural networks with provable guarantees. arXiv:2201.11113 (2022)
Acknowledgment
This work is partially funded by EPSRC grant EP/T026995/1, “EnnCore: End-to-End Conceptual Guarding of Neural Architectures”, under the Security for all in an AI-enabled society call.
A Appendices
A.1 Implementation of NNs in Python
The NNs were built and trained using the PyTorch library [29]. The weights of the trained models were then exported to the ONNX [4] format, which can be interpreted by PyTorch and used to run predictions without any loss in the NNs' performance.
A.2 Implementation of NN Abstract Models in C
In the present work, we use the C language to implement the abstract representation of the NNs. This allows us to explicitly model the NN operations in their original and quantized forms and to apply existing software verification tools (e.g., ESBMC [14]). The operational C-abstraction models perform double-precision arithmetic. Note, however, that the original and quantized models diverge only in the precision of the weight and bias vectors embedded in the abstraction code, as the sketch below illustrates.
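The following is a minimal sketch of such an abstraction, not the code used in our experiments: a single fully connected layer with ReLU, where the layer sizes, the identifiers (fc_relu, W, b), and the constant values are illustrative assumptions. A quantized abstraction would embed reduced-precision constants in W and b while the arithmetic stays identical.

    /* Sketch of a C abstraction of one fully connected ReLU layer.
     * All arithmetic is double precision; the original and quantized
     * abstractions differ only in the embedded constants below. */
    #include <stddef.h>

    #define N_IN  2
    #define N_OUT 2

    /* Illustrative weight and bias constants; a quantized abstraction
     * embeds lower-precision values here. */
    static const double W[N_OUT][N_IN] = {{0.25, -0.50}, {0.75, 0.125}};
    static const double b[N_OUT] = {0.0625, -0.25};

    void fc_relu(const double x[N_IN], double y[N_OUT]) {
      for (size_t i = 0; i < N_OUT; ++i) {
        double acc = b[i];
        for (size_t j = 0; j < N_IN; ++j)
          acc += W[i][j] * x[j];
        y[i] = acc > 0.0 ? acc : 0.0;  /* ReLU activation */
      }
    }

Deeper networks are abstracted by composing such layer functions in sequence.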
A.3 Encoding of Equivalence Properties
Suppose a NN F, for which \(x\in \mathcal {H}\) is a safe input and \(y \in \mathcal {G}\) is the expected output for that input. We now show how one can specify the equivalence properties. For this example, consider that the function f produces the outputs of F in floating-point arithmetic, while fq produces the outputs of F in fixed-point arithmetic (i.e., after quantization). First, the concrete NN input x is replaced by a non-deterministic one, which is achieved using the command nondet_float from ESBMC.
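Below is a minimal sketch of the resulting verification harness, assuming a toy two-output model: the weight values, the input domain \(\mathcal {H} = [-1, 1]^2\), and the top-1 equivalence criterion are illustrative assumptions; nondet_float is the ESBMC command named above, and __ESBMC_assume is ESBMC's standard primitive for constraining non-deterministic values.

    #include <assert.h>

    float nondet_float(void);          /* ESBMC: non-deterministic float */
    void __ESBMC_assume(_Bool cond);   /* ESBMC: constrain the input     */

    /* Illustrative weights: W for f (original), Wq for fq (quantized). */
    static const double W [2][2] = {{0.253, -0.507}, {0.751, 0.128}};
    static const double Wq[2][2] = {{0.25,  -0.5  }, {0.75,  0.125}};

    static void infer(const double M[2][2], const double x[2], double y[2]) {
      for (int i = 0; i < 2; ++i) {
        double acc = M[i][0] * x[0] + M[i][1] * x[1];
        y[i] = acc > 0.0 ? acc : 0.0;  /* ReLU */
      }
    }

    int main(void) {
      double x[2], y[2], yq[2];

      /* Replace the concrete input with a non-deterministic one,
       * restricted to the safe input domain H. */
      for (int i = 0; i < 2; ++i) {
        x[i] = nondet_float();
        __ESBMC_assume(x[i] >= -1.0 && x[i] <= 1.0);
      }

      infer(W, x, y);    /* f : original model  */
      infer(Wq, x, yq);  /* fq: quantized model */

      /* Top-1 equivalence: both models must prefer the same class.
       * When this assertion fails, the verifier returns a
       * counter-example input that guides the quantization refinement. */
      assert((y[0] >= y[1]) == (yq[0] >= yq[1]));
      return 0;
    }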
A.4 Genetic Algorithm Parameter Definition
In Table 3, we report a summary of the experiments conducted to tune the parameters of the Genetic Algorithm (GA), more precisely, the number of generations. For example, a NN with 2 layers would require a brute-force algorithm to search \(52^2\) combinations of bit widths for the quantization; similarly, a NN with 7 layers would require searching \(52^7\) combinations (see the worked calculation below). We conducted a set of experiments in which we ran the GA one hundred times, with the number of generations ranging from 50 to 1000 and the population size fixed at 5. From our findings, the GA needs about 100 to 110 generations per layer to find the optimal bit-width solution in each run.
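For concreteness, the brute-force search-space sizes quoted above follow from a short worked calculation, where \(n\) denotes the number of layers (the notation is ours):

\[ 52^{n} \;\text{bit-width assignments for}\; n \;\text{layers}, \qquad 52^{2} = 2704, \qquad 52^{7} \approx 1.03 \times 10^{12}, \]

which makes exhaustive search infeasible for all but the smallest networks and motivates the GA-based search.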