Abstract
In artificial intelligence testing, there is an increasing focus on improving the efficiency of test prioritization methods for deep learning systems. The DeepAbstraction algorithm has recently emerged as one of the leading techniques in this area. It relies on a box-abstraction concept whose effectiveness depends on the tau parameter, a clustering parameter that controls the size of the boxes. Previous experiments showed that neither tau = 0.4 nor tau = 0.05 yields optimal results across all experiments, which highlights a significant challenge in the DeepAbstraction framework: the appropriate selection of the tau parameter. This selection is crucial, given its decisive effect on box size and, consequently, on the stability and efficacy of the framework. To address this challenge, we propose a methodology called combined parameterized boxes. This approach leverages the collective verdicts of monitors built with different tau values to evaluate network predictions. We assign appropriate weights to these verdicts so that no single verdict dominates the decision-making process. Furthermore, we propose several strategies for aggregating the weighted verdicts of the monitors into a final verdict, namely mean, max, product, and mode. Our results demonstrate that this approach notably boosts the performance of the DeepAbstraction framework. Compared to the leading algorithms, DeepAbstraction++ consistently outperforms its competitors, delivering performance improvements between 2.38% and 7.71%. In addition, DeepAbstraction++ brings remarkable stability to the process, addressing a significant shortcoming of the earlier version of DeepAbstraction.
This work is supported by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 956123 and by the French National Research Agency (ANR) in the framework of the Investissements d'Avenir program (ANR-10-AIRT-05, irtnanoelec).
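For concreteness, below is a minimal Python sketch of the verdict-combination idea described in the abstract. It assumes that each tau-parameterized monitor produces a rejection score in [0, 1] for a given network prediction; the function name combine_verdicts, the score interpretation, and the exact form of each weighted strategy are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import prod

def combine_verdicts(verdicts, weights, strategy="mean"):
    """Combine per-monitor verdicts into a single verdict score.

    verdicts: one rejection score in [0, 1] per tau-parameterized monitor
    weights:  one weight per monitor, assumed to sum to 1
    strategy: "mean", "max", "product", or "mode"
    """
    weighted = [w * v for w, v in zip(weights, verdicts)]
    if strategy == "mean":
        return sum(weighted)  # weighted average of the monitor verdicts
    if strategy == "max":
        return max(weighted)  # most pessimistic weighted verdict wins
    if strategy == "product":
        # Weighted geometric combination (one plausible reading of "product").
        return prod(v ** w for v, w in zip(verdicts, weights))
    if strategy == "mode":
        # Weighted majority vote over binarized (accept/reject) verdicts.
        votes = Counter()
        for w, v in zip(weights, verdicts):
            votes[round(v)] += w
        return max(votes, key=votes.get)
    raise ValueError(f"unknown strategy: {strategy}")

# Example: three monitors built with tau in {0.05, 0.3, 0.4}, equal weights,
# and hypothetical per-monitor rejection scores for one network prediction.
taus = [0.05, 0.3, 0.4]
weights = [1 / len(taus)] * len(taus)
verdicts = [0.9, 0.4, 0.2]
print(combine_verdicts(verdicts, weights, strategy="mean"))
```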