Abstract
Containerization and container-based application orchestration and management, primarily with Kubernetes, are rapidly gaining popularity. Resilience in such environments, especially fault recovery, is increasingly critical as containerized microservices become the de facto standard for soft real-time and cyber-physical workloads in edge computing.
The Worst-Case Execution Time (WCET) of platform-supported recovery mechanisms is crucial for designing application resilience; it influences, e.g., dimensioning and the design and parameterization of recovery policies.
However, due to the complexity of the underlying phenomena, establishing such WCET characteristics is generally feasible only empirically, carrying the risk of under- or overapproximating recovery time outliers, which, in turn, are crucial for assurance design.
Measurement-Based Probabilistic Timing Analysis (MBPTA) aims to estimate the WCET from measurements. Extreme Value Analysis (EVA), a technique in the MBPTA “toolbox”, is a statistical paradigm for approximating the properties of extremely deviant values.
This paper demonstrates that container restarts, a key platform mechanism in Kubernetes, exhibit rare extreme execution time values. We also demonstrate that characterizing these rare values with EVA can yield approximations at least as good as those of classic distribution fitting and, importantly for practice, without distribution assumptions.
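As a rough illustration of how EVA can be applied to measured restart times, the sketch below uses the peaks-over-threshold approach: excesses over a high threshold are modelled with a Generalized Pareto Distribution (GPD), and a far-tail quantile is extrapolated from the fit. The abstract does not state which EVA variant or estimator the paper uses, so this choice, the method-of-moments estimator, the synthetic data, and all function and variable names are assumptions made for illustration only.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments estimate of the GPD shape (xi) and scale (sigma).

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    which is only meaningful when the true shape is below 0.5.
    """
    m = statistics.fmean(excesses)
    s2 = statistics.variance(excesses)
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (ratio + 1.0)
    return xi, sigma


def pot_quantile(samples, threshold, p):
    """Estimate the p-quantile from a GPD fitted to excesses over threshold.

    p must lie in the tail, i.e. 1 - p must be smaller than the fraction
    of samples that exceed the threshold.
    """
    excesses = [x - threshold for x in samples if x > threshold]
    xi, sigma = fit_gpd_moments(excesses)
    tail_frac = len(excesses) / len(samples)
    scaled = (1.0 - p) / tail_frac  # exceedance probability within the tail
    if abs(xi) < 1e-9:
        # Exponential tail, the xi -> 0 limit of the GPD.
        return threshold - sigma * math.log(scaled)
    return threshold + sigma / xi * (scaled ** (-xi) - 1.0)


# Hypothetical restart times in seconds: a lognormal body plus rare
# heavy-tailed delays. This is synthetic data, not the paper's measurements.
random.seed(0)
samples = []
for _ in range(5000):
    t = random.lognormvariate(0.0, 0.25)
    if random.random() < 0.01:
        t += random.paretovariate(3.0)
    samples.append(t)

# Assumed threshold choice: the empirical 95th percentile.
threshold = sorted(samples)[int(0.95 * len(samples))]
estimate = pot_quantile(samples, threshold, 0.999)
print(f"threshold={threshold:.3f}s, estimated 99.9% quantile={estimate:.3f}s")
```

The threshold trades bias against variance: a lower threshold leaves more excesses to fit but includes values that do not follow the tail model, so in practice it would be chosen with diagnostics rather than fixed at a percentile as done here.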
The results reported on in this paper partially rely on previous results of the EFOP-3.6.2-16-2017-00013 national project at the Budapest University of Technology and Economics and a joint research project with Ericsson.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bozóki, S. et al. (2020). Application of Extreme Value Analysis for Characterizing the Execution Time of Resilience Supporting Mechanisms in Kubernetes. In: Bernardi, S., et al. Dependable Computing - EDCC 2020 Workshops. EDCC 2020. Communications in Computer and Information Science, vol 1279. Springer, Cham. https://doi.org/10.1007/978-3-030-58462-7_15
DOI: https://doi.org/10.1007/978-3-030-58462-7_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58461-0
Online ISBN: 978-3-030-58462-7
eBook Packages: Computer Science, Computer Science (R0)