Abstract
The Abaco (Actor Based Containers) platform is an open-source software system funded by the National Science Foundation and hosted at the Texas Advanced Computing Center, providing national-scale functions-as-a-service to the research computing community. Abaco utilizes the Actor Model of concurrent computation, where computational primitives, referred to as actors, execute in response to messages sent to the actor’s inbox. In this paper, we use formal methods to analyze Abaco and create an improved design which corrects a race condition in one of its critical subsystems. More precisely, we present a specification of an updated version of the autoscaler subsystem of Abaco, responsible for automatically scaling the number of worker processes associated with an actor based on the actor’s inbox size, using TLA+, a formal specification language for modeling concurrent systems. We analyze the new design using both the TLC model checker and the TLAPS proof system. We include results of our use of TLC for manually checking safety and liveness properties for some small state spaces, and we provide proofs in TLAPS of all safety properties. To the best of our knowledge, our work is the first analysis of a large, real-world production software system with open-source code, openly available TLA+ specification and complete TLAPS proofs of all key safety properties.
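As an informal illustration of the behavior described above, the following minimal Python sketch models an actor with an inbox and a worker pool, together with an autoscaling step driven by the inbox size. It is not the Abaco implementation or the TLA+ specification analyzed in the paper: the names `Actor`, `desired_workers`, `autoscale`, and `safety_invariant`, the `max_workers` parameter, and the one-worker-per-message rule are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Actor:
    """Hypothetical actor: an inbox of pending messages and a worker count."""
    inbox: deque = field(default_factory=deque)
    workers: int = 0


def desired_workers(inbox_size: int, max_workers: int) -> int:
    """Assumed scaling rule: one worker per pending message, capped at max_workers."""
    return min(inbox_size, max_workers)


def autoscale(actor: Actor, max_workers: int) -> int:
    """Move the worker count to the target and return the change that was made."""
    target = desired_workers(len(actor.inbox), max_workers)
    delta = target - actor.workers
    actor.workers = target
    return delta


def safety_invariant(actor: Actor, max_workers: int) -> bool:
    """Example safety property: the worker count stays within [0, max_workers]."""
    return 0 <= actor.workers <= max_workers
```

A safety property such as `safety_invariant` is the kind of statement that a model checker like TLC can check over a small state space and that a proof system like TLAPS can establish for all reachable states. The sketch is sequential and therefore cannot exhibit the race condition discussed in the paper.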
This material is based upon work supported by the National Science Foundation Office of Advanced CyberInfrastructure, Collaborative Proposal: Frameworks: Project Tapis: Next Generation Software for Distributed Research (Award #1931439), and SI2-SSE: Abaco - Flexible, Scalable, and Usable Functions-As-A-Service via the Actor Model (Award #1740288).
Acknowledgment
The authors would like to thank Dr. Stephan Merz, Markus Kuppe, and members of the TLA+ Google group for their feedback and for their help in understanding the TLA+ example proofs and TLAPS.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Padhy, S., Stubbs, J. (2022). Designing and Proving Properties of the Abaco Autoscaler Using TLA+. In: Bloem, R., Dimitrova, R., Fan, C., Sharygina, N. (eds) Software Verification. NSV 2021, VSTTE 2021. Lecture Notes in Computer Science, vol 13124. Springer, Cham. https://doi.org/10.1007/978-3-030-95561-8_6
DOI: https://doi.org/10.1007/978-3-030-95561-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95560-1
Online ISBN: 978-3-030-95561-8
eBook Packages: Computer Science, Computer Science (R0)