Designing and Proving Properties of the Abaco Autoscaler Using TLA+

  • Conference paper
Software Verification (NSV 2021, VSTTE 2021)

Abstract

The Abaco (Actor Based Containers) platform is an open-source software system funded by the National Science Foundation and hosted at the Texas Advanced Computing Center, providing national-scale functions-as-a-service to the research computing community. Abaco utilizes the Actor Model of concurrent computation, where computational primitives, referred to as actors, execute in response to messages sent to the actor’s inbox. In this paper, we use formal methods to analyze Abaco and create an improved design which corrects a race condition in one of its critical subsystems. More precisely, we present a specification of an updated version of the autoscaler subsystem of Abaco, responsible for automatically scaling the number of worker processes associated with an actor based on the actor’s inbox size, using TLA+, a formal specification language for modeling concurrent systems. We analyze the new design using both the TLC model checker and the TLAPS proof system. We include results of our use of TLC for manually checking safety and liveness properties for some small state spaces, and we provide proofs in TLAPS of all safety properties. To the best of our knowledge, our work is the first analysis of a large, real-world production software system with open-source code, openly available TLA+ specification and complete TLAPS proofs of all key safety properties.
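The kind of small-state-space safety check mentioned in the abstract can be illustrated with a toy explicit-state search. The sketch below is not the authors' TLA+ specification and does not model Abaco's actual autoscaler; the bounds `MAX_INBOX` and `MAX_WORKERS`, the four transitions, and the invariant are all hypothetical choices made only to show how a checker such as TLC enumerates reachable states and tests an invariant on each one.

```python
from collections import deque

# Hypothetical bounds, chosen only to keep the state space small.
MAX_INBOX = 3     # assumed cap on queued messages for one actor
MAX_WORKERS = 2   # assumed cap on workers for one actor


def next_states(state):
    """Return every state reachable in one step from (inbox, workers)."""
    inbox, workers = state
    successors = []
    if inbox < MAX_INBOX:
        successors.append((inbox + 1, workers))   # a message arrives
    if inbox > 0 and workers > 0:
        successors.append((inbox - 1, workers))   # a worker consumes a message
    if inbox > workers and workers < MAX_WORKERS:
        successors.append((inbox, workers + 1))   # autoscaler scales up
    if inbox == 0 and workers > 0:
        successors.append((inbox, workers - 1))   # autoscaler scales down
    return successors


def invariant(state):
    """Safety property: both counters stay within their bounds."""
    inbox, workers = state
    return 0 <= inbox <= MAX_INBOX and 0 <= workers <= MAX_WORKERS


def check(initial=(0, 0)):
    """Breadth-first search over reachable states, testing the invariant."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return False, state, len(seen)        # counterexample found
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True, None, len(seen)


if __name__ == "__main__":
    ok, violation, explored = check()
    print(ok, violation, explored)
```

Exhaustive search like this only covers the bounds it is given, which is why the paper pairs model checking on small state spaces with TLAPS proofs of the safety properties.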

This material is based upon work supported by the National Science Foundation Office of Advanced CyberInfrastructure, Collaborative Proposal: Frameworks: Project Tapis: Next Generation Software for Distributed Research (Award #1931439), and SI2-SSE: Abaco - Flexible, Scalable, and Usable Functions-As-A-Service via the Actor Model (Award #1740288).

Acknowledgment

The authors would like to thank Dr. Stephen Merz, Markus Kuppe, and members of the TLA+ Google group for their feedback and for their help in understanding the TLA+ proof examples and TLAPS.

Author information

Corresponding author

Correspondence to Smruti Padhy.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Padhy, S., Stubbs, J. (2022). Designing and Proving Properties of the Abaco Autoscaler Using TLA+. In: Bloem, R., Dimitrova, R., Fan, C., Sharygina, N. (eds) Software Verification. NSV 2021, VSTTE 2021. Lecture Notes in Computer Science, vol 13124. Springer, Cham. https://doi.org/10.1007/978-3-030-95561-8_6

  • DOI: https://doi.org/10.1007/978-3-030-95561-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95560-1

  • Online ISBN: 978-3-030-95561-8

  • eBook Packages: Computer Science, Computer Science (R0)
