Sound Static Analysis of Regular Expressions for Vulnerabilities to Denial of Service Attacks | SpringerLink
Skip to main content

Sound Static Analysis of Regular Expressions for Vulnerabilities to Denial of Service Attacks

  • Conference paper
  • First Online:
Theoretical Aspects of Software Engineering (TASE 2022)

Abstract

Modern programming languages often provide functions to manipulate regular expressions in standard libraries. If they offer support for advanced features, the matching algorithm has an exponential worst-case time complexity: for some so-called vulnerable regular expressions, an attacker can craft ad hoc strings to force the matcher to exhibit an exponential behaviour and perform a Regular Expression Denial of Service (ReDoS) attack. In this paper, we introduce a framework based on a tree semantics to statically identify ReDoS vulnerabilities. In particular, we put forward an algorithm to extract an overapproximation of the set of words that are dangerous for a regular expression, effectively catching all possible attacks. We have implemented the analysis in a tool called rat, and testing it on a dataset of 74,670 regular expressions, we observed that in 99.47% of the instances the analysis terminates in less than one second. We compared rat to four other ReDoS detectors, and we found that our tool is faster, often by orders of magnitude, than most other tools. While raising a low number of false positives, rat is the only ReDoS detector that does not report false negatives.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 9723
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 12154
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    By choice we mean taking the left/right branch of an alternative or expanding/not expanding a star.

References

  1. The PyPI packet manager. https://pypi.org/. Accessed 09 May 2022

  2. Regexlib database. https://regexlib.com/. Accessed 09 May 2022

  3. Stack overflow outage postmortem (2016). https://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016. Accessed 09 May 2022

  4. Cloudflare’s outage postmortem (2019). https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/. Accessed 09 May 2022

  5. National vulnerability database: CVE-2020-3899 (2020). https://nvd.nist.gov/vuln/detail/CVE-2020-3899. Accessed 09 May 2022

  6. The snort database (2020). http://www.snort.org/. Accessed 09 May 2022

  7. Allauzen, C., Mohri, M., Rastogi, A.: General algorithms for testing the ambiguity of finite automata and the double-tape ambiguity of finite-state transducers. Int. J. Found. Comput. Sci. 22(04), 883–904 (2011). https://doi.org/10.1142/s0129054111008477

    Article  MathSciNet  MATH  Google Scholar 

  8. Becchi, M., Cadambi, S.: Memory-efficient regular expression search using state merging. In: Joint Conference of the IEEE Computer and Communications Societies, INFOCOM, pp. 1064–1072 (2007). https://doi.org/10.1109/INFCOM.2007.128

  9. Berglund, M., Drewes, F., van der Merwe, B.: Analyzing catastrophic backtracking behavior in practical regular expression matching. In: Automata and Formal Languages, AFL. EPTCS, vol. 151, pp. 109–123 (2014). https://doi.org/10.4204/EPTCS.151.7

  10. Berglund, M., van der Merwe, B.: On the semantics of regular expression parsing in the wild. Theor. Comput. Sci. 679, 69–82 (2017). https://doi.org/10.1016/j.tcs.2016.09.006

    Article  MathSciNet  MATH  Google Scholar 

  11. Chapman, C., Stolee, K.T.: Exploring regular expression usage and context in Python. In: International Symposium on Software Testing and Analysis, ISSTA, pp. 282–293. ACM (2016). https://doi.org/10.1145/2931037.2931073

  12. Cody-Kenny, B., Fenton, M., Ronayne, A., Considine, E., McGuire, T., O’Neill, M.: A search for improved performance in regular expressions. In: Genetic and Evolutionary Computation Conference, GECCO, pp. 1280–1287 (2017). https://doi.org/10.1145/3071178.3071196

  13. Crosby, S.A., Wallach, D.S.: Denial of service via algorithmic complexity attacks. In: USENIX Security Symposium. USENIX Association (2003). https://doi.org/10.1007/11506881_10

  14. Davis, J.C., Servant, F., Lee, D.: Using selective memoization to defeat regular expression denial of service (ReDoS). In: IEEE Symposium on Security and Privacy, SP, pp. 543–559. IEEE Computer Society (2021). https://doi.org/10.1109/SP40001.2021.00032

  15. Davis, J.C., Coghlan, C.A., Servant, F., Lee, D.: The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale. In: Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE, pp. 246–256. ACM (2018). https://doi.org/10.1145/3236024.3236027

  16. Friedl, J.E.F.: Mastering Regular Expressions - Understand Your Data and Be More Productive: For Perl, PHP, Java,NET, Ruby, and More, 3rd edn. O’Reilly (2006). https://www.oreilly.com/library/view/mastering-regular-expressions/0596528124/

  17. Glushkov, V.M.: The abstract theory of automata. Russ. Math. Surv. 16(5), 1 (1961)

    Article  MathSciNet  Google Scholar 

  18. Michael, L.G., Donohue, J., Davis, J.C., Lee, D., Servant, F.: Regexes are hard: decision-making, difficulties, and risks in programming regular expressions. In: International Conference on Automated Software Engineering, ASE, pp. 415–426. IEEE (2019). https://doi.org/10.1109/ASE.2019.00047

  19. Kirrage, J., Rathnayake, A., Thielecke, H.: Static analysis for regular expression denial-of-service attacks. In: Lopez, J., Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 135–148. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38631-2_11

    Chapter  Google Scholar 

  20. Li, Y., et al.: FlashRegex: deducing anti-redos regexes from examples. In: International Conference on Automated Software Engineering, ASE 2020, pp. 659–671 (2020). https://doi.org/10.1145/3324884.3416556

  21. Lin, C., Liu, C., Chang, S.: Accelerating regular expression matching using hierarchical parallel machines on GPU. In: Global Communications Conference, GLOBECOM, pp. 1–5 (2011). https://doi.org/10.1109/GLOCOM.2011.6133663

  22. López, F., Romero, V.: Mastering Python Regular Expressions. Packt Publishing Ltd. (2014). https://www.packtpub.com/product/mastering-python-regular-expressions/9781783283156

  23. Owens, S., Reppy, J., Turon, A.: Regular-expression derivatives re-examined. J. Funct. Program. 19(2), 173–190 (2009). https://doi.org/10.1017/s0956796808007090

    Article  MathSciNet  MATH  Google Scholar 

  24. Pan, R., Hu, Q., Xu, G., D’Antoni, L.: Automatic repair of regular expressions. Proc. ACM Program. Lang. 3(OOPSLA), 139:1–139:29 (2019). https://doi.org/10.1145/3360565

  25. Parolini, F., Miné, A.: RAT - ReDoS Abstract Tester (2022). https://github.com/parof/rat

  26. Petsios, T., Zhao, J., Keromytis, A.D., Jana, S.: SlowFuzz: automated domain-independent detection of algorithmic complexity vulnerabilities. In: Conference on Computer and Communications Security, CCS, pp. 2155–2168. ACM (2017). https://doi.org/10.1145/3133956.3134073

  27. Rathnayake, A.: Semantics, analysis and security of backtracking regular expression matchers. Ph.D. thesis, University of Birmingham, UK (2015). http://etheses.bham.ac.uk/6011/

  28. Rathnayake, A., Thielecke, H.: Static analysis for regular expression exponential runtime via substructural logics. CoRR abs/1405.7058 (2014)

    Google Scholar 

  29. Shen, Y., Jiang, Y., Xu, C., Yu, P., Ma, X., Lu, J.: ReScue: crafting regular expression DoS attacks. In: International Conference on Automated Software Engineering, ASE, pp. 225–235. ACM (2018). https://doi.org/10.1145/3238147.3238159

  30. Staicu, C., Pradel, M.: Freezing the web: a study of ReDoS vulnerabilities in JavaScript-based web servers. In: USENIX Security Symposium, pp. 361–376. USENIX Association (2018)

    Google Scholar 

  31. Weber, A., Seidl, H.: On the degree of ambiguity of finite automata. Theor. Comput. Sci. 88(2), 325–349 (1991). https://doi.org/10.1016/0304-3975(91)90381-B

    Article  MathSciNet  MATH  Google Scholar 

  32. Weideman, N., van der Merwe, B., Berglund, M., Watson, B.: Analyzing matching time behavior of backtracking regular expression matchers by using ambiguity of NFA. In: Han, Y.-S., Salomaa, K. (eds.) CIAA 2016. LNCS, vol. 9705, pp. 322–334. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40946-7_27

    Chapter  MATH  Google Scholar 

  33. Wüstholz, V., Olivo, O., Heule, M.J.H., Dillig, I.: Static detection of DoS vulnerabilities in programs that use regular expressions. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 3–20. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_1

    Chapter  MATH  Google Scholar 

  34. Yu, X., Becchi, M.: GPU acceleration of regular expression matching for large datasets: exploring the implementation space. In: Computing Frontiers Conference, CF, pp. 18:1–18:10 (2013). https://doi.org/10.1145/2482767.2482791

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Francesco Parolini .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Parolini, F., Miné, A. (2022). Sound Static Analysis of Regular Expressions for Vulnerabilities to Denial of Service Attacks. In: Aït-Ameur, Y., Crăciun, F. (eds) Theoretical Aspects of Software Engineering. TASE 2022. Lecture Notes in Computer Science, vol 13299. Springer, Cham. https://doi.org/10.1007/978-3-031-10363-6_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-10363-6_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10362-9

  • Online ISBN: 978-3-031-10363-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics