Abstract
In modern programming languages, more and more functionalities, such as reflection and data interchange, rely on string values. String analysis statically computes the set of string values that are possibly assigned to a variable, and it involves a certain degree of approximation. During the last decade, several abstract domains approximating string values have been introduced and applied to statically analyze programs. However, most of them are not precise enough to track relational information between string variables whose value is statically unknown (e.g., user input), causing the loss of relevant knowledge about their possible values. This paper introduces a generic approach to formalize relational string abstract domains based on ordering relationships. We instantiate it to several domains built upon different well-known string orders (e.g., substring). We implemented the domain based on the substring ordering into a prototype static analyzer for Go, and we experimentally evaluated its precision and performance on some real-world case studies.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
secName is the result of a slight modification made to the function available at https://www.codota.com/code/java/classes/java.lang.String.
- 2.
Available at https://github.com/UniVE-SSV/go-lisa.
- 3.
In general, while \(\preceq \) (order on string variables) can be a pre or partial order, \(\sqsubseteq _ \mathcal {A}\) (order on the abstract domain \( \mathcal {A}\)) is always a partial order.
- 4.
Full details about how the constant propagation analysis works are reported in [32].
- 5.
maxLen(x) returns the maximum length of the string recognized by the automaton abstracting x if it is finite, \(+\infty \) otherwise.
References
Abdulla, P.A., Atig, M.F., Diep, B.P., Holík, L., Janků, P.: Chain-free string constraints. In: Chen, Y.-F., Cheng, C.-H., Esparza, J. (eds.) ATVA 2019. LNCS, vol. 11781, pp. 277–293. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31784-3_16
Amadini, R., et al.: Reference abstract domains and applications to string analysis. Fundam. Inform. 158(4), 297–326 (2018). https://doi.org/10.3233/FI-2018-1650
Amadini, R., Gange, G., Stuckey, P.J.: Dashed strings for string constraint solving. Artif. Intell. 289, 103368 (2020). https://doi.org/10.1016/j.artint.2020.103368
Amadini, R., et al.: Combining string abstract domains for JavaScript analysis: an evaluation. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10205, pp. 41–57. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54577-5_3
Arceri, V., Mastroeni, I.: An automata-based abstract semantics for string manipulation languages. In: Proceedings of VPT 2019. EPTCS, vol. 299, pp. 19–33 (2019). https://doi.org/10.4204/EPTCS.299.5
Arceri, V., Mastroeni, I.: Analyzing dynamic code: a sound abstract interpreter for evil eval. ACM Trans. Priv. Secur. 24(2), 10:1–10:38 (2021). https://doi.org/10.1145/3426470
Arceri, V., Mastroeni, I., Xu, S.: Static analysis for ECMAScript string manipulation programs. Appl. Sci. 10, 3525 (2020). https://doi.org/10.3390/app10103525
Arceri, V., Olliaro, M., Cortesi, A., Mastroeni, I.: Completeness of string analysis for dynamic languages. Inform. Comput. 104791 (2021). https://doi.org/10.1016/j.ic.2021.104791
Bagnara, R., Hill, P.M., Zaffanella, E.: The Parma Polyhedra library: toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. 72(1–2), 3–21 (2008). https://doi.org/10.1016/j.scico.2007.08.001
Bultan, T., Yu, F., Alkhalaf, M., Aydin, A.: String Analysis for Software Verification and Security. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-319-68670-7
Cortesi, A., Lauko, H., Olliaro, M., Ročkai, P.: String abstraction for model checking of C programs. In: Biondi, F., Given-Wilson, T., Legay, A. (eds.) SPIN 2019. LNCS, vol. 11636, pp. 74–93. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30923-7_5
Costantini, G., Ferrara, P., Cortesi, A.: Static analysis of string values. In: Qin, S., Qiu, Z. (eds.) ICFEM 2011. LNCS, vol. 6991, pp. 505–521. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24559-6_34
Costantini, G., Ferrara, P., Cortesi, A.: A suite of abstract domains for static analysis of string values. Softw. Pract. Exp. 45(2), 245–287 (2015). https://doi.org/10.1002/spe.2218
Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of POPL 1977, pp. 238–252 (1977). https://doi.org/10.1145/512950.512973
Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Aho, A.V., Zilles, S.N., Rosen, B.K. (eds.) Conference Record of the Sixth Annual ACM Symposium on Principles of Programming Languages, San Antonio, Texas, USA, January 1979, pp. 269–282. ACM Press (1979). https://doi.org/10.1145/567752.567778
Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Proceedings of POPL 1979, pp. 269–282 (1979). https://doi.org/10.1145/567752.567778
Cousot, P., Cousot, R.: Abstract interpretation and application to logic programs. J. Log. Program. 13(2 & 3), 103–179 (1992). https://doi.org/10.1016/0743-1066(92)90030-7
Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Proceedings of POPL 1978, pp. 84–96 (1978). https://doi.org/10.1145/512760.512770
Ferrara, P., Logozzo, F., Fähndrich, M.: Safer unsafe code for.net. In: Proceedings of OOPSLA 2008, pp. 329–346. ACM (2008). https://doi.org/10.1145/1449764.1449791
Ferrara, P., Negrini, L., Arceri, V., Cortesi, A.: Static analysis for dummies: experiencing LiSA. In: Do, L.N.Q., Urban, C. (eds.) SOAP@PLDI 2021: Proceedings of the 10th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis, Virtual Event, Canada, 22 June 2021, pp. 1–6. ACM (2021). https://doi.org/10.1145/3460946.3464316
Gange, G., Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: An abstract domain of uninterpreted functions. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016. LNCS, vol. 9583, pp. 85–103. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49122-5_4
Gulwani, S., Tiwari, A.: Combining abstract interpreters. In: Proceedings of PLDI 2006, pp. 376–386 (2006). https://doi.org/10.1145/1133981.1134026
Illous, H., Lemerre, M., Rival, X.: A relational shape abstract domain. In: Proceedings of NFM 2017, pp. 212–229 (2017). https://doi.org/10.1007/978-3-319-57288-8_15
Jensen, S.H., Møller, A., Thiemann, P.: Type analysis for JavaScript. In: Palsberg, J., Su, Z. (eds.) SAS 2009. LNCS, vol. 5673, pp. 238–255. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03237-0_17
Kashyap, V., et al.: JSAI: a static analysis platform for JavaScript. In: Proceedings of FSE-22, pp. 121–132 (2014). https://doi.org/10.1145/2635868.2635904
Lee, H., Won, S., Jin, J., Cho, J., Ryu, S.: SAFE: formal specification and implementation of a scalable analysis framework for ECMAScript. In: Proceedings of FOOL 2012 (2012)
Logozzo, F.: Towards a quantitative estimation of abstract interpretations. In: Workshop on Quantitative Analysis of Software. Microsoft, June 2009. https://www.microsoft.com/en-us/research/publication/towards-a-quantitative-estimation-of-abstract-interpretations/
Logozzo, F., Fähndrich, M.: Pentagons: a weakly relational abstract domain for the efficient validation of array accesses. Sci. Comput. Program. 75(9), 796–807 (2010). https://doi.org/10.1016/j.scico.2009.04.004
Madsen, M., Andreasen, E.: String analysis for dynamic field access. In: Cohen, A. (ed.) CC 2014. LNCS, vol. 8409, pp. 197–217. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54807-9_12
Miné, A.: The octagon abstract domain. High. Order Symb. Comput. 19(1), 31–100 (2006). https://doi.org/10.1007/s10990-006-8609-1
Negrini, L., Arceri, V., Ferrara, P., Cortesi, A.: Twinning automata and regular expressions for string static analysis. In: Henglein, F., Shoham, S., Vizel, Y. (eds.) VMCAI 2021. LNCS, vol. 12597, pp. 267–290. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67067-2_13
Nielson, F., Nielson, H.R., Hankin, C.: Principles of Program Analysis. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-662-03811-6
Park, C., Im, H., Ryu, S.: Precise and scalable static analysis of jQuery using a regular expression domain. In: Proceedings of DLS 2016, pp. 25–36 (2016). https://doi.org/10.1145/2989225.2989228
Seidl, H., Wilhelm, R., Hack, S.: Compiler Design - Analysis and Transformation. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-17548-0
Wang, H., Chen, S., Yu, F., Jiang, J.R.: A symbolic model checking approach to the analysis of string and length constraints. In: Proceedings of ASE 2018, pp. 623–633. ACM (2018). https://doi.org/10.1145/3238147.3238189
Wilhelm, R., Sagiv, S., Reps, T.W.: Shape analysis. In: Proceedings of CC 2000, pp. 1–17 (2000). https://doi.org/10.1007/3-540-46423-9_1
Yu, F., Alkhalaf, M., Bultan, T., Ibarra, O.H.: Automata-based symbolic string analysis for vulnerability detection. Formal Methods Syst. Design 44(1), 44–70 (2013). https://doi.org/10.1007/s10703-013-0189-1
Yu, F., Bultan, T., Hardekopf, B.: String abstractions for string verification. In: Groce, A., Musuvathi, M. (eds.) SPIN 2011. LNCS, vol. 6823, pp. 20–37. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22306-8_3
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Arceri, V., Olliaro, M., Cortesi, A., Ferrara, P. (2022). Relational String Abstract Domains. In: Finkbeiner, B., Wies, T. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2022. Lecture Notes in Computer Science(), vol 13182. Springer, Cham. https://doi.org/10.1007/978-3-030-94583-1_2
Download citation
DOI: https://doi.org/10.1007/978-3-030-94583-1_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94582-4
Online ISBN: 978-3-030-94583-1
eBook Packages: Computer ScienceComputer Science (R0)