Abstract
Regular expressions are often ambiguous. We present a novel method based on Brzozowski’s derivatives to aid the user in diagnosing ambiguous regular expressions. We introduce a derivative-based finite state transducer to generate parse trees and minimal counter-examples. The transducer can be easily customized to either follow the POSIX or Greedy disambiguation policy and based on a finite set of examples it is possible to examine if there are any differences among POSIX and Greedy.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
- 2.
Additional arguments are x and r but we use notation \({ i njs}_{d_{x}(r)} \ \) to highlight that the definition is defined by pattern match over the various cases of the derivative operation.
- 3.
Technically, we treat the set of parse trees like a list. Recall that \({ a llEps}_{\cdot }\) and the simplification rule (Idemp) favor the left-most match. Alternatives keep their relative position in an expression.
References
Berry, G., Sethi, R.: From regular expressions to deterministic automata. Theoret. Comput. Sci. 48(1), 117–126 (1986). http://dl.acm.org/citation.cfm?id=39528.39537
Book, R., Even, S., Greibach, S., Ott, G.: Ambiguity in graphs and expressions. IEEE Trans. Comput. 20(2), 149–153 (1971). http://dx.doi.org/10.1109/T-C.1971.223204
Borsotti, A., Breveglieri, L., Crespi Reghizzi, S., Morzenti, A.: From ambiguous regular expressions to deterministic parsing automata. In: Drewes, F. (ed.) CIAA 2015. LNCS, vol. 9223, pp. 35–48. Springer, Heidelberg (2015)
Brabrand, C., Thomsen, J.G.: Typed and unambiguous pattern matching on strings using regular expressions. In: Proceedings of PPDP 2010, pp. 243–254. ACM (2010)
Brzozowski, J.A.: Derivatives of regular expressions. J. ACM 11(4), 481–494 (1964)
Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)
Institute of Electrical, Electronics Engineers (IEEE): Standard for information technology - Portable Operating System Interface (POSIX) - Part 2 (Shell and utilities), Section 2.8 (Regular expression notation), IEEE Standard 1003.2, New York (1992)
McNaughton, R., Yamada, H.: Regular expressions and finite state graphs for automata. IRE Trans. Electron. Comput. EC 9(1), 38–47 (1960)
Okui, S., Suzuki, T.: Disambiguation in regular expression matching via position automata with augmented transitions. In: Domaratzki, M., Salomaa, K. (eds.) CIAA 2010. LNCS, vol. 6482, pp. 231–240. Springer, Heidelberg (2011)
PCRE - Perl Compatible Regular Expressions. http://www.pcre.org/
Sulzmann, M., Lu, K.Z.M.: Regular expression sub-matching using partial derivatives. In: Proceedings of PPDP 2012, pp. 79–90. ACM (2012)
Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)
Watson, B.W.: A taxonomy of finite automata minimization algorithmes. Computing Science Note 93/44. Eindhoven University of Technology, The Netherlands (1993)
Acknowledgments
We thank Peter Thiemann and the reviewers for their comments.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sulzmann, M., Lu, K.Z.M. (2016). Derivative-Based Diagnosis of Regular Expression Ambiguity. In: Han, YS., Salomaa, K. (eds) Implementation and Application of Automata. CIAA 2016. Lecture Notes in Computer Science(), vol 9705. Springer, Cham. https://doi.org/10.1007/978-3-319-40946-7_22
Download citation
DOI: https://doi.org/10.1007/978-3-319-40946-7_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40945-0
Online ISBN: 978-3-319-40946-7
eBook Packages: Computer ScienceComputer Science (R0)