Abstract
It is well known that the longest common factor (LCF) of two strings over an integer alphabet can be computed in time linear in the total length of the two strings. Our aim here is to present an algorithm that preprocesses two strings S and T in order to answer the following type of queries: Given a position i on S and a letter \(\alpha \), return an LCF of T and \(S'\), where \(S'\) is the string resulting from S after substituting S[i] with \(\alpha \). In what follows, we present an algorithm that, given two strings of length at most n, constructs in \(\mathcal {O}(n \log ^4 n)\) expected time a data structure of \(\mathcal {O}(n\log ^3 n)\) space that answers such queries in \(\mathcal {O}(\log ^3 n)\) time per query. After some trivial modifications, our approach can also support the case of single letter insertions or deletions in S.
A. Amir—Partially supported by the ISF grant 571/14 and the Royal Society.
C.S. Iliopoulos—Partially supported by the Onassis Foundation and the Royal Society.
J. Radoszewski—Supported by the “Algorithms for text processing with errors and uncertainties” project carried out within the HOMING programme of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Throughout the paper we assume that \(\log m\) denotes the binary logarithm of m rounded down to the nearest integer.
- 2.
Throughout the paper \(\tilde{\mathcal {O}}\) notation suppresses \(\log ^{\mathcal {O}(1)} n\) factors.
References
Alstrup, S., Brodal, G.S., Rauhe, T.: New data structures for orthogonal range searching. In: FOCS, pp. 198–207. IEEE Computer Society (2000)
Amir, A., Landau, G.M., Lewenstein, M., Sokol, D.: Dynamic text and static pattern matching. ACM Trans. Algorithms 3(2), 19 (2007)
Babenko, M.A., Starikovskaya, T.A.: Computing the longest common substring with one mismatch. Probl. Inf. Transm. 47(1), 28–33 (2011)
Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000). doi:10.1007/10719839_9
Chi, L., Hui, K.: Color set size problem with applications to string matching. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds.) CPM 1992. LNCS, vol. 644, pp. 230–243. Springer, Heidelberg (1992). doi:10.1007/3-540-56024-6_19
Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press, Cambridge (2007)
Farach, M.: Optimal suffix tree construction with large alphabets. In: FOCS, pp. 137–143. IEEE Computer Society (1997)
Flouri, T., Giaquinta, E., Kobert, K., Ukkonen, E.: Longest common substrings with \(k\) mismatches. Inf. Process. Lett. 115(6–8), 643–647 (2015)
Fredman, M.L., Komlós, J., Szemerédi, E.: Storing a sparse table with O(1) worst case access time. J. ACM 31(3), 538–544 (1984)
Grabowski, S.: A note on the longest common substring with \(k\)-mismatches problem. Inf. Process. Lett. 115(6–8), 640–642 (2015)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997)
Kociumaka, T., Starikovskaya, T., Vildhøj, H.W.: Sublinear space algorithms for the longest common substring problem. In: Schulz, A.S., Wagner, D. (eds.) ESA 2014. LNCS, vol. 8737, pp. 605–617. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44777-2_50
Leimeister, C., Morgenstern, B.: kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison. Bioinformatics 30(14), 2000–2008 (2014)
Starikovskaya, T.: Longest common substring with approximately k mismatches. In: CPM. LIPIcs, vol. 54, pp. 21:1–21:11. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl (2016)
Starikovskaya, T., Vildhøj, H.W.: Time-Space trade-offs for the longest common substring problem. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 223–234. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38905-4_22
Thankachan, S.V., Apostolico, A., Aluru, S.: A provably efficient algorithm for the k-mismatch average common substring problem. J. Comput. Biol. 23(6), 472–482 (2016)
Thankachan, S.V., Chockalingam, S.P., Liu, Y., Apostolico, A., Aluru, S.: ALFRED: a practical method for alignment-free distance computation. J. Comput. Biol. 23(6), 452–460 (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Amir, A., Charalampopoulos, P., Iliopoulos, C.S., Pissis, S.P., Radoszewski, J. (2017). Longest Common Factor After One Edit Operation. In: Fici, G., Sciortino, M., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2017. Lecture Notes in Computer Science(), vol 10508. Springer, Cham. https://doi.org/10.1007/978-3-319-67428-5_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-67428-5_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67427-8
Online ISBN: 978-3-319-67428-5
eBook Packages: Computer ScienceComputer Science (R0)