Abstract
We describe a way to compute the edit distance of two strings without having to fill the whole dynamic programming (DP) matrix, through a sequence of increasing guesses on the edit distance. If the strings share a certain degree of similarity, the edit distance can be quite smaller than the value of non-optimal solutions, and a large fraction (up to 80–90%) of the DP matrix cells do not need to be computed. Including the method’s overhead, this translates into a speedup factor from \(3\times \) up to \(30\times \) in the time needed to find the optimal solution for strings of length about 20,000.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Backurs, A., Indyk, P.: Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput. 47(3), 1087–1097 (2018)
Gotoh, O.: An improved algorithm for matching biological sequences. J. Mol. Biol. 162(3), 705–708 (1982)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, 534 p. Cambridge University Press (1997)
Jones, N.C., Pevzner, P.A.: An Introduction to Bioinformatics Algorithms, 456 p. MIT Press (2004)
Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSSR 163(4), 845–848 (1965)
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lancia, G., Dalpasso, M. (2019). Speeding-Up the Dynamic Programming Procedure for the Edit Distance of Two Strings. In: Anderst-Kotsis, G., et al. Database and Expert Systems Applications. DEXA 2019. Communications in Computer and Information Science, vol 1062. Springer, Cham. https://doi.org/10.1007/978-3-030-27684-3_9
Download citation
DOI: https://doi.org/10.1007/978-3-030-27684-3_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27683-6
Online ISBN: 978-3-030-27684-3
eBook Packages: Computer ScienceComputer Science (R0)