New Error Tolerant Method for Search of Long Repeats in DNA Sequences | SpringerLink
Skip to main content

New Error Tolerant Method for Search of Long Repeats in DNA Sequences

  • Conference paper
  • First Online:
Algorithms for Computational Biology (AlCoB 2016)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 9702))

Included in the following conference series:

Abstract

A new method to identify all sufficiently long repeating nucleotide substrings in one or several DNA sequences is proposed. The method based on a specific gauge applied to DNA sequences that guarantees identification of the repeating substrings. The method allows the matching substrings to contain a given level of errors. The gauge is based on the development of a heavily sparse dictionary of repeats, thus drastically accelerating the search procedure. Some biological applications illustrate the method.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Similar content being viewed by others

References

  1. Altschul, S.F., Gish, W., Miller, W., Meyers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990)

    Article  Google Scholar 

  2. Bergroth, L., Hakonen, H., Raita, T.: A survey of longest common subsequence algorithms. In: SPIRE 2000, pp. 39–48. IEEE Computer Society (2000)

    Google Scholar 

  3. Chen, G.L., Chang, Y.J., Hsueh, C.H.: PRAP: an ab initio software package for automated genome-wide analysis of DNA repeats for prokaryotes. Bioinformatics. 29(21), 2683–2689 (2013)

    Article  Google Scholar 

  4. Erciyes, K.: Distributed and Sequential Algorithms for Bioinformatics. Computational Biology, vol. 23, p. 367. Springer, Cham (2015)

    MATH  Google Scholar 

  5. Girgis, H.Z.: Red: an intelligent, rapid, accurate tool for detecting repeats de-novo on the genomic scale. BMC Bioinform. 16, 227 (2015)

    Article  Google Scholar 

  6. Hirschberg, D.S.: A linear space algorithm for computing maximal common subsequences. Commun. ACM 18(6), 341–343 (1975)

    Article  MathSciNet  MATH  Google Scholar 

  7. Lian, S., Chen, X., Wang, P., Zhang, X., Dai, X.: A complete and accurate Ab initio repeat finding algorithm. Interdiscip. Sci. 8(1), 75–83 (2016)

    Article  Google Scholar 

  8. Maier, D.: The complexity of some problems on subsequences and supersequences. J. ACM 25(2), 322–336 (1978). ACM Press

    Article  MathSciNet  MATH  Google Scholar 

  9. Masek, W.J., Paterson, M.S.: A faster algorithm computing string edit distances. J. Comput. Syst. Sci. 20(1), 18–31 (1980)

    Article  MathSciNet  MATH  Google Scholar 

  10. Novák, P., Neumann, P., Pech, J., Steinhaisl, J., Macas, J.: RepeatExplorer: a galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics 29(6), 792–793 (2013)

    Article  Google Scholar 

  11. Pearson, W., Lipman, D.: Improved tools for biological sequence comparison. PNAS USA 85, 2444–2448 (1988)

    Article  Google Scholar 

  12. Sadovsky, M.G.: Information capacity of nucleotide sequences and its applications. Bull. Math. Biology. 68, 156 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  13. https://en.wikipedia.org/wiki/Vernier_scale

Download references

Acknowledgments

This study was supported by a research grant No. 14.Y26.31.0004 from the Government of the Russian Federation (M.G. Sadovsky) and the grant from Russian Ministry of Education and Science to Siberian Federal University, contract No. 1.1462.2014/K (S.P. Tsarev).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michael G. Sadovsky .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Tsarev, S.P., Sadovsky, M.G. (2016). New Error Tolerant Method for Search of Long Repeats in DNA Sequences. In: Botón-Fernández, M., Martín-Vide, C., Santander-Jiménez, S., Vega-Rodríguez, M.A. (eds) Algorithms for Computational Biology. AlCoB 2016. Lecture Notes in Computer Science(), vol 9702. Springer, Cham. https://doi.org/10.1007/978-3-319-38827-4_14

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-38827-4_14

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-38826-7

  • Online ISBN: 978-3-319-38827-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics