Abstract
Technical progress over the last decades has made the accumulation of genome sequence data increasingly fast and cheap. The huge amount of molecular data now available can help address new and essential questions in evolution. However, reconstructing the evolution of DNA sequences requires models, algorithms, and statistical and computational methods of ever-increasing complexity. Since the most dramatic genomic changes are caused by genome rearrangements (gene duplications, gain/loss events), it becomes crucial to understand their mechanisms and to reconstruct the ancestors of the given genomes. This problem is known to be NP-complete even in the “simplest” case of three genomes, so heuristic algorithms are usually used to approximate the exact solution. We argue that, even if the ancestral reconstruction problem is NP-hard in theory, its exact resolution is feasible in various situations, encompassing organelles and some bacteria. Such an accurate reconstruction, which also identifies some highly homoplasic mutations whose ancestral status is undecidable, is initiated in this work in progress to reconstruct ancestral genomes of two pathogenic Mycobacterium bacteria. By combining automatic reconstruction of obvious situations with human intervention on flagged problematic cases, we indicate that it should be possible to achieve a concrete, complete, and truly accurate reconstruction of the lineages of the Mycobacterium tuberculosis complex. It thus becomes possible to investigate how these genomes have evolved from their last common ancestors.
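To illustrate the idea of resolving obvious sites automatically while flagging ambiguous ones for human review, here is a minimal sketch using the classical Fitch small-parsimony pass on a fixed binary tree. This is an assumption made for illustration only: the abstract does not state which reconstruction method the authors use, and the names `fitch`, `classify_site`, the tuple-based tree encoding, and the taxon labels are all hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a taxon name, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass for one site.

    Returns the candidate state set at the root of `tree` and the
    minimum number of state changes needed below it.
    """
    if isinstance(tree, str):
        # Leaf: the observed state is the only candidate.
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    changes = left_changes + right_changes
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, changes
    # Children disagree: keep both options and count one change.
    return left_set | right_set, changes + 1


def classify_site(tree: Tree, states: Dict[str, str]) -> Tuple[str, list, int]:
    """Label a site 'resolved' if the root state is unique, else 'needs_review'."""
    root_set, changes = fitch(tree, states)
    status = "resolved" if len(root_set) == 1 else "needs_review"
    return status, sorted(root_set), changes


# Hypothetical four-taxon tree and two example sites.
tree = (("A", "B"), ("C", "D"))
print(classify_site(tree, {"A": "G", "B": "G", "C": "G", "D": "T"}))
# ('resolved', ['G'], 1)
print(classify_site(tree, {"A": "G", "B": "G", "C": "T", "D": "T"}))
# ('needs_review', ['G', 'T'], 1)
```

In the second site the two subtrees disagree, so the root state cannot be decided from the data alone; under this sketch such a site would be handed to a human curator rather than resolved arbitrarily.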
Notes
1. ftp://ftp.ncbi.nih.gov/genomes.
Acknowledgments
Computations presented in this article were realised on the supercomputing facilities provided by the Mésocentre de calcul de Franche-Comté.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Guyeux, C., Al-Nuaimi, B., AlKindy, B., Couchot, JF., Salomon, M. (2017). On the Ability to Reconstruct Ancestral Genomes from Mycobacterium Genus. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10208. Springer, Cham. https://doi.org/10.1007/978-3-319-56148-6_57
DOI: https://doi.org/10.1007/978-3-319-56148-6_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56147-9
Online ISBN: 978-3-319-56148-6
eBook Packages: Computer Science, Computer Science (R0)