A Modular Database Architecture Enabled to Comparative Sequence Analysis

  • Chapter
Transactions on Large-Scale Data- and Knowledge-Centered Systems IV

Abstract

The beginning of the post-genomic era is characterized by a rising number of publicly collected genomes. The evolutionary relationships among these genomes can be captured through comparative sequence analysis, in order to identify both homologous and non-coding functional elements. In this paper we report on the ongoing BIOBITS project. It focuses on bacterial endosymbionts, since they offer an excellent model for investigating important biological events, such as organelle evolution, genome reduction, and the transfer of genetic information among host lineages. The goal of BIOBITS is twofold: on the one hand, it pursues a logical data representation of genomic and proteomic components; on the other hand, it aims to develop software modules that allow the user to retrieve and analyze data in a flexible way.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bonfante, P. et al. (2011). A Modular Database Architecture Enabled to Comparative Sequence Analysis. In: Hameurlain, A., Küng, J., Wagner, R., Böhm, C., Eder, J., Plant, C. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems IV. Lecture Notes in Computer Science, vol 6990. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23740-9_6

  • DOI: https://doi.org/10.1007/978-3-642-23740-9_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23739-3

  • Online ISBN: 978-3-642-23740-9

  • eBook Packages: Computer Science (R0)
