Linking Life Sciences Data Using Graph-Based Mapping

  • Conference paper
Data Integration in the Life Sciences (DILS 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5647)

Abstract

There are over 1100 different databases available containing primary and derived data of interest to research biologists. It is inevitable that many of these databases contain overlapping, related or conflicting information. Data integration methods are being developed to address these issues by providing a consolidated view over multiple databases. However, a key challenge for data integration is the identification of links between closely related entries in different life sciences databases when there is no direct information that provides a reliable cross-reference. Here we describe and evaluate three data integration methods to address this challenge in the context of a graph-based data integration framework (the ONDEX system). A key result presented in this paper is a quantitative evaluation of their performance in two different situations: the integration and analysis of different metabolic pathway resources, and the mapping of equivalent elements between the Gene Ontology and a nomenclature describing enzyme function.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Taubert, J., Hindle, M., Lysenko, A., Weile, J., Köhler, J., Rawlings, C.J. (2009). Linking Life Sciences Data Using Graph-Based Mapping. In: Paton, N.W., Missier, P., Hedeler, C. (eds) Data Integration in the Life Sciences. DILS 2009. Lecture Notes in Computer Science, vol 5647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02879-3_3

  • DOI: https://doi.org/10.1007/978-3-642-02879-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02878-6

  • Online ISBN: 978-3-642-02879-3

  • eBook Packages: Computer Science, Computer Science (R0)
