{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,25]],"date-time":"2023-09-25T16:31:51Z","timestamp":1695659511592},"reference-count":64,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1752,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,15]]},"abstract":"Abstract\n Motivation: Numerous annotations are available that functionally characterize genes and proteins with regard to molecular process, cellular localization, tissue expression, protein domain composition, protein interaction, disease association and other properties. Searching this steadily growing amount of information can lead to the discovery of new biological relationships between genes and proteins. To facilitate the searches, methods are required that measure the annotation similarity of genes and proteins. However, most current similarity methods are focused only on annotations from the Gene Ontology (GO) and do not take other annotation sources into account.\n Results: We introduce the new method BioSim that incorporates multiple sources of annotations to quantify the functional similarity of genes and proteins. We compared the performance of our method with four other well-known methods adapted to use multiple annotation sources. We evaluated the methods by searching for known functional relationships using annotations based only on GO or on our large data warehouse BioMyn. This warehouse integrates many diverse annotation sources of human genes and proteins. We observed that the search performance improved substantially for almost all methods when multiple annotation sources were included. 
In particular, our method outperformed the other methods in terms of recall and average precision.\n Contact: \u00a0mario.albrecht@mpi-inf.mpg.de\n Supplementary Information: \u00a0Supplementary data are available at Bioinformatics online.","DOI":"10.1093\/bioinformatics\/btr631","type":"journal-article","created":{"date-parts":[[2011,12,17]],"date-time":"2011-12-17T02:30:57Z","timestamp":1324089057000},"page":"269-276","source":"Crossref","is-referenced-by-count":8,"title":["Novel search method for the discovery of functional relationships"],"prefix":"10.1093","volume":"28","author":[{"given":"Fidel","family":"Ram\u00edrez","sequence":"first","affiliation":[{"name":"Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbr\u00fccken, Germany"}]},{"given":"Glenn","family":"Lawyer","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbr\u00fccken, Germany"}]},{"given":"Mario","family":"Albrecht","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbr\u00fccken, Germany"}]}],"member":"286","published-online":{"date-parts":[[2011,12,16]]},"reference":[{"key":"2023012511493888000_B1","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1038\/nbt1203","article-title":"Gene prioritization through genomic data fusion","volume":"24","author":"Aerts","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023012511493888000_B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol."},{"key":"2023012511493888000_B3","doi-asserted-by":"crossref","first-page":"D793","DOI":"10.1093\/nar\/gkn665","article-title":"McKusick's Online Mendelian Inheritance in Man (OMIM)","volume":"37","author":"Amberger","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B4","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1038\/nrg2255","article-title":"Nature Milestones in DNA technologies, Milestone 15: BLAST-off for genomes","volume":"8","author":"Bahcall","year":"2007","journal-title":"Nat. Rev. Genet."},{"key":"2023012511493888000_B5","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1093\/nar\/28.1.304","article-title":"The ENZYME database in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B6","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1186\/1471-2105-11-588","article-title":"IntelliGO: a new vector-based semantic similarity measure including annotation origin","volume":"11","author":"Benabderrahmane","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B7","doi-asserted-by":"crossref","first-page":"980","DOI":"10.1038\/nsb1203-980","article-title":"Announcing the worldwide Protein Data Bank","volume":"10","author":"Berman","year":"2003","journal-title":"Nat. Struct. 
Biol."},{"key":"2023012511493888000_B8","doi-asserted-by":"crossref","first-page":"D842","DOI":"10.1093\/nar\/gkq1008","article-title":"The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics","volume":"39","author":"Blake","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B9","first-page":"33","article-title":"Evaluating evaluation measure stability","volume-title":"Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2000).","author":"Buckley","year":"2000"},{"key":"2023012511493888000_B10","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/j.ejcb.2007.11.003","article-title":"ICA69 is a novel Rab2 effector regulating ER-Golgi trafficking in insulinoma cells","volume":"87","author":"Buffa","year":"2008","journal-title":"Eur. J. Cell Biol."},{"key":"2023012511493888000_B11","first-page":"5","article-title":"The Gene Ontology Annotation (GOA) Database \u2013 an integrated resource of GO annotations to the UniProt Knowledgebase","volume":"4","author":"Camon","year":"2004","journal-title":"In Silico Biol."},{"key":"2023012511493888000_B12","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1186\/1471-2105-8-235","article-title":"A transversal approach to predict gene product networks from ontology-based similarity","volume":"8","author":"Chabalier","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B13","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.tips.2009.11.002","article-title":"Recent advances and method development for drug target identification","volume":"31","author":"Chan","year":"2010","journal-title":"Trends Pharmacol. 
Sci."},{"key":"2023012511493888000_B14","doi-asserted-by":"crossref","first-page":"D572","DOI":"10.1093\/nar\/gkl950","article-title":"MINT: the Molecular INTeraction database","volume":"35","author":"Chatr-Aryamontri","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B15","doi-asserted-by":"crossref","first-page":"D363","DOI":"10.1093\/nar\/gkj123","article-title":"OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups","volume":"34","author":"Chen","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B16","doi-asserted-by":"crossref","first-page":"D142","DOI":"10.1093\/nar\/gkp846","article-title":"The Universal Protein Resource (UniProt) in 2010","volume":"38","author":"Consortium","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B17","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1186\/1471-2105-9-50","article-title":"Defining functional distances over Gene Ontology","volume":"9","author":"del Pozo","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B18","doi-asserted-by":"crossref","first-page":"D281","DOI":"10.1093\/nar\/gkm960","article-title":"The Pfam protein families database","volume":"36","author":"Finn","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B19","doi-asserted-by":"crossref","first-page":"D707","DOI":"10.1093\/nar\/gkm988","article-title":"Ensembl 2008","volume":"36","author":"Flicek","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B20","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1093\/bib\/bbl004","article-title":"Automated protein function prediction\u2013the genomic challenge","volume":"7","author":"Friedberg","year":"2006","journal-title":"Brief. 
Bioinform."},{"key":"2023012511493888000_B21","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0097-8485(96)80004-0","article-title":"Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching","volume":"20","author":"Gribskov","year":"1996","journal-title":"Comput. Chem."},{"key":"2023012511493888000_B22","doi-asserted-by":"crossref","first-page":"R183","DOI":"10.1186\/gb-2007-8-9-r183","article-title":"The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists","volume":"8","author":"Huang","year":"2007","journal-title":"Genome Biol."},{"key":"2023012511493888000_B23","doi-asserted-by":"crossref","first-page":"D211","DOI":"10.1093\/nar\/gkn785","article-title":"InterPro: the integrative protein signature database","volume":"37","author":"Hunter","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B24","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1101\/gr.082214.108","article-title":"Exploring the human genome with functional maps","volume":"19","author":"Huttenhower","year":"2009","journal-title":"Genome Res."},{"key":"2023012511493888000_B25","doi-asserted-by":"crossref","first-page":"D412","DOI":"10.1093\/nar\/gkn760","article-title":"STRING 8 \u2013 a global view on proteins and their functional interactions in 630 organisms","volume":"37","author":"Jensen","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B26","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkm882","article-title":"KEGG for linking genomes to life and the environment","volume":"36","author":"Kanehisa","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B27","doi-asserted-by":"crossref","first-page":"D561","DOI":"10.1093\/nar\/gkl958","article-title":"IntAct \u2013 open source resource for molecular interaction 
data","volume":"35","author":"Kerrien","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B28","doi-asserted-by":"crossref","first-page":"11334","DOI":"10.1073\/pnas.0702965104","article-title":"Defining functional distance using manifold embeddings of Gene Ontology annotations","volume":"104","author":"Lerman","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511493888000_B29","first-page":"296","article-title":"An information-theoretic definition of similarity","volume-title":"Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI, USA.","author":"Lin","year":"1998"},{"key":"2023012511493888000_B30","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1093\/bioinformatics\/btg153","article-title":"Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation","volume":"19","author":"Lord","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012511493888000_B31","doi-asserted-by":"crossref","first-page":"D619","DOI":"10.1093\/nar\/gkn863","article-title":"Reactome knowledgebase of human biological pathways and processes","volume":"37","author":"Matthews","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B32","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/0092-8674(95)90239-2","article-title":"Complexins: cytosolic proteins that regulate SNAP receptor function","volume":"83","author":"McMahon","year":"1995","journal-title":"Cell"},{"key":"2023012511493888000_B33","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1186\/1471-2105-9-327","article-title":"Gene Ontology term overlap as a measure of gene functional similarity","volume":"9","author":"Mistry","year":"2008","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
5","key":"2023012511493888000_B34","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-9-S5-S4","article-title":"Metrics for GO based protein semantic similarity: a systematic evaluation","volume":"9","author":"Pesquita","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B35","doi-asserted-by":"crossref","first-page":"e1000443","DOI":"10.1371\/journal.pcbi.1000443","article-title":"Semantic similarity in biomedical ontologies","volume":"5","author":"Pesquita","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012511493888000_B36","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1109\/TCBB.2006.37","article-title":"Fuzzy measures on the Gene Ontology for gene product similarity","volume":"3","author":"Popescu","year":"2006","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"2023012511493888000_B37","doi-asserted-by":"crossref","first-page":"D767","DOI":"10.1093\/nar\/gkn892","article-title":"Human Protein Reference Database \u2013 2009 update","volume":"37","author":"Prasad","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B38","doi-asserted-by":"crossref","first-page":"2541","DOI":"10.1002\/pmic.200600924","article-title":"Computational analysis of human protein interaction networks","volume":"7","author":"Ram\u00edrez","year":"2007","journal-title":"Proteomics"},{"key":"2023012511493888000_B39","doi-asserted-by":"crossref","first-page":"2767","DOI":"10.1093\/bioinformatics\/btn528","article-title":"The Protein Feature Ontology: a tool for the unification of protein feature annotations","volume":"24","author":"Reeves","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012511493888000_B40","first-page":"448","article-title":"Using information content to evaluate semantic similarity in a Taxonomy","volume-title":"Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, 
Canada","author":"Resnik","year":"1995"},{"key":"2023012511493888000_B41","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1613\/jair.514","article-title":"Semantic similarity in a Taxonomy: an information-based measure and its application to problems of ambiguity in natural language","volume":"11","author":"Resnik","year":"1999","journal-title":"J. Artif. Intell. Res."},{"key":"2023012511493888000_B42","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1038\/nrg2363","article-title":"Use and misuse of the gene ontology annotations","volume":"9","author":"Rhee","year":"2008","journal-title":"Nat. Rev. Genet."},{"key":"2023012511493888000_B43","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1038\/nbt1103","article-title":"Probabilistic model of the human protein-protein interaction network","volume":"23","author":"Rhodes","year":"2005","journal-title":"Nat. Biotechnol."},{"key":"2023012511493888000_B44","doi-asserted-by":"crossref","first-page":"R2","DOI":"10.1186\/gb-2004-6-1-r2","article-title":"Computational prediction of human metabolic pathways from the complete human genome","volume":"6","author":"Romero","year":"2005","journal-title":"Genome Biol."},{"key":"2023012511493888000_B45","doi-asserted-by":"crossref","first-page":"D646","DOI":"10.1093\/nar\/gkm936","article-title":"CORUM: the comprehensive resource of mammalian protein complexes","volume":"36","author":"Ruepp","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B46","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1145\/361219.361220","article-title":"A vector space model for automatic indexing","volume":"18","author":"Salton","year":"1975","journal-title":"Commun. 
ACM"},{"key":"2023012511493888000_B47","doi-asserted-by":"crossref","first-page":"D449","DOI":"10.1093\/nar\/gkh086","article-title":"The Database of Interacting Proteins: 2004 update","volume":"32","author":"Salwinski","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B48","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1186\/1471-2105-7-302","article-title":"A new measure for functional similarity of gene products based on Gene Ontology","volume":"7","author":"Schlicker","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B49","doi-asserted-by":"crossref","first-page":"i561","DOI":"10.1093\/bioinformatics\/btq384","article-title":"Improving disease gene prioritization using the semantic similarity of Gene Ontology terms","volume":"26","author":"Schlicker","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012511493888000_B50","doi-asserted-by":"crossref","first-page":"e1000605","DOI":"10.1371\/journal.pcbi.1000605","article-title":"Annotation error in public databases: misannotation of molecular function in enzyme superfamilies","volume":"5","author":"Schnoes","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012511493888000_B51","doi-asserted-by":"crossref","first-page":"D514","DOI":"10.1093\/nar\/gkq892","article-title":"genenames.org: the HGNC resources in 2011","volume":"39","author":"Seal","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B52","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1109\/TCBB.2005.50","article-title":"Correlation between gene expression and GO semantic similarity","volume":"2","author":"Sevilla","year":"2005","journal-title":"IEEE\/ACM Trans. Comput. Biol. 
Bioinform."},{"key":"2023012511493888000_B53","doi-asserted-by":"crossref","first-page":"3940","DOI":"10.1093\/bioinformatics\/bti623","article-title":"ROCR: visualizing classifier performance in R","volume":"21","author":"Sing","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012511493888000_B54","first-page":"252","article-title":"A memetic clustering algorithm for the functional partition of genes based on the Gene Ontology","volume-title":"Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2004), La Jolla, CA, USA","author":"Speer","year":"2004"},{"key":"2023012511493888000_B55","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1016\/S0092-8674(01)00240-9","article-title":"Obesity and the regulation of energy balance","volume":"104","author":"Spiegelman","year":"2001","journal-title":"Cell"},{"key":"2023012511493888000_B56","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012511493888000_B57","doi-asserted-by":"crossref","first-page":"4465","DOI":"10.1073\/pnas.012025199","article-title":"Large-scale analysis of the human and mouse transcriptomes","volume":"99","author":"Su","year":"2002","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012511493888000_B58","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1126\/science.1092472","article-title":"In vivo activation of the p53 pathway by small-molecule antagonists of MDM2","volume":"303","author":"Vassilev","year":"2004","journal-title":"Science"},{"key":"2023012511493888000_B59","doi-asserted-by":"crossref","first-page":"D262","DOI":"10.1093\/nar\/gki058","article-title":"E-MSD: an integrated data resource for bioinformatics","volume":"33","author":"Velankar","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B60","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1016\/j.cell.2011.02.016","article-title":"Interactome networks and human disease","volume":"144","author":"Vidal","year":"2011","journal-title":"Cell"},{"key":"2023012511493888000_B61","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1186\/1471-2105-11-290","article-title":"Revealing and avoiding bias in semantic similarity scores for protein pairs","volume":"11","author":"Wang","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012511493888000_B62","doi-asserted-by":"crossref","first-page":"2277","DOI":"10.1016\/j.jprot.2010.07.005","article-title":"It's the machine that matters: Predicting gene function and phenotype from protein networks","volume":"73","author":"Wang","year":"2010","journal-title":"J. Proteomics"},{"key":"2023012511493888000_B63","doi-asserted-by":"crossref","first-page":"W214","DOI":"10.1093\/nar\/gkq537","article-title":"The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function","volume":"38","author":"Warde-Farley","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012511493888000_B64","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1021\/ci9800211","article-title":"Chemical Similarity Searching","volume":"38","author":"Willett","year":"1998","journal-title":"J. Chem. Informat. Comput. 
Sci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/2\/269\/48868390\/bioinformatics_28_2_269.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/2\/269\/48868390\/bioinformatics_28_2_269.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T11:51:16Z","timestamp":1674647476000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/2\/269\/197411"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,12,16]]},"references-count":64,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr631","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,1,15]]},"published":{"date-parts":[[2011,12,16]]}}}