{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T14:43:13Z","timestamp":1725979393730},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2334,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,7,1]]},"abstract":"Abstract\n Motivation: New generation sequencing technologies producing increasingly complex datasets demand new efficient and specialized sequence analysis algorithms. Often, it is only the \u2018novel\u2019 sequences in a complex dataset that are of interest and the superfluous sequences need to be removed.\n Results: A novel algorithm, fast and accurate classification of sequences (FACSs), is introduced that can accurately and rapidly classify sequences as belonging or not belonging to a reference sequence. FACS was first optimized and validated using a synthetic metagenome dataset. An experimental metagenome dataset was then used to show that FACS achieves comparable accuracy as BLAT and SSAHA2 but is at least 21 times faster in classifying sequences.\n Availability: Source code for FACS, Bloom filters and MetaSim dataset used is available at http:\/\/facs.biotech.kth.se. The Bloom::Faster 1.6 Perl module can be downloaded from CPAN at http:\/\/search.cpan.org\/~palvaro\/Bloom-Faster-1.6\/\n Contacts: henrik.stranneheim@biotech.kth.se; joakiml@biotech.kth.se\n Supplementary information: Supplementary data are available at Bioinformatics online.","DOI":"10.1093\/bioinformatics\/btq230","type":"journal-article","created":{"date-parts":[[2010,5,15]],"date-time":"2010-05-15T00:17:52Z","timestamp":1273882672000},"page":"1595-1600","source":"Crossref","is-referenced-by-count":56,"title":["Classification of DNA sequences using Bloom filters"],"prefix":"10.1093","volume":"26","author":[{"given":"Henrik","family":"Stranneheim","sequence":"first","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, Sweden"}]},{"given":"Max","family":"K\u00e4ller","sequence":"additional","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, 
Sweden"}]},{"given":"Tobias","family":"Allander","sequence":"additional","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, Sweden"}]},{"given":"Bj\u00f6rn","family":"Andersson","sequence":"additional","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, Sweden"}]},{"given":"Lars","family":"Arvestad","sequence":"additional","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, 
Sweden"}]},{"given":"Joakim","family":"Lundeberg","sequence":"additional","affiliation":[{"name":"1 Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, 2 LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, 3 Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, 4 Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and 5 School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm, Sweden"}]}],"member":"286","published-online":{"date-parts":[[2010,5,13]]},"reference":[{"key":"2023012507551649200_B1","doi-asserted-by":"crossref","first-page":"11609","DOI":"10.1073\/pnas.211424698","article-title":"A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species","volume":"98","author":"Allander","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507551649200_B2","doi-asserted-by":"crossref","first-page":"12891","DOI":"10.1073\/pnas.0504666102","article-title":"Cloning of a human parvovirus by molecular screening of respiratory tract samples","volume":"102","author":"Allander","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507551649200_B3","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1145\/362686.362692","article-title":"Space\/time trade-offs in hash coding with allowable errors","volume":"13","author":"Bloom","year":"1970","journal-title":"Commun. 
ACM"},{"key":"2023012507551649200_B4","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1080\/15427951.2004.10129096","article-title":"Network applications of Bloom filters: a survey","volume":"1","author":"Broder","year":"2004","journal-title":"Internet Mathematics"},{"key":"2023012507551649200_B5","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1038\/nbt1414","article-title":"A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis","volume":"26","author":"Down","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023012507551649200_B6","first-page":"656","article-title":"BLAT\u2014the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012507551649200_B7","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012507551649200_B8","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1038\/nature07485","article-title":"DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome","volume":"456","author":"Ley","year":"2008","journal-title":"Nature"},{"key":"2023012507551649200_B9","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507551649200_B10","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome 
Res."},{"key":"2023012507551649200_B11","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1093\/bioinformatics\/btn025","article-title":"SOAP: short oligonucleotide alignment program","volume":"24","author":"Li","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012507551649200_B12","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","article-title":"Genome sequencing in microfabricated high-density picolitre reactors","volume":"437","author":"Margulies","year":"2005","journal-title":"Nature"},{"key":"2023012507551649200_B13","doi-asserted-by":"crossref","first-page":"1725","DOI":"10.1101\/gr.194201","article-title":"SSAHA: a fast search method for large DNA databases","volume":"11","author":"Ning","year":"2001","journal-title":"Genome Res."},{"key":"2023012507551649200_B14","doi-asserted-by":"crossref","first-page":"e3373","DOI":"10.1371\/journal.pone.0003373","article-title":"MetaSim: a sequencing simulator for genomics and metagenomics","volume":"3","author":"Richter","year":"2008","journal-title":"PLoS ONE"},{"key":"2023012507551649200_B15","doi-asserted-by":"crossref","first-page":"e1000386","DOI":"10.1371\/journal.pcbi.1000386","article-title":"SHRiMP: accurate mapping of short color-space reads","volume":"5","author":"Rumble","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012507551649200_B16","doi-asserted-by":"crossref","first-page":"e77","DOI":"10.1371\/journal.pbio.0050077","article-title":"The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific","volume":"5","author":"Rusch","year":"2007","journal-title":"PLoS Biol."},{"key":"2023012507551649200_B17","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1089\/10665270050081478","article-title":"A greedy algorithm for aligning DNA sequences","volume":"7","author":"Zhang","year":"2000","journal-title":"J. Comput. 
Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/13\/1595\/48851430\/bioinformatics_26_13_1595.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/13\/1595\/48851430\/bioinformatics_26_13_1595.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:55:47Z","timestamp":1674633347000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/13\/1595\/199936"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,5,13]]},"references-count":17,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2010,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq230","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,7,1]]},"published":{"date-parts":[[2010,5,13]]}}}