{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,3]],"date-time":"2024-09-03T14:22:31Z","timestamp":1725373351276},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":437,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,11,15]]},"abstract":"Motivation: High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. Results: A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of \u223c0.01. 
The high-replicate data also allowed for strict quality control and screening of \u2018bad\u2019 replicates, which can drastically affect the gene read-count distribution. Availability and implementation: RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. Contact: g.j.barton@dundee.ac.uk","DOI":"10.1093\/bioinformatics\/btv425","type":"journal-article","created":{"date-parts":[[2015,7,24]],"date-time":"2015-07-24T00:53:42Z","timestamp":1437699222000},"page":"3625-3630","source":"Crossref","is-referenced-by-count":75,"title":["Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment"],"prefix":"10.1093","volume":"31","author":[{"given":"Marek","family":"Gierli\u0144ski","sequence":"first","affiliation":[{"name":"1 Division of Computational Biology and"},{"name":"2 Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK,"}]},{"given":"Christian","family":"Cole","sequence":"additional","affiliation":[{"name":"1 Division of Computational Biology and"}]},{"given":"Piet\u00e0","family":"Schofield","sequence":"additional","affiliation":[{"name":"1 Division of Computational Biology and"},{"name":"2 Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK,"}]},{"given":"Nicholas J.","family":"Schurch","sequence":"additional","affiliation":[{"name":"1 Division of Computational Biology and"}]},{"given":"Alexander","family":"Sherstnev","sequence":"additional","affiliation":[{"name":"1 Division of Computational Biology and"}]},{"given":"Vijender","family":"Singh","sequence":"additional","affiliation":[{"name":"2 Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK,"}]},{"given":"Nicola","family":"Wrobel","sequence":"additional","affiliation":[{"name":"3 Edinburgh Genomics 
and"}]},{"given":"Karim","family":"Gharbi","sequence":"additional","affiliation":[{"name":"3 Edinburgh Genomics and"},{"name":"4 Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK,"}]},{"given":"Gordon","family":"Simpson","sequence":"additional","affiliation":[{"name":"5 Division of Plant Sciences and"}]},{"given":"Tom","family":"Owen-Hughes","sequence":"additional","affiliation":[{"name":"2 Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK,"}]},{"given":"Mark","family":"Blaxter","sequence":"additional","affiliation":[{"name":"3 Edinburgh Genomics and"},{"name":"4 Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK,"}]},{"given":"Geoffrey J.","family":"Barton","sequence":"additional","affiliation":[{"name":"1 Division of Computational Biology and"},{"name":"2 Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK,"},{"name":"6 Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dow Street Dundee, DD1 5EH, UK"}]}],"member":"286","published-online":{"date-parts":[[2015,7,23]]},"reference":[{"key":"2023020202414412900_btv425-B1","doi-asserted-by":"crossref","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Genome Biol."},{"key":"2023020202414412900_btv425-B2","first-page":"166","article-title":"HTSeq\u2014a Python framework to work with high-throughput sequencing data","volume-title":"Bioinformatics","author":"Anders","year":"2014"},{"key":"2023020202414412900_btv425-B3","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1534\/genetics.110.114983","article-title":"Statistical design and analysis of RNA sequencing 
data","volume":"185","author":"Auer","year":"2010","journal-title":"Genetics"},{"key":"2023020202414412900_btv425-B4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser B Methodol."},{"key":"2023020202414412900_btv425-B5","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1080\/00031305.1990.10475751","article-title":"A suggestion for using powerful and informative tests of normality","volume":"44","author":"D'Agostino","year":"1990","journal-title":"Am. Stat."},{"key":"2023020202414412900_btv425-B6","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1093\/bib\/bbs046","article-title":"A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis","volume":"14","author":"Dillies","year":"2013","journal-title":"Brief Bioinform"},{"key":"2023020202414412900_btv425-B7","doi-asserted-by":"crossref","first-page":"17","DOI":"10.2307\/3001420","article-title":"The significance of deviations from expectation in a poisson series","volume":"6","author":"Fisher","year":"1950","journal-title":"Biometrics"},{"key":"2023020202414412900_btv425-B8","doi-asserted-by":"crossref","first-page":"D800","DOI":"10.1093\/nar\/gkq1064","article-title":"Ensembl 2011","volume":"39","author":"Flicek","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023020202414412900_btv425-B9","doi-asserted-by":"crossref","first-page":"572","DOI":"10.1038\/nbt.1910","article-title":"Sequencing technology does not eliminate biological variability","volume":"29","author":"Hansen","year":"2011","journal-title":"Nat. 
Biotechnol."},{"key":"2023020202414412900_btv425-B10","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1186\/1471-2105-11-422","article-title":"baySeq: empirical Bayesian methods for identifying differential expression in sequence count data","volume":"11","author":"Hardcastle","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023020202414412900_btv425-B11","doi-asserted-by":"crossref","first-page":"1543","DOI":"10.1101\/gr.121095.111","article-title":"Synthetic spike-in standards for RNA-seq experiments","volume":"21","author":"Jiang","year":"2011","journal-title":"Genome Res."},{"key":"2023020202414412900_btv425-B12","doi-asserted-by":"crossref","first-page":"R36","DOI":"10.1186\/gb-2013-14-4-r36","article-title":"TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions","volume":"14","author":"Kim","year":"2013","journal-title":"Genome Biol."},{"key":"2023020202414412900_btv425-B13","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1016\/j.cell.2012.10.012","article-title":"Revisiting global gene expression analysis","volume":"151","author":"Loven","year":"2012","journal-title":"Cell"},{"key":"2023020202414412900_btv425-B14","doi-asserted-by":"crossref","first-page":"1509","DOI":"10.1101\/gr.079558.108","article-title":"RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays","volume":"18","author":"Marioni","year":"2008","journal-title":"Genome Res."},{"key":"2023020202414412900_btv425-B15","first-page":"293","article-title":"Transform methods for testing the negative binomial hypothesis","volume":"65","author":"Meintanis","year":"2005","journal-title":"Statistica"},{"key":"2023020202414412900_btv425-B16","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. 
Methods"},{"key":"2023020202414412900_btv425-B17","doi-asserted-by":"crossref","DOI":"10.1002\/0471142727.mb0411s89","article-title":"RNA-Seq: a method for comprehensive transcriptome analysis","author":"Nagalakshmi","year":"2010","journal-title":"Curr. Protoc. Mol. Biol"},{"key":"2023020202414412900_btv425-B18","doi-asserted-by":"crossref","first-page":"R95","DOI":"10.1186\/gb-2013-14-9-r95","article-title":"Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data","volume":"14","author":"Rapaport","year":"2013","journal-title":"Genome Biol."},{"key":"2023020202414412900_btv425-B19","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2010-11-3-r25","article-title":"A scaling normalization method for differential expression analysis of RNA-seq data","volume":"11","author":"Robinson","year":"2010","journal-title":"Genome Biol."},{"key":"2023020202414412900_btv425-B20","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1093\/biostatistics\/kxm030","article-title":"Small-sample estimation of negative binomial dispersion, with applications to SAGE data","volume":"9","author":"Robinson","year":"2008","journal-title":"Biostatistics"},{"key":"2023020202414412900_btv425-B21","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: a bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020202414412900_btv425-B22","article-title":"Evaluation of differential gene expression tools on a two-condition, 48 replicate RNA-seq experiment in preparation (preprint arXiv:1505.02017)","author":"Schurch","year":"2015"},{"key":"2023020202414412900_btv425-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1027","article-title":"Linear models and empirical bayes methods for assessing differential expression in microarray 
experiments","volume":"3","author":"Smyth","year":"2004","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023020202414412900_btv425-B24","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat. Biotechnol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/22\/3625\/49036194\/bioinformatics_31_22_3625.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/22\/3625\/49036194\/bioinformatics_31_22_3625.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,10]],"date-time":"2024-06-10T04:47:24Z","timestamp":1717994844000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/22\/3625\/240923"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,7,23]]},"references-count":24,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2015,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv425","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,11,15]]},"published":{"date-parts":[[2015,7,23]]}}}