{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:25Z","timestamp":1740185125237,"version":"3.37.3"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2022,1,10]],"date-time":"2022-01-10T00:00:00Z","timestamp":1641772800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DMS-1903139","DMS-2015411"],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["12071243"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,28]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Polygenic risk score (PRS) has been widely exploited for genetic risk prediction due to its accuracy and conceptual simplicity. We introduce a unified Bayesian regression framework, NeuPred, for PRS construction, which accommodates varying genetic architectures and improves overall prediction accuracy for complex diseases by allowing for a wide class of prior choices. 
To take full advantage of the framework, we propose a summary-statistics-based cross-validation strategy to automatically select suitable chromosome-level priors, which demonstrates a striking variability of the prior preference of each chromosome, for the same complex disease, and further significantly improves the prediction accuracy.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Simulation studies and real data applications with seven disease datasets from the Wellcome Trust Case Control Consortium cohort and eight groups of large-scale genome-wide association studies demonstrate that NeuPred achieves substantial and consistent improvements in terms of predictive r<jats:sup>2<\/jats:sup> over existing methods. In addition, NeuPred has similar or advantageous computational efficiency compared with the state-of-the-art Bayesian methods.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The R package implementing NeuPred is available at https:\/\/github.com\/shuangsong0110\/NeuPred.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac024","type":"journal-article","created":{"date-parts":[[2022,1,9]],"date-time":"2022-01-09T12:06:32Z","timestamp":1641729992000},"page":"1938-1946","source":"Crossref","is-referenced-by-count":2,"title":["A data-adaptive Bayesian regression approach for polygenic risk prediction"],"prefix":"10.1093","volume":"38","author":[{"given":"Shuang","family":"Song","sequence":"first","affiliation":[{"name":"Center for Statistical Science, Tsinghua University , Beijing 100084, China"},{"name":"School of Life Sciences, Department of Industrial Engineering, Tsinghua University , Beijing 100084, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4283-8501","authenticated-orcid":false,"given":"Lin","family":"Hou","sequence":"additional","affiliation":[{"name":"Center for Statistical Science, Tsinghua University , Beijing 100084, 
China"},{"name":"School of Life Sciences, Department of Industrial Engineering, Tsinghua University , Beijing 100084, China"},{"name":"MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University , Beijing 100084, China"}]},{"given":"Jun S","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Statistics, Harvard University , Cambridge, MA 02138, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,1,10]]},"reference":[{"key":"2023020108591410900_btac024-B1","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1038\/nature09410","article-title":"Hundreds of variants clustered in genomic loci and biological pathways affect human height","volume":"467","author":"Allen","year":"2010","journal-title":"Nature"},{"key":"2023020108591410900_btac024-B2","doi-asserted-by":"crossref","first-page":"716","DOI":"10.1214\/aos\/1176345068","article-title":"A robust generalized Bayes estimator and confidence region for a multivariate normal mean","volume":"8","author":"Berger","year":"1980","journal-title":"Ann. Stat"},{"key":"2023020108591410900_btac024-B3","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1093\/bioinformatics\/btv546","article-title":"Approximately independent linkage disequilibrium blocks in human populations","volume":"32","author":"Berisa","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020108591410900_btac024-B4","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1038\/ng.3211","article-title":"LD score regression distinguishes confounding from polygenicity in genome-wide association studies","volume":"47","author":"Bulik-Sullivan","year":"2015","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B5","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/s13742-015-0047-8","article-title":"Second-generation PLINK: rising to the challenge of larger and richer datasets","volume":"4","author":"Chang","year":"2015","journal-title":"Gigascience"},{"key":"2023020108591410900_btac024-B6","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1038\/ng.2579","article-title":"Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies","volume":"45","author":"Chatterjee","year":"2013","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B7","first-page":"1","article-title":"A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information","volume":"116","author":"Chen","year":"2020","journal-title":"J. Am. Stat. Assoc"},{"key":"2023020108591410900_btac024-B8","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1038\/nature08185","article-title":"Common polygenic variation contributes to risk of schizophrenia and bipolar disorder","volume":"460","author":"Consortium","year":"2009","journal-title":"Nature"},{"key":"2023020108591410900_btac024-B9","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1038\/ng.543","article-title":"Multiple common variants for celiac disease influencing immune gene expression","volume":"42","author":"Dubois","year":"2010","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B10","doi-asserted-by":"crossref","first-page":"e1003348","DOI":"10.1371\/journal.pgen.1003348","article-title":"Power and predictive accuracy of polygenic risk scores","volume":"9","author":"Dudbridge","year":"2013","journal-title":"PLoS Genet"},{"key":"2023020108591410900_btac024-B12","doi-asserted-by":"crossref","first-page":"1118","DOI":"10.1038\/ng.717","article-title":"Genome-wide meta-analysis increases to 71 the number of confirmed Crohn\u2019s disease susceptibility loci","volume":"42","author":"Franke","year":"2010","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B13","doi-asserted-by":"crossref","first-page":"e101428","DOI":"10.1371\/journal.pone.0101428","article-title":"Genome-wide association study of celiac disease in North America confirms FRMD4B as new celiac locus","volume":"9","author":"Garner","year":"2014","journal-title":"PLoS One"},{"key":"2023020108591410900_btac024-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-09718-5","article-title":"Polygenic prediction via Bayesian regression and continuous shrinkage priors","volume":"10","author":"Ge","year":"2019","journal-title":"Nat. 
Commun"},{"key":"2023020108591410900_btac024-B15","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1186\/1471-2105-12-186","article-title":"Extension of the Bayesian alphabet for genomic selection","volume":"12","author":"Habier","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023020108591410900_btac024-B16","doi-asserted-by":"crossref","first-page":"e1006836","DOI":"10.1371\/journal.pgen.1006836","article-title":"Joint modeling of genetically correlated diseases and functional annotations increases accuracy of polygenic risk prediction","volume":"13","author":"Hu","year":"2017","journal-title":"PLoS Genet"},{"key":"2023020108591410900_btac024-B17","doi-asserted-by":"crossref","first-page":"e1005589","DOI":"10.1371\/journal.pcbi.1005589","article-title":"Leveraging functional annotations in genetic risk prediction for human complex diseases","volume":"13","author":"Hu","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023020108591410900_btac024-B18","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1080\/00031305.2020.1816213","article-title":"A set of efficient methods to generate high-dimensional binary data with specified correlation structures","volume":"75","author":"Jiang","year":"2021","journal-title":"Am. Stat"},{"key":"2023020108591410900_btac024-B19","doi-asserted-by":"crossref","first-page":"R182","DOI":"10.1093\/hmg\/ddr378","article-title":"Genetic risk prediction in complex disease","volume":"20","author":"Jostins","year":"2011","journal-title":"Hum. Mol. Genet"},{"key":"2023020108591410900_btac024-B20","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1016\/j.jmva.2015.04.006","article-title":"Spectrum estimation: a unified framework for covariance matrix estimation and PCA in large dimensions","volume":"139","author":"Ledoit","year":"2015","journal-title":"J. Multivar. 
Anal"},{"key":"2023020108591410900_btac024-B21","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1016\/j.csda.2017.06.004","article-title":"Numerical implementation of the QuEST function","volume":"115","author":"Ledoit","year":"2017","journal-title":"Comput. Stat. Data Anal"},{"key":"2023020108591410900_btac024-B22","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1093\/biomet\/87.2.353","article-title":"Generalised gibbs sampler and multigrid Monte Carlo for Bayesian computation","volume":"87","author":"Liu","year":"2000","journal-title":"Biometrika"},{"key":"2023020108591410900_btac024-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-12653-0","article-title":"Improved polygenic prediction by Bayesian multiple regression on summary statistics","volume":"10","author":"Lloyd-Jones","year":"2019","journal-title":"Nat. Commun"},{"key":"2023020108591410900_btac024-B24","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1002\/gepi.22050","article-title":"Polygenic scores via penalized regression on summary statistics","volume":"41","author":"Mak","year":"2017","journal-title":"Genet. Epidemiol"},{"key":"2023020108591410900_btac024-B25","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1056\/NEJMoa0906312","article-title":"A large-scale, consortium-based genomewide association study of asthma","volume":"363","author":"Moffatt","year":"2010","journal-title":"N. Engl. J. Med"},{"key":"2023020108591410900_btac024-B26","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1038\/ng.2383","article-title":"Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes","volume":"44","author":"Morris","year":"2012","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B27","doi-asserted-by":"crossref","first-page":"e1004969","DOI":"10.1371\/journal.pgen.1004969","article-title":"Simultaneous discovery, estimation and prediction analysis of complex traits using a Bayesian mixture model","volume":"11","author":"Moser","year":"2015","journal-title":"PLoS Genet"},{"year":"2020","key":"2023020108591410900_btac024-B28"},{"key":"2023020108591410900_btac024-B29","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1038\/ng.610","article-title":"Estimation of effect size distribution from genome-wide association studies and implications for future discoveries","volume":"42","author":"Park","year":"2010","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B30","doi-asserted-by":"crossref","first-page":"e1008271","DOI":"10.1371\/journal.pcbi.1008271","article-title":"Penalized regression and model selection methods for polygenic scores on summary statistics","volume":"16","author":"Pattee","year":"2020","journal-title":"PLoS Comput. Biol"},{"key":"2023020108591410900_btac024-B31","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.2307\/2531733","article-title":"Correlated binary regression with covariates specific to each binary observation","volume":"44","author":"Prentice","year":"1988","journal-title":"Biometrics"},{"key":"2023020108591410900_btac024-B32","doi-asserted-by":"crossref","first-page":"5424","DOI":"10.1093\/bioinformatics\/btaa1029","article-title":"LDpred2: better, faster, stronger","volume":"36","author":"Priv\u00e9","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108591410900_btac024-B33","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1038\/ng.940","article-title":"Genome-wide association study identifies five new schizophrenia loci","volume":"43","author":"Ripke","year":"2011","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B34","first-page":"1","article-title":"Neuronized priors for Bayesian sparse linear regression","author":"Shin","year":"2021","journal-title":"J. Am. Stat. Assoc"},{"key":"2023020108591410900_btac024-B35","doi-asserted-by":"crossref","first-page":"e1007565","DOI":"10.1371\/journal.pcbi.1007565","article-title":"Leveraging effect size distributions to improve polygenic risk scores derived from summary statistics of genome-wide association studies","volume":"16","author":"Song","year":"2020","journal-title":"PLoS Comput. Biol"},{"key":"2023020108591410900_btac024-B36","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1038\/ng.582","article-title":"Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci","volume":"42","author":"Stahl","year":"2010","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B37","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.jclinepi.2019.09.016","article-title":"Validation of clinical prediction models: what does the \u201ccalibration slope\u201d really measure?","volume":"118","author":"Stevens","year":"2020","journal-title":"J. Clin. Epidemiol"},{"key":"2023020108591410900_btac024-B38","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1214\/aoms\/1177693528","article-title":"Proper Bayes minimax estimators of the multivariate normal mean","volume":"42","author":"Strawderman","year":"1971","journal-title":"Ann. Math. Stat"},{"key":"2023020108591410900_btac024-B39","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1038\/s41588-017-0011-x","article-title":"Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity","volume":"50","author":"Turcot","year":"2018","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B40","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1038\/s41588-017-0009-4","article-title":"Multi-trait analysis of genome-wide association summary statistics using MTAG","volume":"50","author":"Turley","year":"2018","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B41","doi-asserted-by":"crossref","first-page":"1296","DOI":"10.1016\/j.jclinepi.2013.06.003","article-title":"Calibration of clinical prediction rules does not just assess bias","volume":"66","author":"Vach","year":"2013","journal-title":"J. Clin. Epidemiol"},{"key":"2023020108591410900_btac024-B42","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.ajhg.2015.09.001","article-title":"Modeling linkage disequilibrium increases accuracy of polygenic risk scores","volume":"97","author":"Vilhj\u00e1lmsson","year":"2015","journal-title":"Am. J. Hum. Genet"},{"key":"2023020108591410900_btac024-B43","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.jclinepi.2020.06.002","article-title":"Calibration slope versus discrimination slope: shoes on the wrong feet","volume":"125","author":"Wang","year":"2020","journal-title":"J. Clin. Epidemiol"},{"key":"2023020108591410900_btac024-B44","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","year":"2007","journal-title":"Nature"},{"key":"2023020108591410900_btac024-B45","doi-asserted-by":"crossref","first-page":"1173","DOI":"10.1038\/ng.3097","article-title":"Defining the role of common variation in the genomic and biological architecture of adult human height","volume":"46","author":"Wood","year":"2014","journal-title":"Nat. 
Genet"},{"key":"2023020108591410900_btac024-B46","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1016\/j.ajhg.2020.03.013","article-title":"Accurate and scalable construction of polygenic scores in large biobank data sets","volume":"106","author":"Yang","year":"2020","journal-title":"Am. J. Hum. Genet"},{"key":"2023020108591410900_btac024-B47","doi-asserted-by":"crossref","first-page":"1318","DOI":"10.1038\/s41588-018-0193-x","article-title":"Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits","volume":"50","author":"Zhang","year":"2018","journal-title":"Nat. Genet"},{"key":"2023020108591410900_btac024-B48","doi-asserted-by":"crossref","first-page":"e1009697","DOI":"10.1371\/journal.pgen.1009697","article-title":"A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics","volume":"17","author":"Zhou","year":"2021","journal-title":"PLoS Genet"},{"key":"2023020108591410900_btac024-B49","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1214\/17-AOAS1046","article-title":"Bayesian large-scale multiple regression with summary statistics from genome-wide association studies","volume":"11","author":"Zhu","year":"2017","journal-title":"Ann. Appl. 
Stat"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac024\/42534412\/btac024.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/7\/1938\/49008929\/btac024.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/7\/1938\/49008929\/btac024.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,15]],"date-time":"2023-11-15T15:10:23Z","timestamp":1700061023000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/7\/1938\/6502278"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,1,10]]},"references-count":48,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2022,3,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac024","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2022,4,1]]},"published":{"date-parts":[[2022,1,10]]}}}