Abstract
Estimating a joint Highest Posterior Density (HPD) credible set for a multivariate posterior density becomes challenging as the dimension grows. Credible intervals for univariate marginals are usually presented instead, for ease of computation and visualisation. There are often two layers of approximation, as we may need to compute a credible set for a target density which is itself only an approximation to the true posterior density. We obtain joint HPD credible sets for the density estimation trees of Li et al. (2016) approximating a density truncated to a compact subset of \(\mathbb {R}^d\), as this is preferred to a copula construction. These trees approximate a joint posterior distribution from posterior samples using a piecewise constant function defined by sequential binary splits. We use a consistent estimator of the measure of the symmetric difference between our credible set estimate and the true HPD set of the target density. This quality measure can be computed without knowing the true set. We show how the true-posterior coverage of an approximate credible set estimated for an approximate target density may be estimated in doubly intractable cases where posterior samples are not available. We illustrate our methods with simulation studies and find that our estimator is competitive with existing methods.
References
Aitkin, M., Wilson, G.T.: Mixture models, outliers, and the EM algorithm. Technometrics 22, 325–331 (1980)
Anderson, A.J., Smith, I.W.G., Higham, T.F.G.: Shag river mouth: the archaeology of an early Southern Maori village. In: Anderson, A.J., Allingham, B., Smith, I.W.G. (Eds.), Shag River Mouth, Volume 27, pp. 61–69. Canberra: Archaeology and Natural History Publications (1996)
Baillo, A., Cuevas, A.: Image estimators based on marked bins. Statistics 40(4), 277–288 (2006)
Baillo, A., Cuevas, A., Justel, A.: Set estimation and nonparametric detection. Can. J. Stat. 28(4), 765–782 (2000)
Beaumont, M.A., Zhang, W., Balding, D.J.: Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002)
Bernardo, J.M.: Intrinsic credible regions: an objective Bayesian approach to interval estimation. TEST 14(2), 317–384 (2005)
Besag, J., Green, P., Higdon, D., Mengersen, K.: Bayesian computation and stochastic systems. Stat. Sci. 10(1), 3–41 (1995)
Blum, M.G.B.: Approximate Bayesian computation: a nonparametric perspective. J. Am. Stat. Assoc. 105, 1178–1187 (2010)
Box, G.E.P., Tiao, G.C.: Multiparameter problems from a Bayesian point of view. Ann. Math. Stat. 36, 1468–1482 (1965)
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterey (1984)
Cadre, B.: Kernel estimation of density level sets. J. Multivar. Anal. 97, 999–1023 (2006)
Chen, M.-H., Shao, Q.-M.: Monte Carlo estimation of Bayesian credible and HPD intervals. J. Comput. Graph. Stat. 8(1), 69–92 (1999)
Chen, Y.-C., Genovese, C.R., Wasserman, L.: Density level sets: asymptotics, inference and visualization. J. Am. Stat. Assoc. 112(520), 1684–1696 (2017)
Chipman, H.A., George, E.I., McCulloch, R.E.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4, 266–298 (2010)
Druilhet, P., Marin, J.-M.: Invariant HPD credible sets and MAP estimators. Bayesian Anal. 2(4), 681–691 (2007)
Durante, D., Rigon, T.: Conditionally conjugate mean-field variational Bayes for logistic models. Stat. Sci. 34, 472–485 (2019)
Held, L.: Simultaneous posterior probability statements from Monte Carlo output. J. Comput. Graph. Stat. 13, 20–35 (2004)
Hogg, A.G., Hua, Q., Blackwell, P.G., Niu, M., Buck, C.E., Guilderson, T.P., Heaton, T.J., Palmer, J.G., Reimer, P.J., Reimer, R.W., Turney, C.S., Zimmerman, S.R.: SHCal13 Southern Hemisphere Calibration, 0–50,000 years cal bp. Radiocarbon 55(4), 1889–1903 (2013)
Jaakkola, T., Jordan, M.I.: Bayesian parameter estimation via variational methods. Stat. Comput. 10, 25–37 (2000)
Jordan, M.I., Ghahramani, Z., Jaakkola, T.S., Saul, L.K.: An introduction to variational methods for graphical models. Mach. Learn. 37, 183–233 (1999)
Klemelä, J.: Visualization of multivariate density estimates with level set trees. J. Comput. Graph. Stat. 13(3), 599–620 (2004)
Krivobokova, T., Kneib, T., Claeskens, G.: Simultaneous confidence bands for penalized spline estimators. J. Am. Stat. Assoc. 105, 852–863 (2010)
Larochelle, H., Murray, I.: The neural autoregressive distribution estimator. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 29–37 (2011)
Lee, J., Nicholls, G.K., Ryder, R.J.: Calibration procedures for approximate Bayesian credible sets. Bayesian Anal. 14(4), 1245–1269 (2019)
Li, D., Yang, K., Wong, W.-H.: Density estimation via discrepancy based adaptive sequential partition. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 29, pp. 1091–1099. Curran Associates Inc (2016)
Liu, L., Li, D., Wong, W.-H.: Convergence rates of a partition based Bayesian multivariate density estimation method. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 30, pp. 4738–4746. Curran Associates Inc (2017)
Lu, L., Jiang, H., Wong, W.-H.: Multivariate density estimation by Bayesian sequential partitioning. J. Am. Stat. Assoc. 108(504), 1402–1410 (2013)
Magdon-Ismail, M., Atiya, A.F.: Neural networks for density estimation. In: Kearns, M.J., Solla, S.A., Cohn, D.A. (Eds.), Advances in Neural Information Processing Systems 11, pp. 522–528 (1999)
Mammen, E., Polonik, W.: Confidence regions for level sets. J. Multivar. Anal. 122, 202–214 (2013)
Marjoram, P.: Approximation Bayesian Computation. OA Genetics 1(3), 853–862 (2013)
Mason, D.M., Polonik, W.: Asymptotic normality of plug-in level set estimates. Ann. Appl. Probab. 19(3), 1108–1142 (2009)
McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn., Volume 8. John Wiley & Sons (2008)
Nicholls, G.K., Jones, M.: Radiocarbon dating with temporal order constraints. J. Roy. Stat. Soc. Ser. C (Appl. Stat.) 50(4), 503–521 (2001)
Niederreiter, H.: Quasi-Monte Carlo methods and pseudo-random numbers. Bull. Am. Math. Soc. 84(6), 957–1038 (1978)
Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. Society for Industrial and Applied Mathematics (1992)
Ormerod, J.T., Wand, M.P.: Explaining variational approximations. Am. Stat. 64, 140–153 (2010)
Owen, A.B.: Multidimensional variation for Quasi-Monte Carlo. In: International Conference on Statistics in honor of Professor Kai-Tai Fang’s 65th birthday, pp. 49–74 (2005)
Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 30, pp. 2338–2347 (2017)
Pearson, K.: Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. Lond. (A) 185, 71–110 (1894)
Postman, M., Huchra, J.P., Geller, M.J.: Probes of large-scale structures in the Corona Borealis region. Astron. J. 92, 1238–1247 (1986)
Ram, P., Gray, A.G.: Density estimation trees. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 627–635 (2011)
Rousseau, J., Robert, C.P.: Discussion on a paper of J. Bernardo: Intrinsic credible regions; an objective Bayesian approach to interval estimation. TEST 14(2), 367–369 (2005)
Roy, D.M., Teh, Y.W.: The Mondrian process. In: Proceedings of the 21st International Conference on Neural Information Processing Systems, NIPS’08, pp. 1377–1384. Curran Associates Inc (2008)
Scott, D.W.: On optimal and data-based histograms. Biometrika 66, 605–610 (1979)
Scott, D.W.: Frequency polygons: theory and application. J. Am. Stat. Assoc. 80, 348–354 (1985)
Scott, D.W., Tapia, R.A., Thompson, J.R.: Kernel density estimation revisited. J. Nonlinear Anal. Theory Methods Appl. 1, 339–372 (1977)
Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, London (1986)
Sklar, A.: Fonctions de répartition à n dimensions et leurs marges, Volume 8. Inst. Statist. Univ. Paris (1959)
Sørbye, S.H., Rue, H.: Simultaneous credible bands for latent Gaussian models. Scand. J. Stat. 38, 712–725 (2011)
Stone, C.J.: The use of polynomial splines and their tensor products in multivariate function estimation. Ann. Stat. 22, 118–171 (1994)
Stuetzle, W., Nugent, R.: A generalized single linkage method for estimating the cluster tree of a density. J. Comput. Graph. Stat. 19(2), 397–418 (2010)
Thulin, M.: Decision-theoretic justifications for Bayesian hypothesis testing using credible sets. J. Stat. Plan. Inference 146, 133–138 (2014)
Tsybakov, A.B.: On nonparametric estimation of density level sets. Ann. Stat. 25, 948–969 (1997)
Wang, X., Wang, Y.: Nonparametric multivariate density estimation using mixtures. Stat. Comput. 25, 349–364 (2015)
Wegmann, D., Leuenberger, C., Excoffier, L.: Efficient approximate Bayesian computation coupled with Markov chain Monte Carlo without likelihood. Genetics 182, 1207–1218 (2009)
Wu, K., Hou, W., Yang, H.: Density estimation via the Random-Forest method. Commun. Stat. Theory Methods 47(4), 877–889 (2018)
Xing, H., Nicholls, G.K., Lee, J.E.: Calibrated approximate Bayesian inference. In: Proceedings of the 36th International Conference on Machine Learning, PMLR, Volume 97, pp. 6912–6920 (2019)
Xing, H., Nicholls, G.K., Lee, J.E.: Distortion estimates for approximate Bayesian inference. In: Peters, J., Sontag, D. (Eds.), Proceedings of Machine Learning Research, Volume 124, pp. 1208–1217. PMLR (2020)
Appendix
The density estimation tree algorithm of Li et al. (2016) and the maximum gap calculation it uses are given in this Appendix.
1.1 Maximum gap calculation
To find a good split for the leaf \(\Delta _k=[\pmb a^{(k)},\pmb b^{(k)}]\) defined in Step 5 of Algorithm 3, given the points \(\{{\tilde{\pmb s}}^{(k,j)}\}_{j=1}^{n_k}\) with \({\tilde{\pmb s}}^{(k,j)}=({\tilde{s}}^{(k,j)}_{1},\ldots ,{\tilde{s}}^{(k,j)}_{d})\), we divide the i-th dimension into \(m_g\) equal-sized bins \([a_{k,i}+(l-1)\delta _{k,i},a_{k,i}+l\delta _{k,i}]\), for \(i=1,\ldots ,d\) and \(l=1,\ldots ,m_g\), where \(a_{k,i}\) and \(b_{k,i}\) are the i-th components of \(\pmb a^{(k)}\) and \(\pmb b^{(k)}\) and \(\delta _{k,i}=(b_{k,i}-a_{k,i})/m_g\). The interior bin boundaries give \((m_g-1)d\) candidate cut positions in total. Each has an associated gap \(h_{l,i}=| (1/n_k) \sum ^{n_k}_{j=1} \mathbbm {1}( {\tilde{s}}^{(k,j)}_i < a_{k,i}+l\delta _{k,i}) -l/m_g | \), for \(l=1,\ldots ,(m_g-1)\) and \(i=1,\ldots ,d\), which is the absolute difference between the fraction of the leaf's points lying below the cut in dimension i and the fraction \(l/m_g\) expected if the points were uniformly distributed over the leaf. The leaf is split along the hyperplane at the cut position with the largest value of \(h_{l,i}\).
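The gap calculation above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only, not the authors' implementation or that of Li et al. (2016): the function name `max_gap_split`, its argument names, and the convention of returning the first maximiser when several gaps tie are all assumptions made for the example.

```python
import numpy as np


def max_gap_split(points, a, b, m_g):
    """Find the cut with the largest gap h_{l,i} for one leaf [a, b].

    points : (n_k, d) array of the points in the leaf
    a, b   : length-d lower and upper corners of the leaf
    m_g    : number of equal-sized bins per dimension
    Returns (i, cut, h): zero-based split dimension, cut position, gap value.
    Ties are broken by taking the first maximiser (an assumption).
    """
    points = np.asarray(points, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    delta = (b - a) / m_g                  # bin widths delta_{k,i}, shape (d,)
    l = np.arange(1, m_g)                  # l = 1, ..., m_g - 1
    # Candidate cut positions a_{k,i} + l * delta_{k,i}, shape (m_g - 1, d)
    cuts = a[None, :] + l[:, None] * delta[None, :]

    # Fraction of points strictly below each cut, shape (m_g - 1, d)
    frac_below = (points[None, :, :] < cuts[:, None, :]).mean(axis=1)

    # Gap: distance from the fraction l / m_g expected under uniformity
    h = np.abs(frac_below - l[:, None] / m_g)

    l_idx, i = np.unravel_index(np.argmax(h), h.shape)
    return int(i), float(cuts[l_idx, i]), float(h[l_idx, i])


# Toy leaf [0, 1]^2: spread out in dimension 0, concentrated near 0 in dimension 1
pts = [[0.1, 0.05], [0.3, 0.10], [0.5, 0.15], [0.7, 0.20], [0.9, 0.05]]
dim, cut, gap = max_gap_split(pts, a=[0.0, 0.0], b=[1.0, 1.0], m_g=4)
print(dim, cut, gap)
```

In the toy leaf every point lies below 0.25 in the second coordinate, so that cut has the largest departure from uniformity and is the one selected. The broadcast comparison builds an array of size \((m_g-1)\times n_k\times d\), which is convenient for small leaves; looping over dimensions would use less memory for large ones.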
Cite this article
Lee, J.E., Nicholls, G.K. Tree based credible set estimation. Stat Comput 31, 69 (2021). https://doi.org/10.1007/s11222-021-10045-3
DOI: https://doi.org/10.1007/s11222-021-10045-3