Abstract
This paper addresses the problem of estimating a density, with either a compact support or a support bounded at only one end, exploiting a general and natural form of a finite mixture of distributions. Due to the importance of the concept of multimodality in the mixture framework, unimodal beta and gamma densities are used as mixture components, leading to a flexible modeling approach. Accordingly, a mode-based parameterization of the components is provided. A partitional clustering method, named \(k\)-bumps, is also proposed; it is used as an ad hoc initialization strategy in the EM algorithm to obtain the maximum likelihood estimates of the mixture parameters. The performance of the \(k\)-bumps algorithm as an initialization tool, in comparison with other common initialization strategies, is evaluated through simulation experiments. Finally, two real applications are presented.
Notes
Downloadable from http://www.humanfertility.org/cgi-bin/main.php.
References
Altman E, Resti A, Sironi A (2005) Loss given default: a review of the literature. In: Altman E, Resti A, Sironi A (eds) The next challenge in credit risk management. Riskbooks, London
Banca d’Italia (2001) Principali Risultati della Rilevazione sull’Attività di Recupero dei Crediti. Bollettino di Vigilanza 12
Basel Committee on Banking Supervision (2004) International capital measurement and capital standards: a revised framework. Bank for International Settlements, Basel
Behboodian J (1970) On the modes of a mixture of two normal distributions. Technometrics 12(1):131–139
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561–575
Brazier S, Sparks RSJ, Carey SN, Sigurdsson H, Westgate JA (1983) Bimodal grain size distribution and secondary thickening in air-fall ash layers. Nature 301:115–119
Bruche M, González-Aguado C (2010) Recovery rates, default probabilities, and the credit cycle. J Banking Financ 34(4):713–723
Calabrese R, Zenga M (2008) Measuring loan recovery rate: methodology and empirical evidence. Stat Appl VI(2):193–214
Calabrese R, Zenga M (2010) Bank loan recovery rates: measuring and nonparametric density estimation. J Banking Financ 34(5):903–911
Celeux G, Govaert G (1992) A classification EM algorithm for clustering and two stochastic versions. Comput Stat Data Anal 14(3):315–332
Chen S (1999) Beta kernel estimators for density functions. Comput Stat Data Anal 31(2):131–145
Chen S (2000) Probability density function estimation using gamma kernels. Ann Inst Stat Math 52(3):471–480
Coale A (1971) Age patterns of marriage. Pop Stud 25(2):193–214
Congdon P (1993) Statistical graduation in local demographic analysis and projection. J R Stat Soc Ser A Stat Soc 156(2):237–270
Cox D (1966) Notes on the analysis of mixed frequency distributions. Br J Math Stat Psychol 19(1):39–47
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J R Stat Soc Ser B Methodol 39(1):1–38
Diebolt J, Ip E (1996) Stochastic EM: method and application. In: Gilks W, Richardson S, Spiegelhalter D (eds) Markov chain Monte Carlo in practice, chap 15. Chapman and Hall, London, pp 259–273
Dye JL (2008) Fertility of American women, 2006. Current Population Reports, US Census Bureau 20(558)
Eisenberger I (1964) Genesis of bimodal distributions. Technometrics 6(4):357–363
Elderton WP, Johnson NL (1969) Systems of frequency curves. Cambridge University Press, Cambridge
Everitt B, Hand DJ (1981) Finite mixture distributions. Chapman and Hall, London
Ghosal S (2001) Convergence rates for density estimation with Bernstein polynomials. Ann Stat 29(5):1264–1280
Gupton G, Stein R (2002) LossCalc: Moody’s model for predicting loss given default (LGD). Moody’s Investors Service, New York
Gupton G, Finger C, Bhatia M (1997) CreditMetrics—technical document. J. P. Morgan and Co, New York
Huang Z (1998) Extensions to the \(k\)-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304
Izenman AJ (2008) Modern multivariate statistical techniques: regression, classification, and manifold learning. Springer, New York
Ji Y, Wu C, Liu P, Wang J, Coombes K (2005) Applications of beta-mixture models in bioinformatics. Bioinformatics 21(9):2118–2122
Johnson NL, Kotz S (1970a) Continuous univariate distributions, vol 1. Wiley, New York
Johnson NL, Kotz S (1970b) Continuous univariate distributions, vol 2. Wiley, New York
Jordan MI, Xu L (1995) Convergence results for the EM approach to mixtures of experts architectures. Neural Netw 8(9):1409–1431
Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis, vol 39. Wiley, New York
Kendall MG, Stuart A (1958) The advanced theory of statistics, vol 1. Charles Griffin and Company Limited, London
Lee S, Sheldon Lin X (2010) Modeling and evaluating insurance losses via mixtures of Erlang distributions. N Am Actuar J 14(1):107–130
Leisch F (2004) FlexMix: a general framework for finite mixture models and latent class regression in R. J Stat Softw 11(8):1–18
Lindsay B (1995) Mixture models: theory, geometry and applications. In: NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward
Martin JA, Hamilton BE, Sutton PD, Ventura SJ, Menacker F, Kirmeyer S, Mathews T (2009) Births: final data for 2006. Natl Vital Stat Rep 57(7):1–104
Maulik U, Bandyopadhyay S, Mukhopadhyay A (2011) Multiobjective genetic algorithm-based fuzzy clustering: applications in data mining and bioinformatics. Springer, Berlin
Mayrose I, Friedman N, Pupko T (2005) A gamma mixture model better accounts for among site rate heterogeneity. Bioinformatics 21(2):151–158
Mazza A, Punzo A (2011) Discrete beta kernel graduation of age-specific demographic indicators. In: Ingrassia S, Rocci R, Vichi M (eds) New perspectives in statistical modeling and data analysis (Studies in classification, data analysis and knowledge organization), vol 42. Springer, Berlin, pp 127–134
Mazza A, Punzo A (2013a) Graduation by adaptive discrete beta kernels. In: Giusti A, Ritter G, Vichi M (eds) Classification and data mining (Studies in classification, data analysis and knowledge organization), vol 44. Springer, Berlin, pp 77–84
Mazza A, Punzo A (2013b) Using the variation coefficient for adaptive discrete beta kernel graduation. In: Giudici P, Ingrassia S, Vichi M (eds) Studies in classification, data analysis and knowledge organization. Springer, Berlin (in press)
McLachlan G, Krishnan T (2007) The EM algorithm and extensions. Wiley, New York
McLachlan GJ, Basford KE (1988) Mixture models—inference and applications to clustering. Marcel Dekker, New York
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York
Meilă M, Heckerman D (2001) An experimental comparison of model-based clustering methods. Mach Learn 42(1):9–29
Murphy EA (1964) One cause? Many causes? The argument from the bimodal distribution. J Chronic Dis 17(4):301–324
Pearson K (1902a) On the systematic fitting of curves to observations and measurements. Biometrika 1(3):265–303
Pearson K (1902b) On the systematic fitting of curves to observations and measurements: part II. Biometrika 2(1):1–23
Petrone S (1999a) Bayesian density estimation using Bernstein polynomials. Can J Stat 27(1):105–126
Petrone S (1999b) Random Bernstein polynomials. Scand J Stat 26(3):373–393
Punzo A (2010) Discrete beta-type models. In: Locarek-Junge H, Weihs C (eds) Classification as a tool for research (Studies in classification, data analysis and knowledge organization), vol 40. Springer, Berlin, pp 253–261
Punzo A, Zini A (2012) Discrete approximations of continuous and mixed measures on a compact interval. Stat Pap 53(3):563–575
Ray S, Lindsay B (2005) The topography of multivariate normal mixtures. Ann Stat 33(5):2042–2065
R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/, ISBN 3-900051-07-0
Redner RA, Walker HF (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26(2):195–239
Robertson C, Fryer J (1969) Some descriptive properties of normal mixtures. Skand Aktuarietidskr 52:137–146
Rogers A (1986) Parameterized multistate population dynamics and projections. J Am Stat Assoc 81(393):48–61
Scharl T, Grün B, Leisch F (2010) Mixtures of regression models for time course gene expression data: evaluation of initialization and random effects. Bioinformatics 26(3):370–377
Schilling M, Watkins A, Watkins W (2002) Is human height bimodal? Am Stat 56(3):223–229
Silverman B (1981) Using kernel density estimates to investigate multimodality. J R Stat Soc Ser B Methodol 43:97–99
Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, New York
Wessels J (1964) Multimodality in a family of probability densities, with application to a linear mixture of two normal densities. Statistica Neerlandica 18(3):267–282
Wiper M, Insua DR, Ruggeri F (2001) Mixtures of gamma distributions with applications. J Comput Graph Stat 10(3):440–454
Appendices
Parameterization genesis
If a density function \(f\) is chosen to belong to the Pearson system, then it is the solution of the differential equation
\[\frac{1}{f}\frac{df}{dx}=\frac{m-x}{c_0+c_1x+c_2x^2}. \tag{18}\]
It is clear that some of the solutions to (18) have a single mode (\(df/dx=0\) at \(x=m\)) and smooth contact with the horizontal axis (\(df/dx=0\) when \(f\left(x\right)=0\)).
The shape of \(f\) depends on the set of parameters \(\left(m,c_0,c_1,c_2\right)\), and the form of the solution of (18) evidently depends on the nature of the roots of the equation
\[c_0+c_1x+c_2x^2=0. \tag{19}\]
The classes of unimodal gamma and beta densities, illustrated in Sect. 2, arise from a convenient choice of these roots.
1.1 Unimodal gamma densities
Consider \(c_2=0\) and \(c_1>0\). Thus, Eq. (18) becomes
in which
In order to make (21) a density function,
so that the result is a gamma distribution (see Johnson and Kotz 1970a, Chapter 17). Equation (2) is obtained by setting \(-c_0/c_1=a\) and \(c_1=v\).
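The integration behind this step can be sketched as follows. The sketch takes the Pearson equation (18) in the form \(f'/f=(m-x)/(c_0+c_1x+c_2x^2)\), which is an assumption consistent with the mode condition \(df/dx=0\) at \(x=m\) and with \(c_1>0\). The normalizing constant \(K\) is notation introduced here only for illustration.

```latex
\begin{align*}
\frac{d\log f}{dx} &= \frac{m-x}{c_0+c_1x} = \frac{m-x}{v\,(x-a)}
  = -\frac{1}{v}+\frac{m-a}{v\,(x-a)},
  \qquad a=-\frac{c_0}{c_1},\quad v=c_1,\\
f(x) &= K\,(x-a)^{\frac{m-a}{v}}\exp\!\left(-\frac{x-a}{v}\right),
  \qquad x>a,\\
K^{-1} &= v^{\frac{m-a}{v}+1}\,\Gamma\!\left(\frac{m-a}{v}+1\right).
\end{align*}
```

Under these assumptions the result is a gamma density with shape \((m-a)/v+1\) and scale \(v\), shifted to start at \(a\). For \(m>a\) and \(v>0\) its mode is \(a+\left[(m-a)/v\right]v=m\), as the mode-based parameterization requires.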
1.2 Unimodal beta densities
Suppose that both the roots of (19) are real. Denoting these roots as \(a\) and \(b\), with \(a<b\), it follows that
consequently, Eq. (18) becomes
in which
In order to make (23) a density function,
so that the result is a beta distribution (see Johnson and Kotz 1970b, Chapter 24). If a unimodal beta density must be considered, it is necessary that \(c_2\le 0\). Equation (6) is obtained by setting \(-c_2=v\) in (23).
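The corresponding integration for the beta case can be sketched in the same way. As before, the sketch assumes that (18) reads \(f'/f=(m-x)/(c_0+c_1x+c_2x^2)\), and \(K\) is an illustrative normalizing constant rather than the paper's notation.

```latex
\begin{align*}
c_0+c_1x+c_2x^2 &= c_2\,(x-a)(x-b) = v\,(x-a)(b-x), \qquad v=-c_2,\\
\frac{d\log f}{dx} &= \frac{m-x}{v\,(x-a)(b-x)}
  = \frac{1}{v\,(b-a)}\left[\frac{m-a}{x-a}-\frac{b-m}{b-x}\right],\\
f(x) &= K\,(x-a)^{\frac{m-a}{v(b-a)}}\,(b-x)^{\frac{b-m}{v(b-a)}},
  \qquad a<x<b,\\
K^{-1} &= (b-a)^{\frac{1}{v}+1}\,
  B\!\left(\frac{m-a}{v(b-a)}+1,\ \frac{b-m}{v(b-a)}+1\right).
\end{align*}
```

For \(a<m<b\) and \(v>0\) both exponents are positive, so the density vanishes at the two endpoints and has its single mode at \(x=m\). This is why a negative \(c_2\) is needed for unimodality.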
Details on the EM algorithm
Here we make explicit the derivatives in (14) for both gamma and beta densities parameterized according to (2) and (6), respectively. We recall that the resulting ML estimates do not have a closed-form expression and can only be computed numerically, by means of an iterative algorithm; suitable numerical routines are available in most mathematical and statistical software, such as Mathematica and R.
In detail, for the gamma density in (2) we have
and
where \(\psi \left(\cdot \right)\) is the digamma function. In the same way, for the beta density in (6) we have
and
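Derivatives of this kind can be checked numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes the standard supports, that is, a gamma component proportional to \(x^{m/v}e^{-x/v}\) on \((0,\infty)\) and a beta component proportional to \(x^{m/v}(1-x)^{(1-m)/v}\) on \((0,1)\), which may differ from the exact form of (2) and (6). All function names are hypothetical, and the digamma function is approximated by hand because the Python standard library does not provide one.

```python
import math


def digamma(x: float) -> float:
    """Approximate digamma for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift the argument up: psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Truncated asymptotic expansion; error is roughly 1e-8 or less for x >= 6.
    return result + math.log(x) - 0.5 / x - inv2 * (
        1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))


def gamma_logpdf(x, m, v):
    """Log-density of the assumed mode-parameterized gamma on (0, inf)."""
    k = m / v + 1.0  # shape; the scale is v, so the mode is (k - 1) * v = m
    return (k - 1.0) * math.log(x) - x / v - k * math.log(v) - math.lgamma(k)


def gamma_score(x, m, v):
    """Analytic derivatives of gamma_logpdf with respect to m and v."""
    core = math.log(x) - math.log(v) - digamma(m / v + 1.0)
    return core / v, (x - m - v - m * core) / v**2


def beta_logpdf(x, m, v):
    """Log-density of the assumed mode-parameterized beta on (0, 1)."""
    alpha, beta = m / v + 1.0, (1.0 - m) / v + 1.0
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log1p(-x) - log_b


def beta_score(x, m, v):
    """Analytic derivatives of beta_logpdf with respect to m and v."""
    alpha, beta = m / v + 1.0, (1.0 - m) / v + 1.0
    pa, pb, pab = digamma(alpha), digamma(beta), digamma(alpha + beta)
    lx, l1x = math.log(x), math.log1p(-x)
    d_m = (lx - l1x - pa + pb) / v
    d_v = (m * (pa - lx) + (1.0 - m) * (pb - l1x) - pab) / v**2
    return d_m, d_v


def numeric_score(logpdf, x, m, v, h=1e-6):
    """Central finite differences, used only to check the analytic forms."""
    d_m = (logpdf(x, m + h, v) - logpdf(x, m - h, v)) / (2.0 * h)
    d_v = (logpdf(x, m, v + h) - logpdf(x, m, v - h)) / (2.0 * h)
    return d_m, d_v


if __name__ == "__main__":
    print(gamma_score(2.5, 3.0, 0.8), numeric_score(gamma_logpdf, 2.5, 3.0, 0.8))
    print(beta_score(0.3, 0.6, 0.2), numeric_score(beta_logpdf, 0.3, 0.6, 0.2))
```

In an M-step, these per-observation derivatives would be weighted by the posterior component memberships and summed before being passed to a numerical root finder or optimizer; that weighting is omitted here to keep the sketch short.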
Cite this article
Bagnato, L., Punzo, A. Finite mixtures of unimodal beta and gamma densities and the \(k\)-bumps algorithm. Comput Stat 28, 1571–1597 (2013). https://doi.org/10.1007/s00180-012-0367-4
DOI: https://doi.org/10.1007/s00180-012-0367-4