Finite mixtures of unimodal beta and gamma densities and the \(k\)-bumps algorithm

  • Original Paper
Computational Statistics

Abstract

This paper addresses the problem of estimating a density whose support is either compact or bounded at only one end, exploiting a general and natural form of a finite mixture of distributions. Because multimodality is a central concept in the mixture framework, unimodal beta and gamma densities are used as mixture components, leading to a flexible modeling approach. Accordingly, a mode-based parameterization of the components is provided. A partitional clustering method, named \(k\)-bumps, is also proposed; it is used as an ad hoc initialization strategy for the EM algorithm that computes the maximum likelihood estimates of the mixture parameters. The performance of the \(k\)-bumps algorithm as an initialization tool is compared with other common initialization strategies through simulation experiments. Finally, two real applications are presented.

Notes

  1. Downloadable from http://www.humanfertility.org/cgi-bin/main.php.

References

  • Altman E, Resti A, Sironi A (2005) Loss given default: a review of the literature. In: Altman E, Resti A, Sironi A (eds) The next challenge in credit risk management. Riskbooks, London

  • Banca d’Italia (2001) Principali Risultati della Rilevazione sull’Attività di Recupero dei Crediti. Bollettino di Vigilanza 12

  • Basel Committee on Banking Supervision (2004) International capital measurement and capital standards: a revised framework. Bank for International Settlements, Basel

  • Behboodian J (1970) On the modes of a mixture of two normal distributions. Technometrics 12(1):131–139

  • Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561–575

  • Brazier S, Sparks RSJ, Carey SN, Sigurdsson H, Westgate JA (1983) Bimodal grain size distribution and secondary thickening in air-fall ash layers. Nature 301:115–119

  • Bruche M, González-Aguado C (2010) Recovery rates, default probabilities, and the credit cycle. J Banking Financ 34(4):713–723

  • Calabrese R, Zenga M (2008) Measuring loan recovery rate: methodology and empirical evidence. Stat Appl VI(2):193–214

  • Calabrese R, Zenga M (2010) Bank loan recovery rates: measuring and nonparametric density estimation. J Banking Financ 34(5):903–911

  • Celeux G, Govaert G (1992) A classification EM algorithm for clustering and two stochastic versions. Comput Stat Data Anal 14(3):315–332

  • Chen S (1999) Beta kernel estimators for density functions. Comput Stat Data Anal 31(2):131–145

  • Chen S (2000) Probability density function estimation using gamma kernels. Ann Inst Stat Math 52(3):471–480

  • Coale A (1971) Age patterns of marriage. Pop Stud 25(2):193–214

  • Congdon P (1993) Statistical graduation in local demographic analysis and projection. J R Stat Soc Ser A Stat Soc 156(2):237–270

  • Cox D (1966) Notes on the analysis of mixed frequency distributions. Br J Math Stat Psychol 19(1):39–47

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J R Stat Soc Ser B Methodol 39(1):1–38

  • Diebolt J, Ip E (1996) Stochastic EM: method and application. In: Gilks W, Richardson S, Spiegelhalter D (eds) Markov chain Monte Carlo in practice, chap 15. Chapman and Hall, London, pp 259–273

  • Dye JL (2008) Fertility of American women, 2006. Current Population Reports, US Census Bureau 20(558)

  • Eisenberger I (1964) Genesis of bimodal distributions. Technometrics 6(4):357–363

  • Elderton WP, Johnson NL (1969) Systems of frequency curves. Cambridge University Press, Cambridge

  • Everitt B, Hand DJ (1981) Finite mixture distributions. Chapman and Hall, London

  • Ghosal S (2001) Convergence rates for density estimation with Bernstein polynomials. Ann Stat 29(5):1264–1280

  • Gupton G, Stein R (2002) LossCalc: Moody’s model for predicting loss given default (LGD). Moody’s Investors Service, New York

  • Gupton G, Finger C, Bhatia M (1997) CreditMetrics—technical document. J. P. Morgan and Co, New York

  • Huang Z (1998) Extensions to the \(k\)-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304

  • Izenman AJ (2008) Modern multivariate statistical techniques: regression, classification, and manifold learning. Springer, New York

  • Ji Y, Wu C, Liu P, Wang J, Coombes K (2005) Applications of beta-mixture models in bioinformatics. Bioinformatics 21(9):2118–2122

  • Johnson NL, Kotz S (1970a) Continuous univariate distributions, vol 1. Wiley, New York

  • Johnson NL, Kotz S (1970b) Continuous univariate distributions, vol 2. Wiley, New York

  • Jordan MI, Xu L (1995) Convergence results for the EM approach to mixtures of experts architectures. Neural Netw 8(9):1409–1431

  • Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis, vol 39. Wiley, New York

  • Kendall MG, Stuart A (1958) The advanced theory of statistics, vol 1. Charles Griffin and Company Limited, London

  • Lee S, Sheldon Lin X (2010) Modeling and evaluating insurance losses via mixtures of Erlang distributions. N Am Actuar J 14(1):107–130

  • Leisch F (2004) FlexMix: a general framework for finite mixture models and latent class regression in R. J Stat Softw 11(8):1–18

  • Lindsay B (1995) Mixture models: theory, geometry and applications. In: NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward

  • Martin JA, Hamilton BE, Sutton PD, Ventura SJ, Menacker F, Kirmeyer S, Mathews T (2009) Births: final data for 2006. Natl Vital Stat Rep 57(7):1–104

  • Maulik U, Bandyopadhyay S, Mukhopadhyay A (2011) Multiobjective genetic algorithm-based fuzzy clustering: applications in data mining and bioinformatics. Springer, Berlin

  • Mayrose I, Friedman N, Pupko T (2005) A gamma mixture model better accounts for among site rate heterogeneity. Bioinformatics 21(2):151–158

  • Mazza A, Punzo A (2011) Discrete beta kernel graduation of age-specific demographic indicators. In: Ingrassia S, Rocci R, Vichi M (eds) New perspectives in statistical modeling and data analysis (Studies in classification, data analysis and knowledge organization), vol 42. Springer, Berlin, pp 127–134

  • Mazza A, Punzo A (2013a) Graduation by adaptive discrete beta kernels. In: Giusti A, Ritter G, Vichi M (eds) Classification and data mining (Studies in classification, data analysis and knowledge organization), vol 44. Springer, Berlin, pp 77–84

  • Mazza A, Punzo A (2013b) Using the variation coefficient for adaptive discrete beta kernel graduation. In: Giudici P, Ingrassia S, Vichi M (eds) Studies in classification, data analysis and knowledge organization. Springer, Berlin (in press)

  • McLachlan G, Krishnan T (2007) The EM algorithm and extensions. Wiley, New York

  • McLachlan GJ, Basford KE (1988) Mixture models—inference and applications to clustering. Marcel Dekker, New York

  • McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York

  • Meilă M, Heckerman D (2001) An experimental comparison of model-based clustering methods. Mach Learn 42(1):9–29

  • Murphy EA (1964) One cause? Many causes? the argument from the bimodal distribution. J Chronic Dis 17(4):301–324

  • Pearson K (1902a) On the systematic fitting of curves to observations and measurements. Biometrika 1(3):265–303

  • Pearson K (1902b) On the systematic fitting of curves to observations and measurements: part II. Biometrika 2(1):1–23

  • Petrone S (1999a) Bayesian density estimation using Bernstein polynomials. Can J Stat 27(1):105–126

  • Petrone S (1999b) Random Bernstein polynomials. Scand J Stat 26(3):373–393

  • Punzo A (2010) Discrete beta-type models. In: Locarek-Junge H, Weihs C (eds) Classification as a tool for research (Studies in classification, data analysis and knowledge organization), vol 40. Springer, Berlin, pp 253–261

  • Punzo A, Zini A (2012) Discrete approximations of continuous and mixed measures on a compact interval. Stat Pap 53(3):563–575

  • Ray S, Lindsay B (2005) The topography of multivariate normal mixtures. Ann Stat 33(5):2042–2065

  • R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/, ISBN 3-900051-07-0

  • Redner RA, Walker HF (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26(2):195–239

  • Robertson C, Fryer J (1969) Some descriptive properties of normal mixtures. Skand Aktuarietidskr 52:137–146

  • Rogers A (1986) Parameterized multistate population dynamics and projections. J Am Stat Assoc 81(393):48–61

  • Scharl T, Grün B, Leisch F (2010) Mixtures of regression models for time course gene expression data: evaluation of initialization and random effects. Bioinformatics 26(3):370–377

  • Schilling M, Watkins A, Watkins W (2002) Is human height bimodal? Am Stat 56(3):223–229

  • Silverman B (1981) Using kernel density estimates to investigate multimodality. J R Stat Soc Ser B Methodol 43:97–99

  • Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, New York

  • Wessels J (1964) Multimodality in a family of probability densities, with application to a linear mixture of two normal densities. Statistica Neerlandica 18(3):267–282

  • Wiper M, Insua DR, Ruggeri F (2001) Mixtures of gamma distributions with applications. J Comput Graph Stat 10(3):440–454

Author information

Corresponding author

Correspondence to Antonio Punzo.

Appendices

Parameterization genesis

If a density function \(f\) is chosen to belong to the Pearson system, then it is the solution of the differential equation

$$\begin{aligned} \frac{df\left(x\right)}{dx}=-\frac{\left(x-m\right)f\left(x\right)}{c_0+c_1x+c_2x^2}. \end{aligned}$$
(18)

It is clear that some of the solutions to (18) have a single mode (\(df/dx=0\) at \(x=m\)) and smooth contact with the horizontal axis (\(df/dx=0\) when \(f\left(x\right)=0\)).

The shape of \(f\) depends on the set of parameters \(\left(m,c_0,c_1,c_2\right)\), and the form of the solution of (18) evidently depends on the nature of the roots of the equation

$$\begin{aligned} c_0+c_1x+c_2x^2=0. \end{aligned}$$
(19)

The classes of unimodal gamma and beta densities, illustrated in Sect. 2, arise from a convenient choice of these roots.

1.1 Unimodal gamma densities

Consider \(c_2=0\) and \(c_1>0\). Thus, Eq. (18) becomes

$$\begin{aligned} \frac{df\left(x\right)}{dx}=-\frac{\left(x-m\right)f\left(x\right)}{c_0+c_1x}=\left(-\frac{1}{c_1}+\frac{m+c_0/c_1}{c_0+c_1x}\right)f\left(x\right), \end{aligned}$$
(20)

whose solution is

$$\begin{aligned} f\left(x\right)=C\left(x+\frac{c_0}{c_1}\right)^{\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)}e^{-\frac{x}{c_1}},\qquad -\frac{c_0}{c_1}\le x<\infty . \end{aligned}$$
(21)

For (21) to integrate to one, the normalizing constant must be

$$\begin{aligned} C=\left\{ c_1^{\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)+1}e^{\frac{c_0}{c_1^2}}\Gamma \left[\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)+1\right]\right\} ^{-1}, \end{aligned}$$

so that the result is a gamma distribution (see Johnson and Kotz 1970a, Chapter 17). Equation (2) is obtained by setting \(-c_0/c_1=a\) and \(c_1=v\).
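
As an illustrative numerical check, and not part of the original derivation, the following Python sketch evaluates (21) with the substitutions \(-c_0/c_1=a\) and \(c_1=v\) and with the constant \(C\) given above. It assumes that this is the form Equation (2) takes; the names `unimodal_gamma_pdf` and `trapezoid` and the parameter values are hypothetical. The sketch checks that the density integrates to approximately one and peaks at \(x=m\).

```python
import math

def unimodal_gamma_pdf(x, m, v, a=0.0):
    """Density (21) with -c0/c1 = a and c1 = v (assumed form of Eq. (2))."""
    if x <= a:
        return 0.0                      # outside the open support (a, inf)
    shape = (m - a) / v + 1.0           # exponent of (x - a) in (21), plus one
    log_f = ((shape - 1.0) * math.log(x - a) - (x - a) / v
             - shape * math.log(v) - math.lgamma(shape))
    return math.exp(log_f)

def trapezoid(f, lo, hi, n=100_000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Hypothetical parameter values: mode m, spread v, lower bound a.
m, v, a = 3.0, 0.5, 1.0

# The upper limit a + 60 truncates a negligible tail for these values.
area = trapezoid(lambda x: unimodal_gamma_pdf(x, m, v, a), a, a + 60.0)

# Locate the maximum on a fine grid; it should sit at the mode m.
grid = [a + 0.001 * i for i in range(1, 20_000)]
mode_hat = max(grid, key=lambda x: unimodal_gamma_pdf(x, m, v, a))

print(area, mode_hat)
```

With these values the exponent of \(\left(x-a\right)\) is \(\left(m-a\right)/v=4\), so the density is a gamma density with shape 5 and scale \(v=0.5\), shifted to start at \(a\).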

1.2 Unimodal beta densities

Suppose that both the roots of (19) are real. Denoting these roots as \(a\) and \(b\), with \(a<b\), it follows that

$$\begin{aligned} c_0+c_1x+c_2x^2=-c_2\left(x-a\right)\left(b-x\right); \end{aligned}$$

consequently, Eq. (18) becomes

$$\begin{aligned} \frac{df\left(x\right)}{dx}=\frac{\left(x-m\right)f\left(x\right)}{c_2\left(x-a\right)\left(b-x\right)}=\frac{1}{c_2\left(b-a\right)}\left(\frac{a-m}{x-a}+\frac{b-m}{b-x}\right)f\left(x\right), \end{aligned}$$
(22)

whose solution is

$$\begin{aligned} f\left(x\right)=C\left(x-a\right)^{\frac{a-m}{c_2\left(b-a\right)}}\left(b-x\right)^{\frac{m-b}{c_2\left(b-a\right)}},\qquad a\le x\le b. \end{aligned}$$
(23)

For (23) to integrate to one, the normalizing constant must be

$$\begin{aligned} C=\left\{ \left(b-a\right)^{\frac{c_2-1}{c_2}}\mathrm{B} \left[\frac{a-m}{c_2\left(b-a\right)}+1,\frac{m-b}{c_2\left(b-a\right)}+1\right]\right\} ^{-1}, \end{aligned}$$

so that the result is a beta distribution (see Johnson and Kotz 1970b, Chapter 24). For the beta density to be unimodal, it is necessary that \(c_2\le 0\). Equation (6) is obtained by setting \(-c_2=v\) in (23).
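
The same kind of illustrative check can be sketched for the beta case. The Python code below is not from the paper: it evaluates (23) with \(-c_2=v\) and the constant \(C\) given above, on the assumption that this is the form Equation (6) takes. The names `unimodal_beta_pdf` and `midpoint` and the parameter values are hypothetical.

```python
import math

def unimodal_beta_pdf(x, m, v, a=0.0, b=1.0):
    """Density (23) with -c2 = v (assumed form of Eq. (6))."""
    if not a < x < b:
        return 0.0                      # outside the open support (a, b)
    p = (m - a) / (v * (b - a))         # exponent of (x - a)
    q = (b - m) / (v * (b - a))         # exponent of (b - x)
    log_beta = math.lgamma(p + 1) + math.lgamma(q + 1) - math.lgamma(p + q + 2)
    # (b - a)^{(c2 - 1)/c2} with c2 = -v equals (b - a)^{p + q + 1}.
    log_f = (p * math.log(x - a) + q * math.log(b - x)
             - (p + q + 1) * math.log(b - a) - log_beta)
    return math.exp(log_f)

def midpoint(f, lo, hi, n=100_000):
    """Composite midpoint rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Hypothetical parameter values: support [a, b], mode m, spread v.
a, b, m, v = 2.0, 5.0, 2.9, 0.1

area = midpoint(lambda x: unimodal_beta_pdf(x, m, v, a, b), a, b)

# Locate the maximum on a fine grid; it should sit at the mode m.
grid = [a + (b - a) * i / 3000 for i in range(1, 3000)]
mode_hat = max(grid, key=lambda x: unimodal_beta_pdf(x, m, v, a, b))

print(area, mode_hat)
```

Setting the derivative in (22) to zero gives \(x=m\) for any \(v>0\), which is what the grid search reproduces; \(v\) only controls how concentrated the density is around the mode.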

Details on the EM algorithm

Here we make explicit the derivatives in (14) for the gamma and beta densities parameterized according to (2) and (6), respectively. Recall that the resulting ML estimates have no closed-form expression and must be computed numerically with an iterative algorithm; suitable numerical routines are available in most mathematical and statistical software, such as Mathematica and R.

In detail, for the gamma density in (2) we have

$$\begin{aligned} \displaystyle \frac{\partial \ln f \left(x_i;m_j,v_j\right)}{\partial m_j} = \displaystyle \frac{1}{v_j}\left[\ln \left(x_i-a\right)-\ln v_j-\psi \left(\frac{m_j-a}{v_j}+1\right)\right] \end{aligned}$$

and

$$\begin{aligned} \frac{\partial \ln f\left(x_i;m_j,v_j\right)}{\partial v_j}&= \frac{1}{v_j^2}\biggl\{ \left(m_j-a\right)\left[\ln v_j+\psi \left(\frac{m_j-a}{v_j}+1\right)-\ln \left(x_i-a\right)\right]\\&\quad -\left(m_j+v_j\right)+x_i\biggr\}, \end{aligned}$$

where \(\psi \left(\cdot \right)\) is the digamma function. In the same way, for the beta density in (6) we have

$$\begin{aligned} \frac{\partial \ln f\left(x_i;m_j,v_j\right)}{\partial m_j}&= \frac{1}{v_j\left(b-a\right)}\biggl\{ \psi \left(\frac{b-m_j}{v_j\left(b-a\right)}+1\right)-\psi \left(\frac{m_j-a}{v_j\left(b-a\right)}+1\right)\\&\quad +\ln \left(x_i-a\right)-\ln \left(b-x_i\right)\biggr\}, \end{aligned}$$

and

$$\begin{aligned} \frac{\partial \ln f\left(x_i;m_j,v_j\right)}{\partial v_j}&= \frac{1}{v_j^2\left(b-a\right)}\biggl\{ \left(b-a\right)\left[\ln \left(b-a\right)-\psi \left(\frac{2v_j+1}{v_j}\right)\right]\\&\quad +\left(m_j-a\right)\psi \left(\frac{m_j-a}{v_j\left(b-a\right)}+1\right)+\left(b-m_j\right)\psi \left(\frac{b-m_j}{v_j\left(b-a\right)}+1\right)\\&\quad -\left(m_j-a\right)\ln \left(x_i-a\right)-\left(b-m_j\right)\ln \left(b-x_i\right)\biggr\}. \end{aligned}$$
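
The four derivatives above can be checked numerically against central finite differences of the log-densities. The Python sketch below is such a check and is not the authors' code. It assumes that (2) and (6) are (21) and (23) with the substitutions stated in the first appendix, and it uses a hand-written `digamma` helper (recurrence plus asymptotic series) because the Python standard library does not provide one. All function names and parameter values are hypothetical.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x                  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series

def gamma_logpdf(x, m, v, a):
    p = (m - a) / v
    return (p * math.log(x - a) - (x - a) / v
            - (p + 1) * math.log(v) - math.lgamma(p + 1))

def gamma_score(x, m, v, a):
    """Analytic derivatives of the gamma log-density w.r.t. m and v."""
    psi = digamma((m - a) / v + 1)
    d_m = (math.log(x - a) - math.log(v) - psi) / v
    d_v = ((m - a) * (math.log(v) + psi - math.log(x - a))
           - (m + v) + x) / v ** 2
    return d_m, d_v

def beta_logpdf(x, m, v, a, b):
    w = b - a
    p, q = (m - a) / (v * w), (b - m) / (v * w)
    log_beta = math.lgamma(p + 1) + math.lgamma(q + 1) - math.lgamma(p + q + 2)
    return (p * math.log(x - a) + q * math.log(b - x)
            - (p + q + 1) * math.log(w) - log_beta)

def beta_score(x, m, v, a, b):
    """Analytic derivatives of the beta log-density w.r.t. m and v."""
    w = b - a
    psi_p = digamma((m - a) / (v * w) + 1)
    psi_q = digamma((b - m) / (v * w) + 1)
    d_m = (psi_q - psi_p + math.log(x - a) - math.log(b - x)) / (v * w)
    d_v = (w * (math.log(w) - digamma((2 * v + 1) / v))
           + (m - a) * psi_p + (b - m) * psi_q
           - (m - a) * math.log(x - a) - (b - m) * math.log(b - x)) / (v ** 2 * w)
    return d_m, d_v

def central_diff(f, t, h=1e-6):
    return (f(t + h) - f(t - h)) / (2 * h)

# Gamma case: hypothetical observation x and parameters (m, v, a).
x, m, v, a = 2.7, 3.0, 0.5, 1.0
g_m, g_v = gamma_score(x, m, v, a)
gamma_err = max(
    abs(g_m - central_diff(lambda t: gamma_logpdf(x, t, v, a), m)),
    abs(g_v - central_diff(lambda t: gamma_logpdf(x, m, t, a), v)),
)

# Beta case: hypothetical observation x and parameters (m, v, a, b).
x, m, v, a, b = 3.4, 2.9, 0.1, 2.0, 5.0
b_m, b_v = beta_score(x, m, v, a, b)
beta_err = max(
    abs(b_m - central_diff(lambda t: beta_logpdf(x, t, v, a, b), m)),
    abs(b_v - central_diff(lambda t: beta_logpdf(x, m, t, a, b), v)),
)

print(gamma_err, beta_err)
```

Both discrepancies should be small relative to the derivatives themselves. In an M-step these per-observation derivatives would be weighted and summed over \(i\) before being passed to a numerical optimizer or root finder; that step is not shown here.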

About this article

Cite this article

Bagnato, L., Punzo, A. Finite mixtures of unimodal beta and gamma densities and the \(k\)-bumps algorithm. Comput Stat 28, 1571–1597 (2013). https://doi.org/10.1007/s00180-012-0367-4

  • DOI: https://doi.org/10.1007/s00180-012-0367-4
