Finite mixtures of unimodal beta and gamma densities and the $k$-bumps algorithm

Bagnato, Luca; Punzo, Antonio

doi:10.1007/s00180-012-0367-4


Original Paper
Published: 07 October 2012

Volume 28, pages 1571–1597, (2013)

Computational Statistics

Luca Bagnato¹ &
Antonio Punzo²


Abstract

This paper addresses the problem of estimating a density, with either a compact support or a support bounded at only one end, exploiting a general and natural form of a finite mixture of distributions. Due to the importance of the concept of multimodality in the mixture framework, unimodal beta and gamma densities are used as mixture components, leading to a flexible modeling approach. Accordingly, a mode-based parameterization of the components is provided. A partitional clustering method, named $k$-bumps, is also proposed; it is used as an ad hoc initialization strategy in the EM algorithm to obtain the maximum likelihood estimation of the mixture parameters. The performance of the $k$-bumps algorithm as an initialization tool, in comparison to other common initialization strategies, is evaluated through some simulation experiments. Finally, two real applications are presented.


Notes

Downloadable from http://www.humanfertility.org/cgi-bin/main.php.

References

Altman E, Resti A, Sironi A (2005) Loss given default: a review of the literature. In: Altman E, Resti A, Sironi A (eds) The next challenge in credit risk management. Riskbooks, London
Banca d’Italia (2001) Principali Risultati della Rilevazione sull’Attività di Recupero dei Crediti. Bollettino di Vigilanza 12
Basel Committee on Banking Supervision (2004) International capital measurement and capital standards: a revised framework. Bank for International Settlements, Basel
Behboodian J (1970) On the modes of a mixture of two normal distributions. Technometrics 12(1):131–139
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561–575
Brazier S, Sparks RSJ, Carey SN, Sigurdsson H, Westgate JA (1983) Bimodal grain size distribution and secondary thickening in air-fall ash layers. Nature 301:115–119
Bruche M, González-Aguado C (2010) Recovery rates, default probabilities, and the credit cycle. J Banking Financ 34(4):713–723
Calabrese R, Zenga M (2008) Measuring loan recovery rate: methodology and empirical evidence. Stat Appl VI(2):193–214
Calabrese R, Zenga M (2010) Bank loan recovery rates: measuring and nonparametric density estimation. J Banking Financ 34(5):903–911
Celeux G, Govaert G (1992) A classification EM algorithm for clustering and two stochastic versions. Comput Stat Data Anal 14(3):315–332
Chen S (1999) Beta kernel estimators for density functions. Comput Stat Data Anal 31(2):131–145
Chen S (2000) Probability density function estimation using gamma kernels. Ann Inst Stat Math 52(3):471–480
Coale A (1971) Age patterns of marriage. Pop Stud 25(2):193–214
Congdon P (1993) Statistical graduation in local demographic analysis and projection. J R Stat Soc Ser A Stat Soc 156(2):237–270
Cox D (1966) Notes on the analysis of mixed frequency distributions. Br J Math Stat Psychol 19(1):39–47
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J R Stat Soc Ser B Methodol 39(1):1–38
Diebolt J, Ip E (1996) Stochastic EM: method and application. In: Gilks W, Richardson S, Spiegelhalter D (eds) Markov chain Monte Carlo in practice, chap 15. Chapman and Hall, London, pp 259–273
Dye JL (2008) Fertility of American women, 2006. Current Population Reports, US Census Bureau 20(558)
Eisenberger I (1964) Genesis of bimodal distributions. Technometrics 6(4):357–363
Elderton WP, Johnson NL (1969) Systems of frequency curves. Cambridge University Press, Cambridge
Everitt B, Hand DJ (1981) Finite mixture distributions. Chapman and Hall, London
Ghosal S (2001) Convergence rates for density estimation with Bernstein polynomials. Ann Stat 29(5):1264–1280
Gupton G, Stein R (2002) LossCalc: Moody’s model for predicting loss given default (LGD). Moody’s Investors Service, New York
Gupton G, Finger C, Bhatia M (1997) CreditMetrics—technical document. J. P. Morgan and Co, New York
Huang Z (1998) Extensions to the $k$-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304
Izenman AJ (2008) Modern multivariate statistical techniques: regression, classification, and manifold learning. Springer, New York
Ji Y, Wu C, Liu P, Wang J, Coombes K (2005) Applications of beta-mixture models in bioinformatics. Bioinformatics 21(9):2118–2122
Johnson NL, Kotz S (1970a) Continuous univariate distributions, vol 1. Wiley, New York
Johnson NL, Kotz S (1970b) Continuous univariate distributions, vol 2. Wiley, New York
Jordan MI, Xu L (1995) Convergence results for the EM approach to mixtures of experts architectures. Neural Netw 8(9):1409–1431
Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis, vol 39. Wiley, New York
Kendall MG, Stuart A (1958) The advanced theory of statistics, vol 1. Charles Griffin and Company Limited, London
Lee S, Sheldon Lin X (2010) Modeling and evaluating insurance losses via mixtures of Erlang distributions. N Am Actuar J 14(1):107–130
Leisch F (2004) FlexMix: a general framework for finite mixture models and latent class regression in R. J Stat Softw 11(8):1–18
Lindsay B (1995) Mixture models: theory, geometry and applications. In: NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward
Martin JA, Hamilton BE, Sutton PD, Ventura SJ, Menacker F, Kirmeyer S, Mathews T (2009) Births: final data for 2006. Natl Vital Stat Rep 57(7):1–104
Maulik U, Bandyopadhyay S, Mukhopadhyay A (2011) Multiobjective genetic algorithm-based fuzzy clustering: applications in data mining and bioinformatics. Springer, Berlin
Mayrose I, Friedman N, Pupko T (2005) A gamma mixture model better accounts for among site rate heterogeneity. Bioinformatics 21(2):151–158
Mazza A, Punzo A (2011) Discrete beta kernel graduation of age-specific demographic indicators. In: Ingrassia S, Rocci R, Vichi M (eds) New perspectives in statistical modeling and data analysis (Studies in classification, data analysis and knowledge organization), vol 42. Springer, Berlin, pp 127–134
Mazza A, Punzo A (2013a) Graduation by adaptive discrete beta kernels. In: Giusti A, Ritter G, Vichi M (eds) Classification and data mining (Studies in classification, data analysis and knowledge organization), vol 44. Springer, Berlin, pp 77–84
Mazza A, Punzo A (2013b) Using the variation coefficient for adaptive discrete beta kernel graduation. In: Giudici P, Ingrassia S, Vichi M (eds) Studies in classification, data analysis and knowledge organization. Springer, Berlin (in press)
McLachlan G, Krishnan T (2007) The EM algorithm and extensions. Wiley, New York
McLachlan GJ, Basford KE (1988) Mixture models—inference and applications to clustering. Marcel Dekker, New York
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York
Meilă M, Heckerman D (2001) An experimental comparison of model-based clustering methods. Mach Learn 42(1):9–29
Murphy EA (1964) One cause? Many causes? the argument from the bimodal distribution. J Chronic Dis 17(4):301–324
Pearson K (1902a) On the systematic fitting of curves to observations and measurements. Biometrika 1(3):265–303
Pearson K (1902b) On the systematic fitting of curves to observations and measurements: part II. Biometrika 2(1):1–23
Petrone S (1999a) Bayesian density estimation using Bernstein polynomials. Can J Stat 27(1):105–126
Petrone S (1999b) Random Bernstein polynomials. Scand J Stat 26(3):373–393
Punzo A (2010) Discrete beta-type models. In: Locarek-Junge H, Weihs C (eds) Classification as a tool for research (Studies in classification, data analysis and knowledge organization), vol 40. Springer, Berlin, pp 253–261
Punzo A, Zini A (2012) Discrete approximations of continuous and mixed measures on a compact interval. Stat Pap 53(3):563–575
Ray S, Lindsay B (2005) The topography of multivariate normal mixtures. Ann Stat 33(5):2042–2065
R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/, ISBN 3-900051-07-0
Redner RA, Walker HF (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26(2):195–239
Robertson C, Fryer J (1969) Some descriptive properties of normal mixtures. Skand Aktuarietidskr 52:137–146
Rogers A (1986) Parameterized multistate population dynamics and projections. J Am Stat Assoc 81(393):48–61
Scharl T, Grün B, Leisch F (2010) Mixtures of regression models for time course gene expression data: evaluation of initialization and random effects. Bioinformatics 26(3):370–377
Schilling M, Watkins A, Watkins W (2002) Is human height bimodal? Am Stat 56(3):223–229
Silverman B (1981) Using kernel density estimates to investigate multimodality. J R Stat Soc Ser B Methodol 43:97–99
Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, New York
Wessels J (1964) Multimodality in a family of probability densities, with application to a linear mixture of two normal densities. Statistica Neerlandica 18(3):267–282
Wiper M, Insua DR, Ruggeri F (2001) Mixtures of gamma distributions with applications. J Comput Graph Stat 10(3):440–454


Author information

Authors and Affiliations

Dipartimento di Metodi Quantitativi per le Scienze Economiche ed Aziendali, Università di Milano-Bicocca, Milan, Italy
Luca Bagnato
Dipartimento di Economia e Impresa, Università di Catania, Catania, Italy
Antonio Punzo


Corresponding author

Correspondence to Antonio Punzo.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 17 KB)

Supplementary material 2 (r 2 KB)

Supplementary material 3 (R 16 KB)

Supplementary material 4 (r 2 KB)

Appendices

Parameterization genesis

If a density function $f$ is chosen to belong to the Pearson system, then it is the solution of the differential equation

$$\begin{aligned} \frac{df\left(x\right)}{dx}=-\frac{\left(x-m\right)f\left(x\right)}{c_0+c_1x+c_2x^2}. \end{aligned}\tag{18}$$

It is clear that some of the solutions to (18) have a single mode ($df/dx=0$ at $x=m$) and smooth contact with the horizontal axis ($df/dx=0$ when $f\left(x\right)=0$).

The shape of $f$ depends on the set of parameters $\left(m,c_0,c_1,c_2\right)$, and the form of the solution of (18) evidently depends on the nature of the roots of the equation

$$\begin{aligned} c_0+c_1x+c_2x^2=0. \end{aligned}\tag{19}$$

The classes of unimodal gamma and beta densities, illustrated in Sect. 2, arise from a convenient choice of these roots.

1.1 Unimodal gamma densities

Consider $c_2=0$ and $c_1>0$. Thus, Eq. (18) becomes

$$\begin{aligned} \frac{df\left(x\right)}{dx}=-\frac{\left(x-m\right)f\left(x\right)}{c_0+c_1x}=\left(-\frac{1}{c_1}+\frac{m+c_0/c_1}{c_0+c_1x}\right)f\left(x\right), \end{aligned}\tag{20}$$

whose solution is

$$\begin{aligned} f\left(x\right)=C\left(x+\frac{c_0}{c_1}\right)^{\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)}e^{-\frac{x}{c_1}},\qquad -\frac{c_0}{c_1}\le x<\infty . \end{aligned}\tag{21}$$
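
The step from (20) to (21) is a separation of variables. As a short sketch, with $\ln C$ written for the constant of integration:

$$\begin{aligned} \frac{d\ln f\left(x\right)}{dx}&=-\frac{1}{c_1}+\frac{m+c_0/c_1}{c_1\left(x+c_0/c_1\right)},\\ \ln f\left(x\right)&=-\frac{x}{c_1}+\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)\ln \left(x+\frac{c_0}{c_1}\right)+\ln C. \end{aligned}$$

Exponentiating the second line gives (21).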

In order to make (21) a density function,

$$\begin{aligned} C=\left\{ c_1^{\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)+1}e^{\frac{c_0}{c_1^2}}\mathrm \Gamma \left[\frac{1}{c_1}\left(m+\frac{c_0}{c_1}\right)+1\right]\right\} ^{-1}, \end{aligned}$$

so that the result is a gamma distribution (see Johnson and Kotz 1970a, Chapter 17). Equation (2) is obtained by setting $-c_0/c_1=a$ and $c_1=v$.
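
As a rough numerical check of this parameterization, the sketch below codes the density obtained from (21) after substituting $-c_0/c_1=a$ and $c_1=v$. Equation (2) is not reproduced in this appendix, so the formula in the code is reconstructed from (21) and the expression for $C$. The names `unimodal_gamma_pdf` and `trapezoid` and the parameter values are illustrative assumptions and are not taken from the authors' supplementary R code.

```python
import math


def unimodal_gamma_pdf(x, m, v, a=0.0):
    """Mode-parameterized gamma density on [a, inf), assuming m > a and v > 0.

    Reconstructed from (21) with a = -c0/c1 and v = c1; the exponent of
    (x - a) is (m - a) / v, which places the mode at x = m.
    """
    if x <= a:
        return 0.0  # the density vanishes at and below a when m > a
    alpha = (m - a) / v
    log_pdf = (alpha * math.log(x - a) - (x - a) / v
               - (alpha + 1.0) * math.log(v) - math.lgamma(alpha + 1.0))
    return math.exp(log_pdf)


def trapezoid(f, lo, hi, n=20000):
    """Plain trapezoidal rule, used here only as a sanity check."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Illustrative values (assumptions): lower bound a, mode m, variability v.
m, v, a = 3.0, 0.5, 1.0

# The density should integrate to about 1 (the tail beyond 40 is negligible).
area = trapezoid(lambda x: unimodal_gamma_pdf(x, m, v, a), a, 40.0)

# The maximizer over a fine grid should sit at the mode parameter m.
grid = [a + 0.001 * i for i in range(1, 10000)]
mode_on_grid = max(grid, key=lambda x: unimodal_gamma_pdf(x, m, v, a))

print(round(area, 4), round(mode_on_grid, 3))
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function when $(m-a)/v$ is large, that is, when $v$ is small relative to the distance of the mode from $a$.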

1.2 Unimodal beta densities

Suppose that both the roots of (19) are real. Denoting these roots as $a$ and $b$, with $a<b$, it follows that

$$\begin{aligned} c_0+c_1x+c_2x^2=-c_2\left(x-a\right)\left(b-x\right); \end{aligned}$$

consequently, Eq. (18) becomes

$$\begin{aligned} \frac{df\left(x\right)}{dx}=\frac{\left(x-m\right)f\left(x\right)}{c_2\left(x-a\right)\left(b-x\right)}=\frac{1}{c_2\left(b-a\right)}\left(\frac{a-m}{x-a}+\frac{b-m}{b-x}\right)f\left(x\right), \end{aligned}\tag{22}$$

whose solution is

$$\begin{aligned} f\left(x\right)=C\left(x-a\right)^{\frac{a-m}{c_2\left(b-a\right)}}\left(b-x\right)^{\frac{m-b}{c_2\left(b-a\right)}},\qquad a\le x\le b. \end{aligned}\tag{23}$$
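
Integrating (22) term by term makes the step to (23) explicit. As a short sketch, with $\ln C$ written for the constant of integration:

$$\begin{aligned} \ln f\left(x\right)&=\frac{a-m}{c_2\left(b-a\right)}\ln \left(x-a\right)-\frac{b-m}{c_2\left(b-a\right)}\ln \left(b-x\right)+\ln C,\\ \frac{a-m}{c_2\left(b-a\right)}+\frac{m-b}{c_2\left(b-a\right)}&=-\frac{1}{c_2}. \end{aligned}$$

The second line shows that the two exponents in (23) sum to $-1/c_2$. Integrating (23) over $\left[a,b\right]$ therefore produces the factor $\left(b-a\right)^{1-1/c_2}=\left(b-a\right)^{\left(c_2-1\right)/c_2}$ multiplied by a beta function.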

In order to make (23) a density function,

$$\begin{aligned} C=\left\{ \left(b-a\right)^{\frac{c_2-1}{c_2}}\mathrm B \left[\frac{a-m}{c_2\left(b-a\right)}+1,\frac{m-b}{c_2\left(b-a\right)}+1\right]\right\} ^{-1}, \end{aligned}$$

so that the result is a beta distribution (see Johnson and Kotz 1970b, Chapter 24). For the beta density to be unimodal, it is necessary that $c_2\le 0$. Equation (6) is obtained by setting $-c_2=v$ in (23).
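
The same kind of numerical check can be sketched for the beta case. The code below evaluates (23) with $-c_2=v$ and the normalizing constant given above. Equation (6) is not reproduced in this appendix, so the formula is reconstructed from (23). The name `unimodal_beta_pdf` and the parameter values are illustrative assumptions and are not taken from the authors' supplementary R code.

```python
import math


def unimodal_beta_pdf(x, m, v, a=0.0, b=1.0):
    """Mode-parameterized beta density on [a, b], assuming a < m < b and v > 0.

    Reconstructed from (23) with c2 = -v: the exponents are
    p = (m - a) / (v (b - a)) and q = (b - m) / (v (b - a)), so p + q = 1 / v.
    """
    if not (a < x < b):
        return 0.0  # the density vanishes at the endpoints when a < m < b
    p = (m - a) / (v * (b - a))
    q = (b - m) / (v * (b - a))
    log_beta = math.lgamma(p + 1.0) + math.lgamma(q + 1.0) - math.lgamma(p + q + 2.0)
    log_pdf = (p * math.log(x - a) + q * math.log(b - x)
               - (1.0 / v + 1.0) * math.log(b - a) - log_beta)
    return math.exp(log_pdf)


# Illustrative values (assumptions): support [a, b], mode m, variability v.
m, v, a, b = 3.0, 0.2, 2.0, 5.0

# Midpoint rule over [a, b]; the result should be close to 1.
n = 20000
h = (b - a) / n
area = h * sum(unimodal_beta_pdf(a + (i + 0.5) * h, m, v, a, b) for i in range(n))

# The maximizer over a fine grid should sit at the mode parameter m.
grid = [a + (b - a) * i / 3000 for i in range(1, 3000)]
mode_on_grid = max(grid, key=lambda x: unimodal_beta_pdf(x, m, v, a, b))

print(round(area, 4), round(mode_on_grid, 3))
```

With $a=0$, $b=1$, $m=0.3$ and $v=0.1$ the exponents are $p=3$ and $q=7$, so the function reduces to the standard Beta(4, 8) density, whose mode is $3/10$.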

Details on the EM algorithm

Here we make explicit the derivatives in (14) for both gamma and beta densities parameterized according to (2) and (6), respectively. We recall that the resulting ML estimates do not have a closed-form expression and can only be computed numerically, with the aid of an iterative algorithm; such numerical methods are available in most mathematical and statistical software, such as Mathematica and R.

In detail, for the gamma density in (2) we have

$$\begin{aligned} \displaystyle \frac{\partial \ln f \left(x_i;m_j,v_j\right)}{\partial m_j} = \displaystyle \frac{1}{v_j}\left[\ln \left(x_i-a\right)-\ln v_j-\psi \left(\frac{m_j-a}{v_j}+1\right)\right] \end{aligned}$$

and

$$\begin{aligned} \frac{\partial \ln f \left(x_i;m_j,v_j\right)}{\partial v_j}&= \frac{1}{v_j^2}\left\{ \left(m_j-a\right)\left[\ln v_j+\psi \left(\frac{m_j-a}{v_j}+1\right)-\ln \left(x_i-a\right)\right]\right.\\&\quad \left.-\left(m_j+v_j\right)+x_i\right\} , \end{aligned}$$

where $\psi \left(\cdot \right)$ is the digamma function. In the same way, for the beta density in (6) we have

$$\begin{aligned} \frac{\partial \ln f \left(x_i;m_j,v_j\right)}{\partial m_j}&= \frac{1}{v_j\left(b-a\right)}\left\{ \psi \left(\frac{b-m_j}{v_j\left(b-a\right)}+1\right)-\psi \left(\frac{m_j-a}{v_j\left(b-a\right)}+1\right)\right.\\&\quad \left.+\ln \left(x_i-a\right)-\ln \left(b-x_i\right)\right\} , \end{aligned}$$

and

$$\begin{aligned} \frac{\partial \ln f\left(x_i;m_j,v_j\right)}{\partial v_j}&= \frac{1}{v_j^2\left(b-a\right)}\left\{ \left(b-a\right)\left[\ln \left(b-a\right)-\psi \left(\frac{2v_j+1}{v_j}\right)\right]\right.\\&\quad +\left(m_j-a\right)\psi \left(\frac{m_j-a}{v_j\left(b-a\right)}+1\right)+\left(b-m_j\right)\psi \left(\frac{b-m_j}{v_j\left(b-a\right)}+1\right)\\&\quad \left.-\left(m_j-a\right)\ln \left(x_i-a\right)-\left(b-m_j\right)\ln \left(b-x_i\right)\right\} . \end{aligned}$$
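
The four derivatives above can be checked numerically against central finite differences of the log-densities. The sketch below does this at one illustrative point for each family. The log-densities are reconstructed from (21) and (23), because Equations (2) and (6) are not reproduced in this appendix. The helper `digamma` is a hand-written approximation (recurrence plus a truncated asymptotic series), and all function names and test values are assumptions made for illustration, not the authors' code.

```python
import math


def digamma(x):
    """Digamma for x > 0: shift x up to at least 6, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 * (1 / 240 - inv2 / 132))))
    return result + math.log(x) - 0.5 / x - series


def gamma_logpdf(x, m, v, a):
    k = (m - a) / v
    return k * math.log(x - a) - (x - a) / v - (k + 1) * math.log(v) - math.lgamma(k + 1)


def gamma_score(x, m, v, a):
    """Derivatives of the gamma log-density with respect to m and v."""
    psi = digamma((m - a) / v + 1)
    d_m = (math.log(x - a) - math.log(v) - psi) / v
    d_v = ((m - a) * (math.log(v) + psi - math.log(x - a)) - (m + v) + x) / v**2
    return d_m, d_v


def beta_logpdf(x, m, v, a, b):
    w = b - a
    p, q = (m - a) / (v * w), (b - m) / (v * w)
    return (p * math.log(x - a) + q * math.log(b - x) - (1 / v + 1) * math.log(w)
            - math.lgamma(p + 1) - math.lgamma(q + 1) + math.lgamma(p + q + 2))


def beta_score(x, m, v, a, b):
    """Derivatives of the beta log-density with respect to m and v."""
    w = b - a
    psi_p = digamma((m - a) / (v * w) + 1)
    psi_q = digamma((b - m) / (v * w) + 1)
    d_m = (psi_q - psi_p + math.log(x - a) - math.log(b - x)) / (v * w)
    d_v = (w * (math.log(w) - digamma((2 * v + 1) / v))
           + (m - a) * psi_p + (b - m) * psi_q
           - (m - a) * math.log(x - a) - (b - m) * math.log(b - x)) / (v**2 * w)
    return d_m, d_v


def central_diff(f, t, h=1e-5):
    return (f(t + h) - f(t - h)) / (2 * h)


# Illustrative evaluation points (assumptions).
x_g, m_g, v_g, a_g = 2.7, 3.0, 0.5, 1.0
x_b, m_b, v_b, a_b, b_b = 3.4, 3.0, 0.2, 2.0, 5.0

g_m, g_v = gamma_score(x_g, m_g, v_g, a_g)
b_m, b_v = beta_score(x_b, m_b, v_b, a_b, b_b)
errors = [
    abs(g_m - central_diff(lambda t: gamma_logpdf(x_g, t, v_g, a_g), m_g)),
    abs(g_v - central_diff(lambda t: gamma_logpdf(x_g, m_g, t, a_g), v_g)),
    abs(b_m - central_diff(lambda t: beta_logpdf(x_b, t, v_b, a_b, b_b), m_b)),
    abs(b_v - central_diff(lambda t: beta_logpdf(x_b, m_b, t, a_b, b_b), v_b)),
]
print(max(errors))  # largest gap between analytical and numerical derivatives
```

In an M-step these scores would be weighted by the posterior probabilities of component membership and summed over the observations. That weighting is not shown here because (14) is not part of this excerpt.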


About this article

Cite this article

Bagnato, L., Punzo, A. Finite mixtures of unimodal beta and gamma densities and the $k$-bumps algorithm. Comput Stat 28, 1571–1597 (2013). https://doi.org/10.1007/s00180-012-0367-4


Received: 23 November 2010
Accepted: 10 September 2012
Published: 07 October 2012
Issue Date: August 2013
DOI: https://doi.org/10.1007/s00180-012-0367-4
