Abstract
In this manuscript, we consider a finite multivariate nonparametric mixture model where the dependence between the marginal densities is modeled using the copula device. Pseudo expectation–maximization (EM) stochastic algorithms were recently proposed to estimate all of the components of this model under a location-scale constraint on the marginals. Here, we introduce a deterministic algorithm that seeks to maximize a smoothed semiparametric likelihood. No location-scale assumption is made about the marginals. The algorithm is monotonic in one special case, and, in another, leads to “approximate monotonicity”—whereby the difference between successive values of the objective function becomes non-negative up to an additive term that becomes negligible after a sufficiently large number of iterations. The behavior of this algorithm is illustrated on several simulated and real datasets. The results suggest that, under suitable conditions, the proposed algorithm may indeed be monotonic in general. A discussion of the results and some possible future research directions round out our presentation.
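As a rough illustration of the two ideas in the abstract, the model and the "approximate monotonicity" property can be written as below. The notation is an assumption made here for readability and is not taken from the paper: $K$ components with weights $\pi_k$, component copula densities $c_k$, marginal densities $f_{kj}$ with distribution functions $F_{kj}$, an objective $\ell$ that is being maximized, iterates $\theta^{(t)}$, and a slack sequence $\varepsilon_t$.

```latex
% Illustrative notation only (assumed, not the paper's own symbols).
% Finite mixture whose component densities are built from a copula density c_k
% and nonparametric marginals f_{kj} with distribution functions F_{kj}:
f(x_1,\dots,x_d)
  = \sum_{k=1}^{K} \pi_k \,
    c_k\bigl(F_{k1}(x_1),\dots,F_{kd}(x_d)\bigr)
    \prod_{j=1}^{d} f_{kj}(x_j),
\qquad \pi_k \ge 0,\quad \sum_{k=1}^{K} \pi_k = 1.

% "Approximate monotonicity": successive objective values increase
% up to an additive slack that vanishes along the iterations.
\ell\bigl(\theta^{(t+1)}\bigr) - \ell\bigl(\theta^{(t)}\bigr) \ge -\varepsilon_t,
\qquad \varepsilon_t \ge 0,\quad \varepsilon_t \to 0 \ \text{as } t \to \infty.
```

Exact monotonicity corresponds to the special case $\varepsilon_t = 0$ for every $t$.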
Acknowledgements
Michael Levine’s research has been partially funded by the NSF-DMS Grant # 2311103. We thank two anonymous reviewers for helpful comments that led to an improved version of the manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Michael Levine and Gildas Mazo have contributed equally to this work.
About this article
Cite this article
Levine, M., Mazo, G. A smoothed semiparametric likelihood for estimation of nonparametric finite mixture models with a copula-based dependence structure. Comput Stat 39, 1825–1846 (2024). https://doi.org/10.1007/s00180-024-01483-4
DOI: https://doi.org/10.1007/s00180-024-01483-4