Abstract
The mixture of factor analyzers model, which has been used successfully for the model-based clustering of high-dimensional data, is extended to generalized hyperbolic mixtures. The development of a mixture of generalized hyperbolic factor analyzers is outlined, drawing upon the relationship with the generalized inverse Gaussian distribution. An alternating expectation-conditional maximization algorithm is used for parameter estimation, and the Bayesian information criterion is used to select the number of factors as well as the number of components. The performance of our generalized hyperbolic factor analyzers model is illustrated on real and simulated data, where it performs favourably compared to its Gaussian analogue and other approaches.
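As a rough illustration of the BIC-based selection described above, the sketch below scores candidate pairs of component number G and factor number q from their maximized log-likelihoods. The function names, the free-parameter count, and the larger-is-better sign convention are assumptions made for illustration, not details taken from the paper.

```python
import math


def count_free_params(G, p, q):
    """Assumed free-parameter count for G components, p variables, q factors.

    Assumption (not from the source): each component has its own location
    and skewness vectors, two scalar shape parameters, an unconstrained
    p x q loading matrix, and a diagonal noise matrix.
    """
    mixing = G - 1                           # mixing proportions sum to one
    location_and_skewness = 2 * G * p        # two p-vectors per component
    shape = 2 * G                            # two scalars per component
    loadings = G * (p * q - q * (q - 1) // 2)  # loadings up to rotation
    noise = G * p                            # diagonal noise variances
    return mixing + location_and_skewness + shape + loadings + noise


def bic(loglik, n_params, n):
    """BIC in the larger-is-better form: 2 * loglik - n_params * log(n)."""
    return 2.0 * loglik - n_params * math.log(n)


def select_model(logliks, p, n):
    """Pick the (G, q) pair with the largest BIC.

    logliks maps (G, q) tuples to maximized log-likelihoods obtained
    elsewhere, for example from an AECM fit.
    """
    scores = {
        (G, q): bic(ll, count_free_params(G, p, q), n)
        for (G, q), ll in logliks.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

In use, one would fit the model over a grid of G and q values, collect the maximized log-likelihoods in a dictionary keyed by (G, q), and pass it to `select_model` together with the data dimension p and the sample size n.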
Acknowledgments
The authors are grateful to an associate editor and anonymous reviewers for their very helpful comments and suggestions, the cumulative effect of which has been a stronger manuscript.
Additional information
This work was supported by a grant-in-aid from Compusense Inc. as well as a Collaborative Research and Development grant from the Natural Sciences and Engineering Research Council of Canada.
Appendix: Updates for component covariance parameters
At the second stage of our AECM algorithm, the (conditional) expected value of the complete-data log-likelihood is given by
where \(C\) is constant with respect to \(\varvec{\varLambda }_g\) and \(\varvec{\varPsi }_g\). Differentiating \({Q}_2\) with respect to \(\varvec{\varLambda }_g\) gives
Note that \(\varvec{E}_{3ig}\) is a symmetric matrix. Now, solving \(S_1(\hat{\varvec{\varLambda }}_g,\varvec{\varPsi }_g)=\varvec{0}\) gives the update:
Differentiating \({Q}_2\) with respect to \(\varvec{\varPsi }_g^{-1}\) gives
Now, solving \(\text {diag}\{S_2(\hat{\varvec{\varLambda }}_g,\hat{\varvec{\varPsi }}_g)\}=\varvec{0}\) gives the update:
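For readers working through derivatives of this kind, the standard matrix-calculus identities below are the ones such score calculations generally rely on. This is a sketch under stated assumptions: \(\varvec{A}\), \(\varvec{B}\), \(\varvec{E}\) and \(\varvec{S}\) are generic placeholder matrices of conformable dimensions, not the paper's notation, and the block does not reproduce the paper's own expressions for \(Q_2\), \(S_1\) or \(S_2\).

```latex
\begin{aligned}
\frac{\partial}{\partial \varvec{\varLambda}_g}\,
  \text{tr}\{\varvec{A}\varvec{\varLambda}_g\}
  &= \varvec{A}', \\
\frac{\partial}{\partial \varvec{\varLambda}_g}\,
  \text{tr}\{\varvec{\varLambda}_g'\varvec{B}\varvec{\varLambda}_g\varvec{E}\}
  &= \varvec{B}\varvec{\varLambda}_g\varvec{E}
   + \varvec{B}'\varvec{\varLambda}_g\varvec{E}'
   = 2\,\varvec{B}\varvec{\varLambda}_g\varvec{E}
   \quad \text{if } \varvec{B}=\varvec{B}',\ \varvec{E}=\varvec{E}', \\
\frac{\partial}{\partial \varvec{\varPsi}_g^{-1}}\,
  \log\left|\varvec{\varPsi}_g^{-1}\right|
  &= \varvec{\varPsi}_g, \\
\frac{\partial}{\partial \varvec{\varPsi}_g^{-1}}\,
  \text{tr}\{\varvec{\varPsi}_g^{-1}\varvec{S}\}
  &= \varvec{S}'.
\end{aligned}
```

The second identity shows why symmetry of a matrix such as \(\varvec{E}_{3ig}\) is worth noting: the two terms of the derivative of the quadratic form collapse into one. Because \(\varvec{\varPsi}_g\) is diagonal in a factor analysis model, only the diagonal of the second score is set to zero.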
Cite this article
Tortora, C., McNicholas, P.D. & Browne, R.P. A mixture of generalized hyperbolic factor analyzers. Adv Data Anal Classif 10, 423–440 (2016). https://doi.org/10.1007/s11634-015-0204-z
DOI: https://doi.org/10.1007/s11634-015-0204-z