
Mixtures of Hidden Truncation Hyperbolic Factor Analyzers


Abstract

The mixture of factor analyzers model was first introduced over 20 years ago and, in the meantime, has been extended to several non-Gaussian analogs. In general, these analogs account for situations with heavy-tailed and/or skewed clusters. An approach is introduced that unifies many of these analogs into one very general model: the mixture of hidden truncation hyperbolic factor analyzers (MHTHFA) model. In the process, a hidden truncation hyperbolic factor analysis model is also introduced. The MHTHFA model is illustrated for clustering as well as semi-supervised classification using two real datasets.


References

  • Aitken, A.C. (1926). A series formula for the roots of algebraic and transcendental equations. Proceedings of the Royal Society of Edinburgh, 45, 14–22.
  • Andrews, J.L., & McNicholas, P.D. (2011a). Extending mixtures of multivariate t-factor analyzers. Statistics and Computing, 21(3), 361–373.
  • Andrews, J.L., & McNicholas, P.D. (2011b). Mixtures of modified t-factor analyzers for model-based clustering, classification, and discriminant analysis. Journal of Statistical Planning and Inference, 141(4), 1479–1486.
  • Arellano-Valle, R.B., & Genton, M.G. (2005). On fundamental skew distributions. Journal of Multivariate Analysis, 96(1), 93–116.
  • Baek, J., McLachlan, G.J., Flack, L.K. (2010). Mixtures of factor analyzers with common factor loadings: applications to the clustering and visualization of high-dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1298–1309.
  • Bhattacharya, S., & McNicholas, P.D. (2014). A LASSO-penalized BIC for mixture model selection. Advances in Data Analysis and Classification, 8(1), 45–61.
  • Bouveyron, C., & Brunet-Saumard, C. (2014). Model-based clustering of high-dimensional data: a review. Computational Statistics and Data Analysis, 71, 52–78.
  • Browne, R.P., & McNicholas, P.D. (2015). A mixture of generalized hyperbolic distributions. Canadian Journal of Statistics, 43(2), 176–198.
  • Franczak, B.C., Browne, R.P., McNicholas, P.D. (2014). Mixtures of shifted asymmetric Laplace distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6), 1149–1157.
  • Gallaugher, M.P.B., & McNicholas, P.D. (2017). A matrix variate skew-t distribution. Stat, 6, 160–170.
  • Gallaugher, M.P.B., & McNicholas, P.D. (2018). Finite mixtures of skewed matrix variate distributions. Pattern Recognition, 80, 83–93.
  • Gallaugher, M.P.B., & McNicholas, P.D. (2019a). On fractionally-supervised classification: weight selection and extension to the multivariate t-distribution. Journal of Classification, 36. In press.
  • Gallaugher, M.P.B., & McNicholas, P.D. (2019b). Three skewed matrix variate distributions. Statistics and Probability Letters, 145, 103–109.
  • Ghahramani, Z., & Hinton, G.E. (1997). The EM algorithm for factor analyzers. Technical Report CRG-TR-96-1, University of Toronto, Toronto, Canada.
  • Gorman, R.P., & Sejnowski, T.J. (1988). Analysis of hidden units in a layered network trained to classify sonar targets. Neural Networks, 1, 75–89.
  • Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.
  • Karlis, D., & Santourian, A. (2009). Model-based clustering with non-elliptically contoured distributions. Statistics and Computing, 19(1), 73–83.
  • Lawley, D.N., & Maxwell, A.E. (1962). Factor analysis as a statistical method. Journal of the Royal Statistical Society: Series D, 12(3), 209–229.
  • Lee, S., & McLachlan, G.J. (2014). Finite mixtures of multivariate skew t-distributions: some recent and new results. Statistics and Computing, 24, 181–202.
  • Lee, S.X., & McLachlan, G.J. (2016). Finite mixtures of canonical fundamental skew t-distributions: the unification of the restricted and unrestricted skew t-mixture models. Statistics and Computing, 26(3), 573–589.
  • Lichman, M. (2013). UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences.
  • Lin, T.-I. (2009). Maximum likelihood estimation for multivariate skew normal mixture models. Journal of Multivariate Analysis, 100, 257–265.
  • Lin, T.-I. (2010). Robust mixture modeling using multivariate skew t distributions. Statistics and Computing, 20(3), 343–356.
  • Lin, T.-I., McNicholas, P.D., Hsiu, J.H. (2014). Capturing patterns via parsimonious t mixture models. Statistics and Probability Letters, 88, 80–87.
  • Lin, T., McLachlan, G.J., Lee, S.X. (2016). Extending mixtures of factor models using the restricted multivariate skew-normal distribution. Journal of Multivariate Analysis, 143, 398–413.
  • Lindsay, B.G. (1995). Mixture models: theory, geometry and applications. In NSF-CBMS regional conference series in probability and statistics, Vol. 5. Hayward: Institute of Mathematical Statistics.
  • McLachlan, G.J. (1992). Discriminant analysis and statistical pattern recognition. Hoboken: Wiley.
  • McLachlan, G.J., & Peel, D. (2000a). Finite mixture models. New York: Wiley.
  • McLachlan, G.J., & Peel, D. (2000b). Mixtures of factor analyzers. In Proceedings of the seventh international conference on machine learning (pp. 599–606). San Francisco: Morgan Kaufmann.
  • McNicholas, P.D. (2010). Model-based classification using latent Gaussian mixture models. Journal of Statistical Planning and Inference, 140(5), 1175–1181.
  • McNicholas, P.D. (2016a). Mixture model-based classification. Boca Raton: Chapman & Hall/CRC Press.
  • McNicholas, P.D. (2016b). Model-based clustering. Journal of Classification, 33(3), 331–373.
  • McNicholas, P.D., & Murphy, T.B. (2008). Parsimonious Gaussian mixture models. Statistics and Computing, 18(3), 285–296.
  • McNicholas, P.D., & Murphy, T.B. (2010). Model-based clustering of microarray expression data via latent Gaussian mixture models. Bioinformatics, 26(21), 2705–2712.
  • McNicholas, S.M., McNicholas, P.D., Browne, R.P. (2017). A mixture of variance-gamma factor analyzers. In Ahmed, S.E. (Ed.), Big and complex data analysis: methodologies and applications (pp. 369–385). Cham: Springer International Publishing.
  • Meng, X.-L., & Rubin, D.B. (1993). Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika, 80, 267–278.
  • Murray, P.M., Browne, R.P., McNicholas, P.D. (2014a). Mixtures of skew-t factor analyzers. Computational Statistics and Data Analysis, 77, 326–335.
  • Murray, P.M., McNicholas, P.D., Browne, R.P. (2014b). A mixture of common skew-t factor analyzers. Stat, 3(1), 68–82.
  • Murray, P.M., Browne, R.P., McNicholas, P.D. (2017a). Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering. Journal of Multivariate Analysis, 161, 141–156.
  • Murray, P.M., Browne, R.P., McNicholas, P.D. (2017b). A mixture of SDB skew-t factor analyzers. Econometrics and Statistics, 3, 160–168.
  • Murray, P.M., Browne, R.P., McNicholas, P.D. (2019). Note of clarification on ‘Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering, by Murray, Browne, and McNicholas, J. Multivariate Analysis 161 (2017) 141–156.’ Journal of Multivariate Analysis, 171, 475–476.
  • Peel, D., & McLachlan, G.J. (2000). Robust mixture modelling using the t distribution. Statistics and Computing, 10(4), 339–348.
  • R Core Team. (2018). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
  • Sahu, S.K., Dey, D.K., Branco, M.D. (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Canadian Journal of Statistics, 31(2), 129–150. Corrigendum: vol. 37 (2009), 301–302.
  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
  • Steane, M.A., McNicholas, P.D., Yada, R. (2012). Model-based classification via mixtures of multivariate t-factor analyzers. Communications in Statistics – Simulation and Computation, 41(4), 510–523.
  • Steinley, D. (2004). Properties of the Hubert-Arabie adjusted Rand index. Psychological Methods, 9, 386–396.
  • Subedi, S., & McNicholas, P.D. (2014). Variational Bayes approximations for clustering via mixtures of normal inverse Gaussian distributions. Advances in Data Analysis and Classification, 8(2), 167–193.
  • Tang, Y., Browne, R.P., McNicholas, P.D. (2018). Flexible clustering of high-dimensional data via mixtures of joint generalized hyperbolic distributions. Stat, 7(1), e177.
  • Tipping, M.E., & Bishop, C.M. (1999). Mixtures of probabilistic principal component analysers. Neural Computation, 11(2), 443–482.
  • Tortora, C., McNicholas, P.D., Browne, R.P. (2016). A mixture of generalized hyperbolic factor analyzers. Advances in Data Analysis and Classification, 10(4), 423–440.
  • Tortora, C., Franczak, B.C., Browne, R.P., McNicholas, P.D. (2019). A mixture of coalesced generalized hyperbolic distributions. Journal of Classification, 36. To appear.
  • Vrbik, I., & McNicholas, P.D. (2012). Analytic calculations for the EM algorithm for multivariate skew-t mixture models. Statistics and Probability Letters, 82(6), 1169–1174.
  • Vrbik, I., & McNicholas, P.D. (2014). Parsimonious skew mixture models for model-based clustering and classification. Computational Statistics and Data Analysis, 71, 196–210.
  • Vrbik, I., & McNicholas, P.D. (2015). Fractionally-supervised classification. Journal of Classification, 32(3), 359–381.
  • Yoshida, R., Higuchi, T., Imoto, S. (2004). A mixed factors model for dimension reduction and extraction of a group structure in gene expression data. In Proceedings of the 2004 IEEE computational systems bioinformatics conference (pp. 161–172).


Author information


Corresponding author

Correspondence to Paul D. McNicholas.


Appendix: E-Step Calculations

Herein, we present the expectations required for the E-step of the ECM algorithm for the mixtures of HTH factor analyzers model.

A.1 \(\mathbb {E}[W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[1/W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\)

To derive the expectations \(\mathbb {E}[W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[1/W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) as well as \(\mathbb {E}[\log W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) in the following section, first note that

$$ \begin{array}{ll} f(w_{ig}\mid\mathbf{x}_{i},z_{ig}=1) =&\frac{w_{ig}^{\lambda_{g}-p/2-1}}{2K_{\lambda_{g}-p/2}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)}\left[ \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right]^{(\lambda_{g}-p/2)/2}\\ &\times\exp\left\{-\frac{1}{2}\left( \omega_{g} w_{ig}+\frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{w_{ig}}\right)\right\}{\Phi}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})/\sqrt{w_{ig}}\mid\boldsymbol{\Delta}_{g}\right)\\ &\div H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2,\gamma_{g},\gamma_{g}\right). \end{array} $$
(8)

Therefore,

$$ \begin{array}{ll} \mathbb{E}\left[W_{ig}\mid\mathbf{x}_{i},z_{ig}=1 \right] =&{\int}^{\infty}_{0}\frac{w^{\lambda_{g}-p/2}}{2K_{\lambda_{g}-p/2}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)}\left[ \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right]^{(\lambda_{g}-p/2)/2}\\ &\qquad\times\exp\left\{-\frac{1}{2}\left( \omega_{g} w+\frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{w}\right)\right\}{\Phi}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})/\sqrt{w}\mid\boldsymbol{\Delta}_{g}\right)\\ &\qquad\div H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2,\gamma_{g},\gamma_{g}\right)dw\\ =&\frac{K_{\lambda_{g}-p/2+1}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)}{K_{\lambda_{g}-p/2}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)} \left[ \frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{\omega_{g}}\right]^{1/2}\\ &\qquad\times H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2+1,\gamma_{g},\gamma_{g}\right)\\ &\qquad\div H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2,\gamma_{g},\gamma_{g}\right), \end{array} $$
$$ \begin{array}{ll} \mathbb{E}\left[1/W_{ig}\mid\mathbf{x}_{i},z_{ig}=1 \right] =&{\int}^{\infty}_{0}\frac{w^{\lambda_{g}-p/2-2}}{2K_{\lambda_{g}-p/2}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)}\left[ \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right]^{(\lambda_{g}-p/2)/2}\\ &\qquad\times\exp\left\{-\frac{1}{2}\left( \omega_{g} w+\frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{w}\right)\right\}{\Phi}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})/\sqrt{w}\mid\boldsymbol{\Delta}_{g}\right)\\ &\qquad\div H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2,\gamma_{g},\gamma_{g}\right)dw\\ =&\frac{K_{\lambda_{g}-p/2-1}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)}{K_{\lambda_{g}-p/2}\left( \sqrt{\omega_{g}(\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}))}\right)} \left[ \frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{\omega_{g}}\right]^{-1/2}\\ &\qquad\times H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2-1,\gamma_{g},\gamma_{g}\right)\\ &\qquad\div H_{r}\left( \boldsymbol{\alpha}_{g}^{\prime}\boldsymbol{\Omega}_{g}^{-1}(\mathbf{x}_{i}-\mathbf{r}_{g})\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-p/2,\gamma_{g},\gamma_{g}\right). \end{array} $$
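The leading factors in both results are ordinary GIG moments: for \(W\sim\text{GIG}(\psi,\chi,\lambda)\), \(\mathbb{E}[W^{a}]=(\chi/\psi)^{a/2}K_{\lambda+a}(\sqrt{\psi\chi})/K_{\lambda}(\sqrt{\psi\chi})\). A minimal numerical sketch of these Bessel-function ratios, in Python with SciPy, follows; it takes \(\psi=\omega_{g}\), \(\chi=\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})\), and index \(\lambda_{g}-p/2\), and it omits the hidden truncation correction (the ratio of \(H_{r}\) terms). The function name and example values are illustrative assumptions, not quantities from the paper.

```python
# A minimal sketch of the GIG part of E[W | x, z=1] and E[1/W | x, z=1],
# assuming W ~ GIG(psi=omega, chi=omega+delta, index=lam); the ratio of
# H_r terms that multiplies these moments above is omitted here.
import numpy as np
from scipy.special import kv  # modified Bessel function of the second kind

def gig_moments(omega, delta, lam):
    """Return (E[W], E[1/W]) for W ~ GIG(psi=omega, chi=omega+delta, index=lam)."""
    chi = omega + delta
    arg = np.sqrt(omega * chi)      # sqrt(psi * chi)
    ratio = np.sqrt(chi / omega)    # sqrt(chi / psi)
    e_w = ratio * kv(lam + 1, arg) / kv(lam, arg)
    e_winv = kv(lam - 1, arg) / (ratio * kv(lam, arg))
    return e_w, e_winv

# Illustrative values only:
print(gig_moments(omega=2.0, delta=1.5, lam=-0.75))
```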

A.2 \(\mathbb {E}[\log W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\)

To update \(\mathbb {E}[\log W_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\), where \(W_{ig}\sim\text{GIG}(\psi_{g},\chi_{g},\lambda_{g})\), first note that

$$ \mathbb{E}[ \log W_{ig}\mid z_{ig}=1] = \frac{ \mathrm{d} }{ \mathrm{d} \lambda } \log K_{\lambda} \left( \sqrt{ \chi_{g} \psi_{g} } \right) + \log \left( \sqrt{ \frac{\chi_{g}}{\psi_{g}} } \right). $$

We can show that

$$ W_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},z_{ig}=1\sim \text{GIG}\left( \omega_{g},\ \omega_{g} + (\mathbf{v}_{ig}-\mathbf{k}_{g})^{\prime}\boldsymbol{\Delta}_{g}^{-1}(\mathbf{v}_{ig}-\mathbf{k}_{g})+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g}),\ \lambda_{g}-(p+r)/2\right), $$

where \(\mathbf {r}_{g}=\boldsymbol {\mu }_{g}-\boldsymbol {\alpha }_{g}\mathbf {a}_{\lambda _{g}}\) and \(\mathbf {k}_{g}=\boldsymbol {\Lambda }^{\prime }_{g}\boldsymbol {\Omega }_{g}^{-1}(\mathbf {x}_{i}-\boldsymbol {\mu }_{g})\). Therefore,

$$ \mathbb{E}[\log W_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},z_{ig}=1] = \frac{ \mathrm{d} }{ \mathrm{d} \tau } \log K_{\tau} \left( \sqrt{ \chi^{*} \psi^{*} } \right) + \log \left( \sqrt{ \frac{\chi^{*}}{\psi^{*}} } \right), $$

where \(\tau=\lambda_{g}-(p+r)/2\), \(\psi^{*}=\omega_{g}\), and \(\chi^{*}=\omega_{g}+(\mathbf{v}_{ig}-\mathbf{k}_{g})^{\prime}\boldsymbol{\Delta}_{g}^{-1}(\mathbf{v}_{ig}-\mathbf{k}_{g})+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})\).

Let

$$ \zeta_{ig} = \sqrt{ 1 + \frac{\delta\left( \mathbf{x}_{i}\mid \mathbf{r}_{g}, \boldsymbol{\Omega}_{g}\right) + (\mathbf{v}_{ig} - \mathbf{k}_{g})^{\prime} \boldsymbol{\Delta}_{g}^{-1} (\mathbf{v}_{ig} - \mathbf{k}_{g}) }{ \omega_{g} } }, $$

then \(\zeta_{ig}\geq 1\) and \(W_{ig}\mid \mathbf {x}_{i}, \mathbf {v}_{ig},z_{ig}=1\sim \text {GIG}(\omega _{g}, \omega _{g} \zeta _{ig}^{2}, \tau )\). Consequently,

$$ \mathbb{E}[ \log W_{ig} \mid \mathbf{x}_{i}, \mathbf{v}_{ig},z_{ig}=1] = \frac{ \mathrm{d} }{ \mathrm{d} \tau } \log K_{\tau} \left( \omega_{g} \zeta_{ig}\right) + \log\zeta_{ig}. $$

The reader is directed to the supplementary material in Murray et al. (2017a) for details on a method for estimating this expectation via a series expansion.
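Because standard libraries do not expose the derivative of \(K_{\tau}\) with respect to its order \(\tau\), a simple alternative to that series expansion is a central finite difference. The sketch below (Python with SciPy) shows the idea; the helper name, step size, and inputs are illustrative assumptions rather than the authors' implementation.

```python
# A hedged finite-difference approximation to d/dtau log K_tau(y), used in
# E[log W | x, v, z=1] = d/dtau log K_tau(omega * zeta) + log(zeta).
import numpy as np
from scipy.special import kv

def e_log_w(tau, omega, zeta, h=1e-5):
    """Approximate E[log W | x, v, z=1] via a central difference in the order."""
    y = omega * zeta
    dlogK = (np.log(kv(tau + h, y)) - np.log(kv(tau - h, y))) / (2.0 * h)
    return dlogK + np.log(zeta)

# Illustrative values only (not fitted parameters):
print(e_log_w(tau=-1.25, omega=2.0, zeta=1.3))
```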

A.3 \(\mathbb {E}[(1/W_{ig})\mathbf {V}_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[(1/W_{ig})\mathbf {V}_{ig}\mathbf {V}_{ig}^{\prime }\mid \mathbf {x}_{i},z_{ig}=1]\)

Recall that \(\mathbf{V}_{ig}\mid w_{ig},z_{ig}=1\sim \text{HN}_{r}(w_{ig}\mathbf{I}_{r})\). We can show that

$$ \begin{array}{lll} f(\mathbf{v}_{ig}\mid\mathbf{x}_{i},z_{ig}=1)=\frac{1}{c_{\lambda}} h_{r}\left( \mathbf{v}_{ig}~|~\mathbf{k}_{g},\sqrt{\frac{\omega_{g}+\boldsymbol{\delta}(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{\omega_{g}}}\boldsymbol{\Delta}_{g},\lambda_{g}-\frac{p}{2},\gamma_{g},\gamma_{g}\right), \end{array} $$
(9)

where the support of \(\mathbf{V}_{ig}\) is \(\mathbb {R}_{+}^{r}\), i.e., the positive orthant of \(\mathbb {R}^{r}\), and

$$c_{\lambda}= H_{r}\left( \mathbf{k}_{g}\left( \frac{\omega_{g}}{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}\right)^{1/4}\Big|\,\mathbf{0},\boldsymbol{\Delta}_{g},\lambda_{g}-\frac{p}{2},\gamma_{g},\gamma_{g} \right).$$

It follows that

$$\mathbf{V}_{ig}\mid \mathbf{x}_{i},z_{ig}=1 \sim \text{TH}_{r}\left( \mathbf{k}_{g}, \sqrt{\frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{\omega_{g}}}\boldsymbol{\Delta}_{g},\lambda_{g}-\frac{p}{2}, \gamma_{g},\gamma_{g};\mathbb{R}_{+}^{r}\right).$$

Here, \(\text {TH}_{r}(\boldsymbol {\mu },\mathbf {\Sigma }, \lambda ,\psi ,\chi ;\mathbb {R}_{+}^{r})\) denotes the r-dimensional symmetric truncated hyperbolic distribution with density

$$f_{\text{TH}}(\mathbf{v}\mid\boldsymbol{\mu},\boldsymbol{\Sigma},\lambda,\psi,\chi;\mathbb{R}_{+}^{r})= \frac{h_{r}(\mathbf{v}\mid\boldsymbol{\mu},\boldsymbol{\Sigma},\lambda,\psi,\chi)}{{\int}^{\infty}_{0}\ldots {\int}^{\infty}_{0}h_{r}(\mathbf{v}\mid\boldsymbol{\mu},\boldsymbol{\Sigma},\lambda,\psi,\chi)d\mathbf{v}}\mathbb{I}_{\mathbb{R}_{+}^{r}}(\mathbf{v}),$$

and \(\mathbb {I}_{\mathbb {R}_{+}^{r}}(\mathbf {u})=1\) when \(\mathbf {u}\in \mathbb {R}_{+}^{r}\) and 0 otherwise. In this way, the symmetric hyperbolic distribution is truncated to exist only within the region \(\mathbb {R}_{+}^{r}\). To update \(\mathbb {E}[(1/W_{ig})\mathbf {V}_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[(1/W_{ig})\mathbf {V}_{ig}\mathbf {V}_{ig}^{\prime }\mid \mathbf {x}_{i},z_{ig}=1]\), we can make use of the fact that

$$\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mid \mathbf{x}_{i},z_{ig}=1]=\mathbb{E}[(1/W_{ig})\mid \mathbf{x}_{i},z_{ig}=1]\mathbb{E}[\mathbf{Y}_{ig}\mid \mathbf{x}_{i},z_{ig}=1]$$

and

$$\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mathbf{V}_{ig}^{\prime}\mid \mathbf{x}_{i},z_{ig}=1]=\mathbb{E}[(1/W_{ig})\mid \mathbf{x}_{i},z_{ig}=1]\mathbb{E}[\mathbf{Y}_{ig}\mathbf{Y}_{ig}^{\prime}\mid \mathbf{x}_{i},z_{ig}=1],$$

where

$$\mathbf{Y}_{ig}\mid \mathbf{x}_{i},z_{ig}=1\sim \text{TH}_{r}\left( \mathbf{k}_{g}, \sqrt{\frac{\omega_{g}+\delta(\mathbf{x}_{i}\mid\mathbf{r}_{g},\boldsymbol{\Omega}_{g})}{\omega_{g}}}\boldsymbol{\Delta}_{g},\lambda_{g}-\frac{p}{2}-1, \gamma_{g},\gamma_{g};\mathbb{R}_{+}^{r}\right).$$

The expectations \(\mathbb {E}[\mathbf {Y}_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[\mathbf {Y}_{ig}\mathbf {Y}_{ig}^{\prime }\mid \mathbf {x}_{i},z_{ig}=1]\) can easily be estimated using the moments of the truncated symmetric hyperbolic distribution defined in Murray et al. (2017a).
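As a sanity check on such moment calculations, the truncated moments can also be approximated by plain Monte Carlo, assuming the symmetric hyperbolic \(h_{r}\) admits the usual normal variance-mixture representation \(\mathbf{V}=\boldsymbol{\mu}+\sqrt{W}\mathbf{Z}\) with \(\mathbf{Z}\sim\mathcal{N}_{r}(\mathbf{0},\boldsymbol{\Sigma})\) and \(W\sim\text{GIG}(\psi,\chi,\lambda)\), and rejecting draws outside \(\mathbb{R}_{+}^{r}\). The sketch below (Python with SciPy) uses that representation; all parameter values are illustrative, and the analytic moments of Murray et al. (2017a) remain the method of record.

```python
# Monte Carlo estimates of E[Y] and E[YY'] for the truncated symmetric
# hyperbolic distribution, via rejection to the positive orthant.
# Assumes the GIG normal variance-mixture representation of h_r.
import numpy as np
from scipy.stats import geninvgauss

rng = np.random.default_rng(1)

def trunc_hyperbolic_moments(mu, Sigma, lam, psi, chi, n=100_000):
    r = len(mu)
    # scipy's geninvgauss(p, b) with scale sqrt(chi/psi) gives GIG(psi, chi, p)
    w = geninvgauss.rvs(lam, np.sqrt(psi * chi), scale=np.sqrt(chi / psi),
                        size=n, random_state=rng)
    z = rng.multivariate_normal(np.zeros(r), Sigma, size=n)
    v = mu + np.sqrt(w)[:, None] * z
    keep = v[(v > 0).all(axis=1)]  # truncate to the positive orthant
    e_y = keep.mean(axis=0)
    e_yy = (keep[:, :, None] * keep[:, None, :]).mean(axis=0)
    return e_y, e_yy

# Illustrative two-dimensional example:
m1, m2 = trunc_hyperbolic_moments(np.array([0.5, 0.5]), np.eye(2),
                                  lam=-1.0, psi=2.0, chi=2.0)
```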

A.4 \(\mathbb {E}[(1/W_{ig})\tilde {\mathbf {U}}_{ig}\mid \mathbf {x}_{i},z_{ig}=1]\) and \(\mathbb {E}[(1/W_{ig})\tilde {\mathbf {U}}_{ig}\tilde {\mathbf {U}}_{ig}^{\prime }\mid \mathbf {x}_{i},z_{ig}=1]\)

Note that \(\tilde {\mathbf {U}}_{ig}\mid \mathbf {x}_{i},\mathbf {v}_{ig},w_{ig},z_{ig}=1\sim \mathcal {N}_{q}(\mathbf {q},w_{ig}\mathbf {C})\) where \(\mathbf {q}=\mathbf {C}[\mathbf {d}+\mathbf {{\Lambda }}_{g}(\mathbf {V}_{ig}-\mathbf {a}_{\lambda _{g}})]\), \(\mathbf {d}=\tilde {\mathbf {B}}_{g}^{\prime }\mathbf {D}_{g}^{-1}(\mathbf {X}_{i}-\boldsymbol {\mu }_{g})\), and \(\mathbf {C}=(\mathbf {I}_{q}+\tilde {\mathbf {B}}_{g}^{\prime }\mathbf {D}_{g}^{-1}\tilde {\mathbf {B}}_{g})^{-1}\). We can show

$$ \begin{array}{lll} &&\mathbb{E}[\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},z_{ig}=1] =\mathbb{E}\{\mathbb{E}[\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]\mid\mathbf{x}_{i} ,z_{ig}=1\}\\ &=&\mathbb{E}\{\mathbf{C}[ \mathbf{d}+\mathbf{{\Lambda} }_{g}(\mathbf{V}_{ig}-\mathbf{a}_{\lambda_{g}})]\mid\mathbf{x}_{i},z_{ig}=1\} =\mathbf{C}\{ \mathbf{d}+\mathbf{{\Lambda} }_{g}(\mathbb{E}[\mathbf{V}_{ig}\mid\mathbf{x}_{i},z_{ig}=1]-\mathbf{a}_{\lambda_{g}})\}. \end{array} $$

Therefore, it follows that

$$ \begin{array}{lll} \mathbb{E}&[(1/W_{ig})\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},z_{ig}=1] =\mathbb{E}\{\mathbb{E}[(1/W_{ig}) \tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]\mid\mathbf{x}_{i},z_{ig}=1 \}\\ &=\mathbb{E}\{(1/W_{ig})[ \mathbf{C}\mathbf{d}+\mathbf{C}\mathbf{{\Lambda} }_{g}(\mathbf{V}_{ig}-\mathbf{a}_{\lambda_{g}})]\mid\mathbf{x}_{i},z_{ig}=1\}\\ &=\mathbf{C}\{\mathbf{d}\mathbb{E}[1/W_{ig}\mid\mathbf{x}_{i},z_{ig}=1]+\mathbf{{\Lambda} }_{g}(\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mid \mathbf{x}_{i},z_{ig}=1]-\mathbf{a}_{\lambda_{g}}\mathbb{E}[1/W_{ig}\mid\mathbf{x}_{i},z_{ig}=1])\},\\ \mathbb{E}&[(1/W_{ig})\mathbf{V}_{ig}\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i} ,z_{ig}=1] =\mathbb{E}\{\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig} \tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]\mid\mathbf{x}_{i},z_{ig}=1\}\\ &=\mathbb{E}\{(1/W_{ig})\mathbf{V}_{ig}[ \mathbf{C}\mathbf{d}+\mathbf{C}\mathbf{{\Lambda} }_{g}(\mathbf{V}_{ig}-\mathbf{a}_{\lambda_{g}})]\mid\mathbf{x}_{i} ,z_{ig}=1\}\\ &=\mathbf{C} \{ \mathbf{d} \mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mid\mathbf{x}_{i},z_{ig}=1]+\mathbf{{\Lambda} }_{g}(\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mathbf{V}_{ig}^{\prime}\mid \mathbf{x}_{i},z_{ig}=1]\\&\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad- \mathbf{a}_{\lambda_{g}}\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig}\mid\mathbf{x}_{i},z_{ig}=1])\}, \end{array} $$

and

$$ \begin{array}{lll} \mathbb{E}&[(1/W_{ig})\tilde{\mathbf{U}}_{ig}\tilde{\mathbf{U}}_{ig}^{\prime}\mid\mathbf{x}_{i},z_{ig}=1] =\mathbb{E}\{(1/W_{ig})\mathbb{E}[\tilde{\mathbf{U}}_{ig}\tilde{\mathbf{U}}_{ig}^{\prime}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]\mid\mathbf{x}_{i},z_{ig}=1\}\\ &=\mathbb{E}\{(1/W_{ig})(\mathbb{E}[\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]\mathbb{E}[\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1]^{\prime}+W_{ig}\mathbf{C})\mid\mathbf{x}_{i} ,z_{ig}=1\}\\ &=\mathbb{E}\{(1/W_{ig})(\mathbb{E}[\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},\mathbf{v}_{ig},w_{ig},z_{ig}=1][\mathbf{C}\mathbf{d}+\mathbf{C}\mathbf{{\Lambda} }_{g}(\mathbf{V}_{ig}-\mathbf{a}_{\lambda_{g}})]^{\prime})+\mathbf{C}\mid\mathbf{x}_{i},z_{ig}=1 \}\\ &=\{ (\mathbb{E}[(1/W_{ig})\mathbf{V}_{ig} \tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},z_{ig}=1]-\mathbf{a}_{\lambda_{g}}\mathbb{E}[(1/W_{ig})\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},z_{ig}=1])\mathbf{{\Lambda}}_{g}^{\prime}\\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad+\mathbb{E}[(1/W_{ig})\tilde{\mathbf{U}}_{ig}\mid\mathbf{x}_{i},z_{ig}=1]\mathbf{d}^{\prime}+\mathbf{I}_{q} \}\mathbf{C}. \end{array} $$
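For concreteness, here is a small linear-algebra sketch of the plug-in quantities \(\mathbf{C}=(\mathbf{I}_{q}+\tilde{\mathbf{B}}_{g}^{\prime}\mathbf{D}_{g}^{-1}\tilde{\mathbf{B}}_{g})^{-1}\) and \(\mathbf{d}=\tilde{\mathbf{B}}_{g}^{\prime}\mathbf{D}_{g}^{-1}(\mathbf{x}_{i}-\boldsymbol{\mu}_{g})\) and of the conditional mean \(\mathbf{C}[\mathbf{d}+\boldsymbol{\Lambda}_{g}(\mathbf{v}_{ig}-\mathbf{a}_{\lambda_{g}})]\). All dimensions and inputs are randomly generated placeholders, and the diagonal form assumed for \(\mathbf{D}_{g}\) follows the usual factor-analyzers convention.

```python
# Placeholder computation of C, d, and E[U-tilde | x, v, w, z=1] from A.4.
import numpy as np

rng = np.random.default_rng(0)
p, q, r = 5, 2, 2                     # illustrative dimensions only

B = rng.normal(size=(p, q))           # stand-in for the loading matrix B-tilde_g
D_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, size=p))  # D_g assumed diagonal
x, mu = rng.normal(size=p), np.zeros(p)
Lam = rng.normal(size=(q, r))         # stand-in for Lambda_g
v, a_lam = rng.uniform(size=r), 0.5 * np.ones(r)

C = np.linalg.inv(np.eye(q) + B.T @ D_inv @ B)   # (I_q + B' D^{-1} B)^{-1}
d = B.T @ D_inv @ (x - mu)                       # B' D^{-1} (x - mu)
cond_mean = C @ (d + Lam @ (v - a_lam))          # C [ d + Lambda (v - a_lambda) ]
```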

Cite this article

Murray, P.M., Browne, R.P. & McNicholas, P.D. Mixtures of Hidden Truncation Hyperbolic Factor Analyzers. J Classif 37, 366–379 (2020). https://doi.org/10.1007/s00357-019-9309-y
