Abstract
Cluster-weighted models are a convenient approach to model-based clustering, especially when the covariates contribute to defining the cluster structure of the data. However, their applicability may be limited when the number of covariates is high, and their performance may be affected by noise and outliers. To overcome these problems, common/uncommon \(t\)-factor analyzers for the covariates, and a \(t\)-distribution for the response variable, are here assumed in each mixture component. A family of twenty parsimonious variants of this model is also presented, and the alternating expectation-conditional maximization (AECM) algorithm for maximum likelihood estimation of the parameters of all models in the family is described. Artificial and real data show that these models have very good clustering performance and that the algorithm recovers the parameters very well.



References
Airoldi J, Hoffmann R (1984) Age variation in voles (Microtus californicus, M. ochrogaster) and its significance for systematic studies. Occasional papers of the Museum of Natural History 111, University of Kansas, Lawrence, KS
Aitken AC (1926) On Bernoulli’s numerical solution of algebraic equations. In: Proceedings of the Royal Society of Edinburgh, vol 46, pp 289–305
Andrews JL, McNicholas PD (2011) Extending mixtures of multivariate \(t\)-factor analyzers. Stat Comput 21(3):361–373
Baek J, McLachlan GJ (2011) Mixtures of common \(t\)-factor analyzers for clustering high-dimensional microarray data. Bioinformatics 27(9):1269–1276
Baek J, McLachlan GJ, Flack LK (2010) Mixtures of factor analyzers with common factor loadings: applications to the clustering and visualization of high-dimensional data. IEEE Trans Pattern Anal Mach Intell 32(7):1298–1309
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719–725
Böhning D, Dietz E, Schaub R, Schlattmann P, Lindsay B (1994) The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family. Ann Inst Stat Math 46(2):373–388
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38
DeSarbo WS, Cron WL (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5(2):249–282
Ehrlich I (1973) Participation in illegitimate activities: a theoretical and empirical investigation. J Polit Econ 81(3):521–565
Flury B (1997) A first course in multivariate statistics. Springer, New York
Gershenfeld N (1997) Nonlinear inference and cluster-weighted modeling. Ann NY Acad Sci 808(1):18–24
Grün B, Leisch F (2008) FlexMix version 2: finite mixtures with concomitant variables and varying and constant parameters. J Stat Softw 28(4):1–35
Hennig C (2000) Identifiability of models for clusterwise linear regression. J Classif 17(2):273–296
Ingrassia S, Minotti SC, Vittadini G (2012) Local statistical modeling via the cluster-weighted approach with elliptical distributions. J Classif 29(3):363–401
Ingrassia S, Minotti SC, Punzo A (2014) Model-based clustering via linear cluster-weighted models. Comput Stat Data Anal 71:159–182
Ingrassia S, Punzo A, Vittadini G (2015) The generalized linear mixed cluster-weighted model. J Classif 32 (in press)
Leisch F (2004) FlexMix: a general framework for finite mixture models and latent class regression in \({\sf R}\). J Stat Softw 11(8):1–18
Lindsay BG (1995) Mixture models: theory, geometry and applications. In: NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward
Lo Y (2008) A likelihood ratio test of a homoscedastic normal mixture against a heteroscedastic normal mixture. Stat Comput 18(3):233–240
McLachlan GJ (1987) On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. J R Stat Soc Ser C (Appl Stat) 36(3):318–324
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York
McLachlan GJ, Bean RW, Ben-Tovim Jones L (2007) Extension of the mixture of factor analyzers model to incorporate the multivariate \(t\)-distribution. Comput Stat Data Anal 51(11):5327–5338
Meng XL, van Dyk D (1997) The EM algorithm—an old folk-song sung to a fast new tune. J R Stat Soc Ser B (Stat Methodol) 59(3):511–567
Punzo A (2014) Flexible mixture modeling with the polynomial Gaussian cluster-weighted model. Stat Model 14(3):257–291
Punzo A, Ingrassia S (2015) Parsimonious generalized linear Gaussian cluster-weighted models. In: Morlini I, Minerva T, Palumbo F (eds) Advances in statistical models for data analysis, studies in classification, data analysis and knowledge organization. Springer, Switzerland
Punzo A, Browne RP, McNicholas PD (2014) Hypothesis testing for parsimonious Gaussian mixture models. http://arxiv.org/abs/1405.0377
R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Sakamoto Y, Ishiguro M, Kitagawa G (1983) Akaike information criterion statistics. Reidel, Boston
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Subedi S, Punzo A, Ingrassia S, McNicholas PD (2013) Clustering and classification via cluster-weighted factor analyzers. Adv Data Anal Classif 7(1):5–40
Vandaele W (1987) Participation in illegitimate activities: Ehrlich revisited. Report, U.S. Department of Justice, National Institute of Justice
Appendices
Appendix 1: Estimation of the parameters for the linear \(t\) CWM with \(t\)-factor analyzers using the AECM algorithm
Let \({\mathcal {S}}=\{({\varvec{x}}_i',y_i)';i=1,\ldots ,n\}\) be a sample of size \(n\). In the EM framework (Dempster et al. 1977), the generic observation \(({\varvec{x}}_i',y_i)'\) is viewed as being incomplete and its complete counterpart is given by \(({\varvec{x}}_i',y_i,{\varvec{u}}_{ig}',w_{ig},\varvec{z}_i')',\) where \(\varvec{z}_i\) is the component-label vector such that \(z_{ig}=1\) if \(({\varvec{x}}_i',y_i)'\) comes from component \(g\) and \(z_{ig}=0\) otherwise. The idea of the AECM algorithm (Meng and van Dyk 1997) is to partition \({\varvec{\theta }}\), say \({\varvec{\theta }}=({\varvec{\theta }}'_1,{\varvec{\theta }}'_2)'\), in such a way that the likelihood is easy to maximize for \({\varvec{\theta }}_1\) given \({\varvec{\theta }}_2\) and vice versa. The AECM algorithm consists of two cycles, with an E-step and a CM-step for each cycle. The two CM-steps correspond to the partition of \({\varvec{\theta }}\) into \({\varvec{\theta }}_1\) and \({\varvec{\theta }}_2\). The algorithm is iterated until convergence.
1.1 First cycle
Here, \({\varvec{\theta }}_1=\{\pi _g,{\varvec{\mu }}_g,{\varvec{\beta }}_g,\sigma ^2_g,\nu _g;g=1, \ldots , G\}\), where the missing data are the unobserved group labels \(\varvec{z}_i\), \(i=1,\ldots ,n\), and \(w_{ig}\). The complete-data likelihood is
The E-step on the first cycle of the \((k+1)\)th iteration requires the calculation of \(Q_1({\varvec{\theta }}_1; {\varvec{\theta }}^{(k)}) = {\mathbb {E}}_{{\varvec{\theta }}^{(k)}}\left[ l_{c1} ({\varvec{\theta }}_1)|{\mathcal {S}}\right] \), the expected complete-data log-likelihood given the observed data and \({\varvec{\theta }}^{(k)}\). In practice, it requires calculating \({\mathbb {E}}_{{\varvec{\theta }}^{(k)}}\left[ Z_{ig}|{\mathcal {S}}\right] \) and \({\mathbb {E}}_{{\varvec{\theta }}^{(k)}}\left[ W_{ig}|{\mathcal {S}}\right] \); this is achieved by replacing, respectively, each \(z_{ig}\) by \(z_{ig}^{(k+1)} = \frac{\tau _{ig}}{\sum _{j=1}^G\tau _{ij}},\) where \(\tau _{ig}\) is the product of \(\pi _g^{(k)}\) and the \(g\)th component joint density of \(({\varvec{x}}_i',y_i)'\) evaluated at \({\varvec{\theta }}^{(k)}\),
and each \(w_{ig}\) by
where \({\varvec{\varSigma }}^{(k)}_g={\varvec{\varLambda }}_g^{(k)}{\varvec{\varLambda }}_g^{(k)'}+{\varvec{\varPsi }}_g^{(k)}\). For the M-step, the maximization of \(L_{c1}({\varvec{\theta }}_1)\) yields the updates \(\pi _g^{(k+1)}\), \({\varvec{\mu }}_g^{(k+1)}\), \({\varvec{\beta }}_g^{(k+1)}\), and \(\sigma ^{2(k+1)}_g\).
The update for the degrees of freedom \(\nu _g\) is not available in closed form. McLachlan et al. (2007) show that \(\nu _g^{(k+1)}\) is the solution of the equation
where \(\psi (\cdot )\) denotes the digamma function and \(n_g^{(k+1)}=\sum _{i=1}^nz_{ig}^{(k+1)}\). Following the notation in McLachlan and Peel (2000), we set \({\varvec{\theta }}^{(k+1/2)}=\{{\varvec{\theta }}_1^{(k+1)},{\varvec{\theta }}_2^{(k)}\}\).
1.2 Second cycle
Here, \({\varvec{\theta }}_2=\left\{ {\varvec{\varSigma }}_g ; g=1, \ldots , G \right\} = \left\{ {\varvec{\varLambda }}_g,{\varvec{\varPsi }}_g;g=1, \ldots , G\right\} \), where the missing data are the unobserved group labels \(\varvec{z}_i\) and the latent factors \(\varvec{u}_{ig}\), \(i=1,\ldots ,n\) and \(g=1,\ldots ,G\). Therefore, the complete-data likelihood is
where \(Y\) is conditionally independent of \({\varvec{U}}\) given \({\varvec{X}}={\varvec{x}}\).
The E-step on the second cycle of the \((k+1)\)th iteration requires the calculation of \(Q_2({\varvec{\theta }}_2; {\varvec{\theta }}^{(k+1/2)}) = {\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}} \left[ l_{c2} ({\varvec{\theta }}_2)|{\mathcal {S}}\right] \). This involves calculating the following conditional expectations: \({\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}}(Z_{ig}|{\mathcal {S}})\), \({\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}}(W_{ig}|{\mathcal {S}})\), \({\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}}(Z_{ig} W_{ig}{\varvec{U}}_{ig} | {\mathcal {S}})\), and \({\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}} (Z_{ig}W_{ig}{\varvec{U}}_{ig} {\varvec{U}}'_{ig} |{\mathcal {S}})\). We have
and
where
Hence, the \(g\)th term of the expected complete-data log-likelihood \(Q_2({\varvec{\theta }}_2; {\varvec{\theta }}^{(k+1/2)})\) is
where \(\text {C}({\varvec{\theta }}_1^{(k+1)})\) includes the terms in the complete-data log-likelihood that do not depend on \({\varvec{\theta }}_2\). Maximization of (8) with respect to \({\varvec{\varLambda }}_g\) and \({\varvec{\varPsi }}_g\) then requires that they satisfy
Therefore,
From (9), we get \({\varvec{\varLambda }}_g^{(k+1)} = {\varvec{S}}^{(k+1)}_g {\varvec{\gamma }}^{(k)'}_g\left( {\varvec{\varTheta }}_g^{(k)}\right) ^{-1}\) and, substituting in (10), we obtain
which yields \( {\varvec{\varPsi }}_g^{(k+1)} =\text {diag} ( {\varvec{S}}^{(k+1)}_g- {\varvec{\varLambda }}_g^{(k+1)} {\varvec{\gamma }}^{(k)}_g{\varvec{S}}^{(k+1)}_g ). \)
Appendix 2: Estimation of the parameters for the linear \(t\) CWM with common \(t\)-factor analyzers using the AECM algorithm
As in “Appendix 1”, the AECM algorithm consists of two cycles, each with an E-step and a CM-step, iterated until convergence.
1.1 First cycle
Here, \({\varvec{\theta }}_1=\left\{ \pi _g,{\varvec{\xi }}_g,{\varvec{\beta }}_g,\sigma ^2_g,\nu _g;g=1, \ldots , G\right\} \), where the missing data are the unobserved group labels \(\varvec{z}_i\), \(i=1,\ldots ,n\), and the characteristic weights \(w_{ig}\). The complete-data likelihood is
where \({\varvec{\varSigma }}_g={\varvec{\varLambda }}{\varvec{\varOmega }}_g{\varvec{\varLambda }}'+{\varvec{\varPsi }}\).
As in the “First cycle” section of Appendix 1, \({\mathbb {E}}_{{\varvec{\theta }}^{(k)}}\left[ Z_{ig}|{\mathcal {S}}\right] \) is calculated by replacing each \(z_{ig}\) by \(z_{ig}^{(k+1)}=\frac{\tau _{ig}}{\sum _{j=1}^G\tau _{ij}}\), where
and \({\mathbb {E}}_{{\varvec{\theta }}^{(k)}}\left[ W_{ig}|{\mathcal {S}}\right] \) is computed by replacing each \(w_{ig}\) by
where \({\varvec{\varSigma }}_g^{(k)}={\varvec{\varLambda }}^{(k)}{\varvec{\varOmega }}_g^{(k)}{\varvec{\varLambda }}^{(k)'}+{\varvec{\varPsi }}^{(k)}\). For the M-step, the maximization of this complete-data log-likelihood yields
As in the “First cycle” section of Appendix 1, the degrees of freedom \(\nu _g\) are updated according to (6), and we set \({\varvec{\theta }}^{(k+1/2)}=\left\{ {\varvec{\theta }}_1^{(k+1)},{\varvec{\theta }}_2^{(k)}\right\} \).
1.2 Second cycle
Here, \({\varvec{\theta }}_2=\left\{ {\varvec{\varSigma }}_g ; g=1, \ldots , G \right\} = \left\{ {\varvec{\varLambda }},{\varvec{\varOmega }}_g,{\varvec{\varPsi }};g=1, \ldots , G\right\} \), where the missing data are the unobserved group labels \(\varvec{z}_i\) and the latent factors \(\varvec{u}_{ig}\), \(i=1,\ldots ,n\) and \(g=1,\ldots ,G\). Therefore, the complete-data likelihood is
because \(Y\) is conditionally independent of \({\varvec{U}}\) given \({\varvec{X}}={\varvec{x}}\). The E-step on the second cycle of the \((k+1)\)th iteration requires the calculation of \(Q_2({\varvec{\theta }}_2; {\varvec{\theta }}^{(k+1/2)})={\mathbb {E}}_{{\varvec{\theta }}^{(k+1/2)}}\left[ l_{c2}({\varvec{\theta }}_2)|{\mathcal {S}}\right] \). Based on (5), the expectations involved in calculating \(Q_2({\varvec{\theta }}_2; {\varvec{\theta }}^{(k+1/2)})\) are given by
where
and
Alternatively, the expectation in (11)–(13) can be written as
Then, the \(g\)th term of the expected complete-data log-likelihood \(Q_2({\varvec{\theta }}_2; {\varvec{\theta }}^{(k+1/2)})\) becomes
where \(\text {C}({\varvec{\theta }}_1^{(k+1)})\) denotes the terms in the complete-data log-likelihood that do not depend on \({\varvec{\theta }}_2\). The expected complete-data log-likelihood from the second cycle is maximized with respect to \({\varvec{\varOmega }}_g\) by solving
which yields
where
The expected complete-data log-likelihood from the second cycle is maximized with respect to \({\varvec{\varPsi }}\), using (16), by solving
This yields
The expected complete-data log-likelihood from the second cycle is maximized with respect to \({\varvec{\varLambda }}\), using (11)–(13), by solving
which yields