Abstract
Factor Analysis (FA) is a well-established probabilistic approach to unsupervised learning for complex systems involving correlated variables in high-dimensional spaces. FA aims principally to reduce the dimensionality of the data by projecting high-dimensional vectors onto lower-dimensional spaces. However, because of its inherent linearity, the generic FA model is essentially unable to capture data complexity when the input space is nonhomogeneous. A finite Mixture of Factor Analysers (MFA) is a globally nonlinear and therefore more flexible extension of the basic FA model that overcomes the above limitation by combining the local factor analysers of each cluster of the heterogeneous input space. The structure of the MFA model offers the potential to model the density of high-dimensional observations adequately while also allowing both clustering and local dimensionality reduction. Many aspects of the MFA model have recently come under close scrutiny, from both the likelihood-based and the Bayesian perspectives. In this paper, we adopt a Bayesian approach, and more specifically a treatment that bases estimation and inference on the stochastic simulation of the posterior distributions of interest. We first treat the case where the number of mixture components and the number of common factors are known and fixed, and we derive an efficient Markov Chain Monte Carlo (MCMC) algorithm based on Data Augmentation to perform inference and estimation. We also consider the more general setting where there is uncertainty about the dimensionalities of the latent spaces (number of mixture components and number of common factors unknown), and we estimate the complexity of the model by using the sample paths of an ergodic Markov chain obtained through the simulation of a continuous-time stochastic birth-and-death point process. The main strengths of our algorithms are that they are both efficient (they rely entirely on familiar, standard distributions that are easy to sample from, and many characteristics of interest are by-products of the same process) and easy to interpret. Moreover, they are straightforward to implement and offer the possibility of assessing the goodness of the results obtained. Experimental results on both artificial and real data reveal that our approach performs well, and can therefore be envisaged as an alternative to the other approaches used for this model.
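To fix ideas, here is a minimal sketch of the MFA model and of the data augmentation described above, in standard notation; the symbols used here (K mixture components, q common factors, mixing weights \pi_k, means \mu_k, factor-loading matrices \Lambda_k and diagonal uniqueness matrices \Psi_k) are generic labels rather than the paper's own, and the conjugate priors mentioned below are an assumption of this sketch, not the paper's exact prior specification. The marginal density of a p-dimensional observation x_i is

$$
p(x_i \mid \theta) \;=\; \sum_{k=1}^{K} \pi_k \,\mathcal{N}\!\bigl(x_i \mid \mu_k,\; \Lambda_k \Lambda_k^{\top} + \Psi_k\bigr),
\qquad \sum_{k=1}^{K} \pi_k = 1,
$$

which, conditionally on membership of component k, has the latent-variable representation

$$
x_i \;=\; \mu_k + \Lambda_k f_i + \varepsilon_i,
\qquad f_i \sim \mathcal{N}_q(0, I_q),
\qquad \varepsilon_i \sim \mathcal{N}_p(0, \Psi_k),
\qquad \Psi_k \text{ diagonal.}
$$

Data augmentation treats the component allocations and the factor scores f_i as missing data: conditional on them, and under standard conjugate priors, the full conditional distributions of the weights, means, loadings and uniquenesses all take familiar forms that are easy to sample from, which is what makes a Gibbs-type MCMC sweep straightforward; the sampled allocations and factor scores then deliver the clustering and local dimensionality reduction as by-products of the same run.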
Cite this article
Fokoué, E., Titterington, D. Mixtures of Factor Analysers. Bayesian Estimation and Inference by Stochastic Simulation. Machine Learning 50, 73–94 (2003). https://doi.org/10.1023/A:1020297828025