Mixture model averaging for clustering

Wei, Yuhong; McNicholas, Paul D.

doi:10.1007/s11634-014-0182-6

Regular Article
Published: 26 August 2014

Advances in Data Analysis and Classification, Volume 9, pages 197–217 (2015)

Abstract

In mixture model-based clustering applications, it is common to fit several models from a family and report clustering results from only the ‘best’ one. In such circumstances, selection of this best model is achieved using a model selection criterion, most often the Bayesian information criterion. Rather than throw away all but the best model, we average multiple models that are in some sense close to the best one, thereby producing a weighted average of clustering results. Two (weighted) averaging approaches are considered: averaging component membership probabilities and averaging models. In both cases, Occam’s window is used to determine closeness to the best model and weights are computed within a Bayesian model averaging paradigm. In some cases, we need to merge components before averaging; we introduce a method for merging mixture components based on the adjusted Rand index. The effectiveness of our model-based clustering averaging approaches is illustrated using a family of Gaussian mixture models on real and simulated data.
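
The first averaging approach described above (averaging component membership probabilities over the models inside Occam's window) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the "larger is better" convention BIC = 2·loglik − (number of parameters)·log(n), so that exp(BIC/2) approximates each model's marginal likelihood up to a constant, and it assumes equal prior model probabilities. The function names, the window ratio of 20, and the toy BIC values and membership matrices are all hypothetical. The sketch also assumes every model has the same number of components with labels already aligned; the merging step based on the adjusted Rand index is not shown.

```python
import numpy as np


def occams_window_weights(bic, ratio=20.0):
    """Return (kept indices, normalized weights) for models in Occam's window.

    Assumption: BIC = 2*loglik - n_params*log(n) (larger is better), so
    exp(BIC / 2) approximates the marginal likelihood up to a constant.
    `ratio` is a hypothetical window width: a model is kept when the best
    model is at most `ratio` times more probable than it.
    """
    bic = np.asarray(bic, dtype=float)
    # Evidence relative to the best model; subtracting the max avoids overflow.
    rel = np.exp(0.5 * (bic - bic.max()))
    keep = np.flatnonzero(rel >= 1.0 / ratio)
    weights = rel[keep] / rel[keep].sum()
    return keep, weights


def average_memberships(z_list, weights):
    """Weighted average of n-by-G membership probability matrices.

    Assumption: all matrices share the same shape and component labels
    have already been aligned across models.
    """
    z = np.stack([np.asarray(m, dtype=float) for m in z_list])
    # Contract the model axis: result has shape (n, G).
    return np.tensordot(weights, z, axes=1)


if __name__ == "__main__":
    # Toy inputs (hypothetical): three fitted models, two observations.
    bic = [-1000.0, -1002.0, -1020.0]
    z_all = [
        [[0.9, 0.1], [0.2, 0.8]],
        [[0.6, 0.4], [0.4, 0.6]],
        [[0.5, 0.5], [0.5, 0.5]],
    ]
    keep, w = occams_window_weights(bic)
    z_avg = average_memberships([z_all[i] for i in keep], w)
    labels = z_avg.argmax(axis=1)  # hard clustering from averaged probabilities
    print(keep, w, labels)
```

Because the weights sum to one and each row of every membership matrix sums to one, each row of the averaged matrix also sums to one, so it can be read directly as a set of averaged membership probabilities.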

Acknowledgments

The authors gratefully acknowledge the very helpful comments and suggestions of an associate editor and three anonymous reviewers. The authors are grateful to Professor Adrian Raftery and other members of the University of Washington Working Group on Model-Based Clustering for their comments and suggestions on an earlier version of this work.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of Guelph, Guelph, ON, N1G 2W1, Canada
Yuhong Wei
Department of Mathematics and Statistics, McMaster University, Hamilton, ON, L8S 4L8, Canada
Paul D. McNicholas

Corresponding author

Correspondence to Paul D. McNicholas.

Additional information

This work was supported by an Ontario Graduate Scholarship, an Early Researcher Award from the Ontario Ministry of Research and Innovation, a grant-in-aid from Compusense Inc., and a Collaborative Research and Development Grant from the Natural Sciences and Engineering Research Council of Canada.

About this article

Cite this article

Wei, Y., McNicholas, P.D. Mixture model averaging for clustering. Adv Data Anal Classif 9, 197–217 (2015). https://doi.org/10.1007/s11634-014-0182-6

Received: 24 June 2013
Revised: 26 July 2014
Accepted: 07 August 2014
Published: 26 August 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s11634-014-0182-6

Mathematics Subject Classification

62H30
