Transformation mixture modeling for skewed data groups with heavy tails and scatter

  • Original paper
  • Computational Statistics

Abstract

For decades, Gaussian mixture models have been the most popular mixtures in the literature. However, the adequacy of the fit provided by Gaussian components is often in question. Various distributions capable of modeling skewness or heavy tails have recently been considered in this context. In this paper, we propose a novel contaminated transformation mixture model that is built on the idea of transformation to symmetry and can account for skewness and heavy tails while automatically assigning scatter to secondary components.
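To make the idea of a contaminated transformation component concrete, the following is a minimal univariate sketch and not the authors' implementation. The abstract does not state which transformation or contamination scheme is used, so the sketch assumes the Manly exponential transformation and a two-part contaminated normal in which a proportion alpha of points follows the core component and the rest follows a variance-inflated "scatter" component. All function and parameter names (manly, contaminated_manly_pdf, lam, alpha, eta) are hypothetical.

```python
import math

def manly(x, lam):
    """Manly exponential transformation (assumed here); identity when lam == 0."""
    if lam == 0.0:
        return x
    return math.expm1(lam * x) / lam

def normal_pdf(y, mu, var):
    """Univariate normal density with mean mu and variance var."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def contaminated_manly_pdf(x, mu, var, lam, alpha, eta):
    """Density of x when manly(x, lam) is modeled by a contaminated normal.

    alpha: assumed proportion of regular points (0 < alpha <= 1)
    eta:   assumed variance inflation (> 1) of the scatter part
    Note: for lam != 0 the transformed range is bounded on one side,
    so this integrates to one only approximately.
    """
    y = manly(x, lam)
    core = alpha * normal_pdf(y, mu, var)
    scatter = (1.0 - alpha) * normal_pdf(y, mu, eta * var)
    jacobian = math.exp(lam * x)  # derivative of manly(x, lam) with respect to x
    return (core + scatter) * jacobian

def mixture_pdf(x, weights, components):
    """Finite mixture of contaminated transformation components."""
    return sum(w * contaminated_manly_pdf(x, **c)
               for w, c in zip(weights, components))

# Hypothetical two-component mixture: one skewed group, one symmetric group.
components = [
    dict(mu=0.0, var=1.0, lam=0.3, alpha=0.9, eta=9.0),
    dict(mu=5.0, var=0.5, lam=0.0, alpha=1.0, eta=1.0),
]
density_at_one = mixture_pdf(1.0, [0.6, 0.4], components)
```

Under these assumptions, lam controls skewness (lam = 0 recovers a symmetric component), while alpha and eta control how much mass is assigned to the wider scatter part, which is what thickens the tails.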

Corresponding author

Correspondence to Xuwen Zhu.

Cite this article

Melnykov, Y., Zhu, X. & Melnykov, V. Transformation mixture modeling for skewed data groups with heavy tails and scatter. Comput Stat 36, 61–78 (2021). https://doi.org/10.1007/s00180-020-01009-8

  • DOI: https://doi.org/10.1007/s00180-020-01009-8
