Abstract
A new robust fuzzy regression clustering method is proposed. We estimate the coefficients of a linear regression model in each unknown cluster. The method achieves robustness by trimming a fixed proportion of observations. Assignments to clusters are fuzzy: an observation may contribute to the estimates of more than one cluster. We describe general criteria for tuning the method. The proposed method appears to be robust with respect to different types of contamination.
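The three ingredients named in the abstract (one regression per cluster, fuzzy memberships, trimming of a fixed proportion) can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified fuzzy c-regression loop with one predictor, where the fuzzifier `m`, the trimming level `alpha`, the rule "trim the observations whose smallest squared residual is largest", and all function names are assumptions made for illustration only.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def trimmed_fuzzy_line_clustering(x, y, k=2, m=2.0, alpha=0.1, n_iter=50, seed=0):
    """Illustrative trimmed fuzzy regression clustering (hypothetical helper).

    Returns (coefs, u, trimmed): one (intercept, slope) pair per cluster,
    the n-by-k membership matrix, and the sorted indices of trimmed points.
    Assumes m > 1 and 0 <= alpha < 1.
    """
    rng = random.Random(seed)
    n = len(x)
    n_trim = int(math.floor(alpha * n))  # fixed number of trimmed observations
    # Random initial memberships, each row normalised to sum to one.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])
    kept = set(range(n))
    coefs = []
    for _ in range(n_iter):
        # Regression step: one weighted fit per cluster, weights u_ij ** m.
        coefs = [
            weighted_line_fit(x, y, [u[i][j] ** m for i in range(n)])
            for j in range(k)
        ]
        # Squared residuals to every cluster line (floored to avoid 0 division).
        d = [
            [max((y[i] - a - b * x[i]) ** 2, 1e-12) for (a, b) in coefs]
            for i in range(n)
        ]
        # Trimming step: discard the points that fit their best line worst.
        order = sorted(range(n), key=lambda i: min(d[i]))
        kept = set(order[: n - n_trim])
        # Membership step: fuzzy c-means style update; trimmed rows get zeros.
        for i in range(n):
            if i in kept:
                inv = [dij ** (-1.0 / (m - 1.0)) for dij in d[i]]
                total = sum(inv)
                u[i] = [v / total for v in inv]
            else:
                u[i] = [0.0] * k
    return coefs, u, sorted(set(range(n)) - kept)
```

Because trimmed observations receive zero membership, they carry no weight in the next regression step, which is the mechanism by which trimming limits the influence of outliers in this sketch. Results depend on the random start, so in practice several seeds would be tried.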
Acknowledgments
The authors are grateful to three referees and the Associate Editor for several constructive suggestions. This research was partially supported by the Spanish Ministerio de Economía y Competitividad, Grant MTM2014-56235-C2-1-P, and by Consejería de Educación de la Junta de Castilla y León, Grant VA212U13.
About this article
Cite this article
Dotto, F., Farcomeni, A., García-Escudero, L.A. et al. A fuzzy approach to robust regression clustering. Adv Data Anal Classif 11, 691–710 (2017). https://doi.org/10.1007/s11634-016-0271-9
DOI: https://doi.org/10.1007/s11634-016-0271-9