Abstract
Clustering objects according to their shape is of key importance in many scientific fields. In this paper we focus on the case where the shape of an object is represented by a configuration matrix of landmarks. It is well known that the resulting shape space has a finite-dimensional Riemannian manifold structure, which is non-Euclidean and therefore difficult to work with. Papers about clustering on this space are scarce in the literature. The \(k\)-means algorithm rests on the fact that the sample mean minimizes the sum of squared Euclidean distances from the points of a cluster to its centroid. Our idea is therefore to integrate Procrustes type distances and the Procrustes mean into the \(k\)-means algorithm in order to adapt it to the shape analysis context. As far as we know, there have been just two previous attempts in this direction. We propose to adapt the classical Lloyd \(k\)-means algorithm to Statistical Shape Analysis, focusing on the three-dimensional case. We compare its performance with that of the Hartigan-Wong \(k\)-means algorithm, which was previously adapted to Statistical Shape Analysis. We show that the Lloyd version performs better and, finally, we propose to add a trimmed procedure. We apply both to a 3D database obtained from an anthropometric survey of the Spanish female population conducted in 2006. The algorithms presented in this paper are available in the Anthropometry R package, whose most recent version is available from the Comprehensive R Archive Network.
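The idea of replacing the Euclidean distance and the sample mean in Lloyd's iteration with Procrustes counterparts can be illustrated with a minimal sketch. It is not the implementation in the Anthropometry package. It assumes NumPy, uses the partial Procrustes distance (rotation only, after removing location and scale) as one possible Procrustes type distance, and approximates the Procrustes mean by a simple iterative alignment. All function names, the `init` parameter, and the toy configurations are hypothetical.

```python
import numpy as np

def preshape(X):
    """Remove location and scale: center the (landmarks x 3) matrix, scale to unit norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def rotate_onto(X, M):
    """Rotate preshape X to best match M (orthogonal Procrustes via SVD, no reflections)."""
    U, _, Vt = np.linalg.svd(X.T @ M)
    if np.linalg.det(U @ Vt) < 0:   # flip the weakest axis to stay a proper rotation
        U[:, -1] *= -1
    return X @ (U @ Vt)

def procrustes_dist(X, M):
    """Partial Procrustes distance between two preshapes (assumed distance choice)."""
    return np.linalg.norm(rotate_onto(X, M) - M)

def procrustes_mean(shapes, n_iter=10):
    """Approximate Procrustes mean: align all shapes to the current mean, then average."""
    mean = shapes[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(S, mean) for S in shapes])
        mean = preshape(aligned.mean(axis=0))
    return mean

def lloyd_shapes(configs, k, init=None, n_iter=20, seed=0):
    """Lloyd-style k-means on landmark configurations with Procrustes distance and mean."""
    shapes = np.array([preshape(np.asarray(X, dtype=float)) for X in configs])
    if init is None:                # random initial centers drawn from the data
        init = np.random.default_rng(seed).choice(len(shapes), size=k, replace=False)
    centers = shapes[np.asarray(init)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center in Procrustes distance.
        dists = np.array([[procrustes_dist(S, C) for C in centers] for S in shapes])
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                   # assignments are stable, so stop
        labels = new_labels
        # Update step: Procrustes mean of each non-empty cluster.
        for j in range(k):
            members = shapes[labels == j]
            if len(members) > 0:
                centers[j] = procrustes_mean(members)
    return labels, centers

def rot_z(theta):
    """Rotation matrix about the z axis (used only to build the toy data)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy data: two base shapes, each observed under different rotations, scales and shifts.
A = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
B = np.array([[0, 0, 0], [3, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
configs = [s * base @ rot_z(t).T + shift
           for base in (A, B)
           for s, t, shift in [(1.0, 0.0, 0.0), (2.0, 0.7, 5.0), (0.5, 2.1, -3.0)]]

labels, centers = lloyd_shapes(configs, k=2, init=(0, 3))
print(labels)
```

Because location, scale and rotation are filtered out before distances are computed, the three transformed copies of each base configuration are treated as the same shape. Passing `init` fixes the starting centers, which makes the toy run reproducible; with random starts, several restarts would normally be needed, as for ordinary \(k\)-means.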
Cite this article
Vinué, G., Simó, A. & Alemany, S. The \(k\)-means algorithm for 3D shapes with an application to apparel design. Adv Data Anal Classif 10, 103–132 (2016). https://doi.org/10.1007/s11634-014-0187-1
DOI: https://doi.org/10.1007/s11634-014-0187-1
Keywords
- Shape space
- Statistical shape analysis
- \(k\)-means algorithm
- Procrustes type distances
- Procrustes mean shape
- Sizing systems