Abstract
Clustering is a major technique in data mining. However, the numerical feedback of clustering algorithms makes it difficult for users to gain an intuitive overview of the dataset they are dealing with. Visualization has proven very helpful for high-dimensional data analysis, so it is desirable to bring visualization techniques, together with the user's domain knowledge, into the clustering process. Most existing visualization techniques used in clustering, however, are exploration oriented and are therefore mainly stochastic and subjective in nature. In this paper, we introduce an approach called HOV3 (Hypothesis Oriented Verification and Validation by Visualization), which projects high-dimensional data onto 2D space and reflects the data distribution based on user hypotheses. In addition, HOV3 enables users to adjust hypotheses iteratively in order to obtain an optimized view. As a result, HOV3 provides users with an efficient and effective visualization method for exploring cluster information.
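The abstract does not give the projection formula, so the following is only a minimal sketch of one way a hypothesis-driven 2D projection could look, not the authors' implementation. It assumes a star-coordinates-style layout in which each dimension gets an axis evenly spaced around the unit circle, and a per-dimension weight vector stands in for the user hypothesis. The names `project_2d` and `weights` are hypothetical.

```python
import numpy as np


def project_2d(data, weights):
    """Project rows of `data` (n_samples x n_dims) onto the plane.

    Assumption: dimension k is drawn along the direction exp(2*pi*i*k/n_dims),
    and `weights` (one value per dimension) plays the role of the hypothesis.
    """
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Min-max normalize each column to [0, 1].
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    normalized = (data - lo) / span

    # One unit axis per dimension, evenly spaced around the circle.
    n_dims = data.shape[1]
    axes = np.exp(2j * np.pi * np.arange(n_dims) / n_dims)

    # Weighted sum of axis directions gives one complex number per sample.
    z = normalized @ (weights * axes)
    return np.column_stack([z.real, z.imag])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.random((5, 4))
    # Re-projecting with a different weight vector mimics adjusting a hypothesis.
    print(project_2d(samples, [1.0, 1.0, 1.0, 1.0]))
    print(project_2d(samples, [1.0, 0.2, 0.2, 1.0]))
```

Under these assumptions, changing `weights` and projecting again corresponds to the iterative adjustment of hypotheses that the abstract describes: dimensions the user considers important are stretched, and the others are compressed.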
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, KB., Orgun, M.A., Zhang, K. (2006). HOV3: An Approach to Visual Cluster Analysis. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_35
DOI: https://doi.org/10.1007/11811305_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science (R0)