Abstract
A new algorithm for clustering documents and words simultaneously has recently been presented. Like most spectral clustering algorithms, it requires prior knowledge of the number of clusters. In this paper, we explore a morphology-based method for determining the number of clusters present in a given dataset when co-clustering documents and words. The proposed method relies on two feature extraction steps: a VAT (Visual Assessment of Cluster Tendency) image representation of the input matrix generated by spectral co-clustering of documents and words, and texture information obtained by filtering the VAT image. The number of clusters is then reported by computing the eigengap of the gray-scale matrix of the filtered image. Our experimental results show that the proposed method works well in practice.
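The final step, reading a cluster count off an eigengap, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the filter or how the gray-scale matrix is built, so the example assumes a symmetric matrix is already available and uses a toy block-diagonal matrix as a stand-in. The function name `eigengap_cluster_count` and the parameter `max_k` are hypothetical, and the rule used here (largest gap between consecutive eigenvalues sorted in descending order) is one common form of the eigengap heuristic.

```python
import numpy as np


def eigengap_cluster_count(matrix, max_k=None):
    """Estimate a cluster count from the largest eigengap.

    Assumption: `matrix` is a square, (near-)symmetric array, e.g. a
    gray-scale image matrix; it is symmetrized before decomposition.
    """
    sym = (matrix + matrix.T) / 2.0
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.linalg.eigvalsh(sym)[::-1]
    if max_k is None:
        max_k = len(eigvals) - 1
    # gaps[i] is the drop from the (i+1)-th to the (i+2)-th largest eigenvalue.
    gaps = eigvals[:max_k] - eigvals[1:max_k + 1]
    # The index of the largest drop, plus one, is the estimated count.
    return int(np.argmax(gaps)) + 1


# Toy stand-in: three dense diagonal blocks of sizes 4, 5 and 6.
sizes = [4, 5, 6]
toy = np.zeros((sum(sizes), sum(sizes)))
start = 0
for s in sizes:
    toy[start:start + s, start:start + s] = 1.0
    start += s

k = eigengap_cluster_count(toy)
print(k)
```

On real data the blocks are noisy, so the gap is less sharp; restricting the search with `max_k` is one way to avoid picking up a spurious gap deep in the spectrum.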
This work is supported by the National Natural Science Foundation of China (No.61073133, No.60973067).
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Liu, N., Lu, M. (2011). A Morphology Method for Determining the Number of Clusters Present in Spectral Co-clustering Documents and Words. In: Akiyama, J., Bo, J., Kano, M., Tan, X. (eds) Computational Geometry, Graphs and Applications. CGGA 2010. Lecture Notes in Computer Science, vol 7033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24983-9_13
DOI: https://doi.org/10.1007/978-3-642-24983-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24982-2
Online ISBN: 978-3-642-24983-9
eBook Packages: Computer Science, Computer Science (R0)