Abstract
As data-capturing modalities proliferate, the same data object often leaves several kinds of digital footprints. This naturally leads to datasets in which the same set of data objects is represented in different forms, called multi-view data. Clustering, the task of partitioning data objects into groups of related objects, is among the most fundamental tasks in unsupervised learning. Multi-view clustering (MVC) is a flourishing field within unsupervised learning; the MVC task leverages multiple views of the data objects to arrive at a more effective and accurate grouping than can be achieved using a single view. Multi-view clustering methods differ in the modelling they use to fuse multiple views: each manages the synergies, complementarities, and conflicts across data views in its own way and arrives at a single clustering output for the dataset. This chapter surveys a sample of multi-view clustering methods, with an emphasis on bringing out the wide diversity of solution formulations that have been considered. We pay specific attention to helping the reader understand the intuition behind each method before describing its technical details, so that the survey is accessible to readers who may not be machine learning specialists. We also outline some popular datasets that have been used to empirically evaluate MVC methods.
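To make the multi-view setting concrete, the following is a minimal sketch and not any of the surveyed methods. It assumes a toy dataset of six objects described by two hypothetical views (`view_a`, `view_b`), and it fuses them in the most naive way possible, by concatenating each object's feature vectors before running plain k-means. The helper names `kmeans` and `concat_views` are illustrative only. Real MVC methods typically weight or reconcile views rather than simply concatenating them, since unnormalised concatenation lets the view with the largest scale dominate.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def concat_views(views):
    """Naive fusion: concatenate each object's features across all views."""
    n_objects = len(views[0])
    return [tuple(x for view in views for x in view[i]) for i in range(n_objects)]


# Two hypothetical views of the same six objects (toy values, not real data).
# In view_a, the third object sits between the two groups and is ambiguous.
view_a = [(0.0, 0.1), (0.2, 0.0), (2.6, 2.6), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
# In view_b, the same object clearly belongs with the first two.
view_b = [(1.0,), (1.1,), (0.9,), (9.0,), (9.2,), (8.8,)]

single_view_labels = kmeans(view_a, k=2)
fused_labels = kmeans(concat_views([view_a, view_b]), k=2)

print(single_view_labels)  # [0, 0, 1, 1, 1, 1]
print(fused_labels)        # [0, 0, 0, 1, 1, 1]
```

In this toy case the second view resolves the ambiguity about the third object that the first view alone cannot, which is the basic motivation for fusing views.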
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
P, D., Jurek-Loughrey, A. (2019). Multi-View Clustering. In: P, D., Jurek-Loughrey, A. (eds) Linking and Mining Heterogeneous and Multi-view Data. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-01872-6_2
DOI: https://doi.org/10.1007/978-3-030-01872-6_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01871-9
Online ISBN: 978-3-030-01872-6
eBook Packages: Engineering, Engineering (R0)