{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T05:14:32Z","timestamp":1696828472385},"reference-count":32,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2012,5,3]],"date-time":"2012-05-03T00:00:00Z","timestamp":1336003200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Statistical Analysis"],"published-print":{"date-parts":[[2012,8]]},"abstract":"Many modern data mining applications are concerned with the analysis of datasets in which the observations are described by paired high\u2010dimensional vectorial representations or \u2018views\u2019. Some typical examples can be found in web mining and genomics applications. In this article we present an algorithm for data clustering with multiple views, multi\u2010view predictive partitioning (MVPP), which relies on a novel criterion of predictive similarity between data points. We assume that, within each cluster, the dependence between multivariate views can be modeled by using a two\u2010block partial least squares (TB\u2010PLS) regression model, which performs dimensionality reduction and is particularly suitable for high\u2010dimensional settings. The proposed MVPP algorithm partitions the data such that the within\u2010cluster predictive ability between views is maximized. The proposed objective function depends on a measure of predictive influence of points under the TB\u2010PLS model which has been derived as an extension of the predicted residual sums of squares (PRESS) statistic commonly used in ordinary least squares regression. 
Using simulated data, we compare the performance of MVPP to that of competing multi\u2010view clustering methods which rely upon geometric structures of points, but ignore the predictive relationship between the two views. State\u2010of\u2010art results are obtained on benchmark web mining datasets. \u00a9 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012","DOI":"10.1002\/sam.11144","type":"journal-article","created":{"date-parts":[[2012,5,3]],"date-time":"2012-05-03T18:09:32Z","timestamp":1336068572000},"page":"304-321","source":"Crossref","is-referenced-by-count":2,"title":["Multi\u2010view predictive partitioning in high dimensions"],"prefix":"10.1002","volume":"5","author":[{"given":"Brian","family":"McWilliams","sequence":"first","affiliation":[]},{"given":"Giovanni","family":"Montana","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2012,5,3]]},"reference":[{"key":"e_1_2_9_2_2","doi-asserted-by":"publisher","DOI":"10.1093\/biostatistics\/kxp008"},{"key":"e_1_2_9_3_2","unstructured":"S. Bickel and T. Scheffer Multi\u2010view clustering In Proceedings of the IEEE International Conference on Data Mining Citeseer 2004."},{"key":"e_1_2_9_4_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007379606734"},{"key":"e_1_2_9_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-010-5169-8"},{"key":"e_1_2_9_6_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972788.72"},{"key":"e_1_2_9_7_2","first-page":"723","volume-title":"Advances in Neural Information Processing Systems","author":"Lange T.","year":"2006"},{"key":"e_1_2_9_8_2","doi-asserted-by":"crossref","unstructured":"B. Long P. Yu and Z. Zhang A general model for multiple view unsupervised learning In Proceedings of the 2008 SIAM International Conference on Data Mining 
2008.","DOI":"10.1137\/1.9781611972788.74"},{"key":"e_1_2_9_9_2","doi-asserted-by":"crossref","first-page":"736","DOI":"10.1145\/1571941.1572103","volume-title":"Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Bruno E.","year":"2009"},{"key":"e_1_2_9_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04277-5_21"},{"key":"e_1_2_9_11_2","doi-asserted-by":"crossref","unstructured":"K. Chaudhuri S. M. Kakade K. Livescu and K. Sridharan Multi\u2010view clustering via canonical correlation analysis In Proceedings of the 26th Annual International Conference on Machine Learning \u2010 ICML '09 2009.1\u20138.","DOI":"10.1145\/1553374.1553391"},{"key":"e_1_2_9_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-009-5157-z"},{"key":"e_1_2_9_13_2","unstructured":"J. Wegelin A Survey of Partial Least Squares (PLS) Methods with Emphasis on the Two\u2010Block Case Technical Report University of Washington 2000."},{"key":"e_1_2_9_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/11752790_2"},{"key":"e_1_2_9_15_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-10-34"},{"key":"e_1_2_9_16_2","doi-asserted-by":"publisher","DOI":"10.1002\/sam.10074"},{"key":"e_1_2_9_17_2","doi-asserted-by":"publisher","DOI":"10.3150\/bj\/1106314847"},{"key":"e_1_2_9_18_2","doi-asserted-by":"publisher","DOI":"10.1214\/ss\/1056397488"},{"key":"e_1_2_9_19_2","unstructured":"S. Dudoit J. Fridlyand and T. Speed Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data Technical Report UC Berkeley 2000."},{"key":"e_1_2_9_20_2","doi-asserted-by":"publisher","DOI":"10.2202\/1544-6115.1406"},{"key":"e_1_2_9_21_2","volume-title":"Regression diagnostics: Identifying influential data and sources of collinearity","author":"Belsley 
D.","year":"2004"},{"key":"e_1_2_9_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0003-2670(01)01040-6"},{"key":"e_1_2_9_23_2","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1176345513"},{"key":"e_1_2_9_24_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/61.3.509"},{"key":"e_1_2_9_25_2","doi-asserted-by":"publisher","DOI":"10.1137\/0905052"},{"key":"e_1_2_9_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0950-3293(99)00039-7"},{"key":"e_1_2_9_27_2","volume-title":"Matrix Computations","author":"Golub G.","year":"1996"},{"key":"e_1_2_9_28_2","first-page":"1","volume-title":"Manchester, UK, 11th Annual Workshop on Computational Intelligence,","author":"McWilliams B.","year":"2010"},{"key":"e_1_2_9_29_2","doi-asserted-by":"publisher","DOI":"10.1561\/0400000025"},{"key":"e_1_2_9_30_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2009.07.015"},{"key":"e_1_2_9_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/11811305_19"},{"key":"e_1_2_9_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/11564096_9"},{"key":"e_1_2_9_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-45528-0"}],"container-title":["Statistical Analysis and Data Mining: The ASA Data Science 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fsam.11144","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fsam.11144","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/sam.11144","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,8]],"date-time":"2023-10-08T12:13:59Z","timestamp":1696767239000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/sam.11144"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,5,3]]},"references-count":32,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,8]]}},"alternative-id":["10.1002\/sam.11144"],"URL":"https:\/\/doi.org\/10.1002\/sam.11144","archive":["Portico"],"relation":{},"ISSN":["1932-1864","1932-1872"],"issn-type":[{"value":"1932-1864","type":"print"},{"value":"1932-1872","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,5,3]]}}}