Abstract
Data understanding is an iterative process in which domain experts combine their knowledge with the data at hand to explore and confirm hypotheses. Visualizations are an important set of tools for exploring hypotheses about data. Often, however, traditional unsupervised dimensionality reduction algorithms are used for visualization. These algorithms allow for interaction, i.e., exploring different visualizations, only by manipulating technical parameters of the algorithm. Therefore, instead of interacting intuitively with the visualization, domain experts have to learn and reason about these technical parameters. In this paper we propose a knowledge-based kernel PCA approach that allows for intuitive interaction with data visualizations. Each embedding direction is given by a non-convex quadratic optimization problem over an ellipsoid and has a globally optimal solution in the kernel feature space. A solution can be found in polynomial time using the algorithm presented in this paper. To facilitate direct feedback, i.e., updating the whole embedding with a sufficiently high frame rate during interaction, we reduce the computational complexity further by incremental up- and down-dating. Our empirical evaluation demonstrates the flexibility and utility of this approach.
Parts of this work have been presented in workshops [18, 19].
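For orientation, the unsupervised baseline that the abstract contrasts with can be sketched in a few lines. The following is a minimal sketch of plain kernel PCA, assuming an RBF kernel and NumPy; it is not the knowledge-based variant proposed in the paper, and the function names `rbf_kernel` and `kernel_pca_embedding` as well as the parameter `gamma` are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_pca_embedding(K, n_components=2):
    """Embed the training points with standard (unsupervised) kernel PCA."""
    n = K.shape[0]
    # Center the kernel matrix in feature space.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)
    # Projection of the training points onto component k is sqrt(lam_k) * v_k.
    return eigvecs[:, order] * np.sqrt(lam)


# Hypothetical data: 50 points in 5 dimensions, embedded into the plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Z = kernel_pca_embedding(rbf_kernel(X, gamma=0.1), n_components=2)
```

In this baseline the only handle for changing the picture is a technical parameter such as `gamma`, which is the limitation the abstract describes.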
References
1. Andrews, C., Endert, A., North, C.: Space to Think: Large High-resolution Displays for Sensemaking. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 55–64. ACM (2010)
2. Arbenz, P.: Lecture Notes on Solving Large Scale Eigenvalue Problems, pp. 77–93. ETH Zürich (2012)
3. Bunch, J.R., Nielsen, C.P., Sorensen, D.: Rank-one modification of the symmetric eigenproblem. Numerische Mathematik 31 (1978)
4. Callahan, E., Koenemann, J.: A comparative usability evaluation of user interfaces for online product catalog. In: Proceedings of the 2nd ACM Conference on Electronic Commerce (EC), pp. 197–206. ACM (2000)
5. Chapelle, O., Schölkopf, B., Zien, A. (eds.): Semi-Supervised Learning. MIT Press (2006)
6. Cox, T.F., Cox, M.A.A.: Multidimensional Scaling. Chapman and Hall/CRC (2000)
7. Dinuzzo, F., Schölkopf, B.: The representer theorem for Hilbert spaces: A necessary and sufficient condition. In: Proceedings of the Conference on Neural Information Processing Systems (NIPS), pp. 189–196 (2012)
8. Endert, A., Han, C., Maiti, D., House, L., Leman, S., North, C.: Observation-level interaction with statistical models for visual analytics. In: IEEE VAST, pp. 121–130. IEEE (2011)
9. Forsythe, G.E., Golub, G.H.: On the Stationary Values of a Second-Degree Polynomial on the Unit Sphere. Journal of the Society for Industrial and Applied Mathematics 13(4), 1050–1068 (1965)
10. Gander, W.: Least Squares with a Quadratic Constraint. Numerische Mathematik 36, 291–308 (1981)
11. Gander, W., Golub, G., von Matt, U.: A constrained eigenvalue problem. Linear Algebra and its Applications 114-115, 815–839 (1989)
12. Ham, J., Lee, D.D., Mika, S., Schölkopf, B.: A Kernel View of the Dimensionality Reduction of Manifolds. In: Proceedings of the 21st International Conference on Machine Learning (2004)
13. Izenman, A.J.: Linear Discriminant Analysis. Springer (2008)
14. Jeong, D.H., Ziemkiewicz, C., Fisher, B.D., Ribarsky, W., Chang, R.: iPCA: An Interactive System for PCA-based Visual Analytics. Comput. Graph. Forum 28(3), 767–774 (2009)
15. Jolliffe, I.T.: Principal Component Analysis. Springer (1986)
16. Leman, S., House, L., Maiti, D., Endert, A., North, C.: Visual to Parametric Interaction (V2PI). PLoS One 8, e50474 (2013)
17. Li, R.C.: Solving secular equations stably and efficiently (1993)
18. Paurat, D., Gärtner, T.: Invis: A tool for interactive visual data analysis. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013, Part III. LNCS, vol. 8190, pp. 672–676. Springer, Heidelberg (2013)
19. Paurat, D., Oglic, D., Gärtner, T.: Supervised PCA for Interactive Data Analysis. In: Proceedings of the Conference on Neural Information Processing Systems (NIPS) 2nd Workshop on Spectral Learning (2013)
20. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
21. Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Helmbold, D., Williamson, B. (eds.) COLT/EuroCOLT 2001. LNCS (LNAI), vol. 2111, pp. 416–426. Springer, Heidelberg (2001)
22. Shearer, C.: The CRISP-DM model: The new blueprint for data mining. Journal of Data Warehousing 5(4), 13–22 (2000)
23. Tenenbaum, J.B., Silva, V.D., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
24. Tukey, J.W.: Mathematics and the picturing of data. In: Proceedings of the International Congress of Mathematicians, vol. 2, pp. 523–531 (1975)
25. Walder, C., Henao, R., Mørup, M., Hansen, L.K.: Semi-Supervised Kernel PCA. Computing Research Repository (CoRR) abs/1008.1398 (2010)
26. Weinberger, K.Q., Saul, L.K.: Unsupervised Learning of Image Manifolds by Semidefinite Programming. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 988–995 (2004)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oglic, D., Paurat, D., Gärtner, T. (2014). Interactive Knowledge-Based Kernel PCA. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol 8725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44851-9_32
DOI: https://doi.org/10.1007/978-3-662-44851-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44850-2
Online ISBN: 978-3-662-44851-9
eBook Packages: Computer Science (R0)