Author: Randy Goebel
Affiliation: University of Alberta, Canada
Keywords: Visualization, Induction, Data Abstraction, Picture Abstraction, Dimensionality Reduction, Data Picture Mapping, Domain Semantics
Related Ontology Subjects/Areas/Topics: Abstract Data Visualization; Computer Vision, Visualization and Computer Graphics; General Data Visualization; Information and Scientific Visualization; Visual Data Analysis and Knowledge Discovery; Visual Representation and Interaction; Visualization Taxonomies and Models
Abstract:
A picture results from a possibly multi-layer transformation of data into a visual vocabulary from which humans can draw inferences about the original data. The goal of this visualization process is to expose relationships among the data that are otherwise difficult to find, or that only emerge through the process of the transformation. In the case of the former kind of inference (confirming a relationship that existed but was not obvious), visualization provides a kind of inferential amplifying effect. In the case of the latter (exposing new data relationships), visualization provides an inductive mechanism for creating hypotheses not manifest in the original data.
In this regard, the creation of pictures from data is about data compression, which is naturally a kind of machine learning. Just as statistical concepts like average and standard deviation provide a measure on properties of a set of numbers, so too does visualization provide a kind of ``measure'' on data compressed to a visual vocabulary presented as a picture.
Our position is that visualization is about the (potentially multi-step, multi-layered) transformation of data to pictures, and that every such transformation must make choices about what kinds of relations {\it to preserve}, and what kinds of data artifacts {\it to avoid}, in each such transformation. Like a chain of formal inference, conclusions drawn from the end result (the picture) are determined by what each transformation in the inference chain is intended to accomplish. We argue that the visualization of large data sets, too large to inspect directly, requires a rigorous theory of how to transform data to pictures, so that scientists, as observers, can be assured that inferences drawn from the pictures are either confirmable in the detailed data, or at least plausible hypotheses that can be pursued further by seeking additional data (evidence).
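As a minimal sketch of such a multi-step transformation (an illustrative example only, not a method described in the paper), consider a two-stage data-to-picture pipeline: standardization followed by a 2-D principal-component projection. Each stage makes an explicit choice about what to preserve and what to discard, exactly the kind of choice the argument above highlights.

```python
import numpy as np

# Synthetic data standing in for a data set too large to inspect directly:
# 100 samples, 5 features (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Stage 1: standardize each feature.
# Preserves: relative spread and correlations. Discards: units and scale.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Stage 2: project onto the top-2 principal components via SVD.
# Preserves: the directions of maximal linear variance.
# Discards: the remaining three dimensions, and any nonlinear structure,
# a potential source of artifacts in the resulting picture.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
coords = Z @ Vt[:2].T  # the "picture": one 2-D point per sample

print(coords.shape)  # (100, 2)
```

An inference drawn from a scatter plot of `coords` (say, an apparent cluster) is conditioned on both stages: it is only trustworthy to the extent that the relations it depends on survive standardization and linear projection, which is the sense in which each transformation acts like a step in a chain of inference.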