Authored By: Chenglong Wang, Yu Feng, Rastislav Bodik, Alvin Cheung, Isil Dillig

Visualization by Example

Jan 25, 2024

Visualization plays an essential role in today’s data-driven world for identifying, validating, and communicating insights derived from data. Despite a growing number of libraries that assist with complex visualization tasks, data visualization remains challenging and requires considerable expertise.

That expertise includes a sound understanding of the relevant visualization libraries and the ability to repeatedly reshape data into different formats while experimenting with different types of visualizations. Generating the intended visualization often requires modifying the original dataset, for example by aggregating values, mutating values, or adding new columns to the input tables, all of which demand deep knowledge of data manipulation.

Visual Program Synthesis

Here’s where visualization by example can help. It is a new technique for automating data visualization tasks using program synthesis. Users begin by providing a visual sketch, that is, a partial visualization of the data covering just a few input points. Given the original dataset and this sketch, the synthesizer produces one or more visualization scripts whose output is consistent with the sketch. Each script can then be applied to the entire dataset, and the user picks the desired visualization from the results.
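To make "consistent with the sketch" concrete, here is a minimal sketch in Python. It assumes a visualization can be modeled as a list of (mark, x, y) tuples and that a sketch is consistent with a rendering when every sketched element also appears in the rendering. Both the encoding and the function name is_consistent are illustrative assumptions, not Viser's actual representation.

```python
def is_consistent(sketch, rendered):
    """Return True if every element of the partial sketch appears in the rendering."""
    remaining = list(rendered)
    for element in sketch:
        if element not in remaining:
            return False
        # Match each sketch element to a distinct rendered element.
        remaining.remove(element)
    return True


# The user sketches only two bars of the chart they have in mind.
sketch = [("bar", "Q1", 30), ("bar", "Q2", 45)]

# Output of two hypothetical candidate scripts run on the full dataset.
candidate_a = [("bar", "Q1", 30), ("bar", "Q2", 45), ("bar", "Q3", 50)]
candidate_b = [("bar", "Q1", 30), ("bar", "Q2", 40), ("bar", "Q3", 50)]

print(is_consistent(sketch, candidate_a))
print(is_consistent(sketch, candidate_b))
```

Only candidate_a reproduces both sketched bars, so only the script that produced it would be kept and shown to the user.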

Two-fold Visualization Approach

One of the key features of this approach is its ability to decompose an end-to-end visualization task into two separate synthesis problems over two different languages. Given an input data source and a visual sketch, the goal is to learn a table transformation program and a visual program.
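The decomposition can be pictured as two small functions composed together. The following is a rough illustration only: sum_by stands in for a table transformation program and bar_chart for a visual program, and both helpers, along with the sample data, are hypothetical.

```python
from collections import defaultdict


def sum_by(table, key, value):
    """Table transformation: total `value` for each distinct `key`."""
    totals = defaultdict(int)
    for row in table:
        totals[row[key]] += row[value]
    return [{key: k, value: v} for k, v in totals.items()]


def bar_chart(table, x, height):
    """Visual program: map each row of the prepared table to one bar."""
    return [("bar", row[x], row[height]) for row in table]


# Made-up input data in a shape that cannot be plotted directly.
sales = [
    {"region": "East", "month": "Jan", "units": 10},
    {"region": "East", "month": "Feb", "units": 15},
    {"region": "West", "month": "Jan", "units": 7},
]

prepared = sum_by(sales, "region", "units")       # first program
chart = bar_chart(prepared, "region", "units")    # second program
print(chart)
```

The synthesizer has to discover both halves: which transformation prepares the data, and which mapping from columns to visual properties draws it.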

To connect the two problems, the method infers an intermediate specification that constrains both the output of the table transformation program and the input of the visual program. This specification takes the form of table inclusion constraints: the input to the visual program must include all tuples of a table derived from the visual sketch, but it may also contain additional rows and columns. Decoupling the two synthesis problems in this way is crucial for the scalability of the approach.
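A table inclusion check can be sketched as follows. The example assumes tables are lists of equal-length tuples with hashable values, treats rows as a set (duplicates are ignored), and searches for any assignment of example columns to distinct candidate columns. The function name includes and these modeling choices are assumptions made for illustration.

```python
from itertools import permutations


def includes(candidate, example):
    """Return True if `candidate` contains every row of `example`
    under some mapping of example columns to distinct candidate columns."""
    if not example:
        return True
    if not candidate:
        return False
    width = len(example[0])
    n_cols = len(candidate[0])
    for cols in permutations(range(n_cols), width):
        # Project the candidate onto the chosen columns, in the chosen order.
        projected = {tuple(row[c] for c in cols) for row in candidate}
        if all(tuple(row) in projected for row in example):
            return True
    return False


# Partial table recovered from a sketch: one bar for "East" with height 25.
example = [("East", 25)]

aggregated = [("East", "2023", 25), ("West", "2023", 7)]
raw = [("East", 10), ("East", 15), ("West", 7)]

print(includes(aggregated, example))
print(includes(raw, example))
```

The aggregated table passes even though it has an extra column and an extra row, which is exactly the slack the constraint allows. The raw table fails because no row pairs "East" with 25.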

Table Transformation Program

The paper introduces a new algorithm for synthesizing table transformation programs. The main challenge is that table inclusion specifications do not describe the output table precisely: they say what it must contain, not what it is. To cope with this, the algorithm uses lightweight bidirectional program analysis to prune the search space, which makes it especially effective for the synthesis problems that arise when automating visualization tasks.
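The effect of pruning can be illustrated with a toy enumerative search. This is a much simpler stand-in for the paper's bidirectional analysis, not a reproduction of it. It assumes a two-operator language (filter and sum_by, both hypothetical) in which no operator ever adds rows, so any intermediate table with fewer rows than the example can be discarded safely.

```python
from itertools import permutations


def includes(table, example):
    """Table inclusion under some mapping of example columns to table columns."""
    width, n_cols = len(example[0]), len(table[0])
    for cols in permutations(range(n_cols), width):
        projected = {tuple(r[c] for c in cols) for r in table}
        if all(tuple(row) in projected for row in example):
            return True
    return False


def sum_by(table, key, val):
    totals = {}
    for r in table:
        totals[r[key]] = totals.get(r[key], 0) + r[val]
    return list(totals.items())


def candidate_ops(table):
    """Enumerate (name, function) pairs applicable to a non-empty table."""
    ops, n_cols = [], len(table[0])
    for col in range(n_cols):
        for v in sorted({r[col] for r in table}, key=repr):
            ops.append((f"filter(c{col} == {v!r})",
                        lambda t, col=col, v=v: [r for r in t if r[col] == v]))
    numeric = [c for c in range(n_cols)
               if all(isinstance(r[c], (int, float)) for r in table)]
    for key in range(n_cols):
        for val in numeric:
            if key != val:
                ops.append((f"sum_by(c{key}, c{val})",
                            lambda t, key=key, val=val: sum_by(t, key, val)))
    return ops


def synthesize(table, example, max_depth=2):
    """Breadth-first search for a pipeline whose output includes `example`."""
    needed_rows = len(set(example))
    frontier = [([], table)]
    for _ in range(max_depth + 1):
        next_frontier = []
        for steps, current in frontier:
            if includes(current, example):
                return steps
            for name, op in candidate_ops(current):
                out = op(current)
                # Prune: neither operator adds rows, so a table that is already
                # too small can never grow to include the example.
                if len(out) < needed_rows:
                    continue
                next_frontier.append((steps + [name], out))
        frontier = next_frontier
    return None


table = [("East", "Jan", 10), ("East", "Feb", 15),
         ("West", "Jan", 7), ("West", "Feb", 3)]
example = [("East", 25), ("West", 10)]
print(synthesize(table, example))
```

Here the row-count test rejects branches such as filtering down to a single row before they are expanded further. The analysis described in the paper reasons about partial programs in both directions and is considerably more precise than this one-line check.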

Tool Implementation

The visualization-by-example approach is implemented in a new tool called Viser. It was evaluated on 83 visualization tasks collected from online forums and tutorials. Viser solved 84% of these benchmarks and, among the benchmarks it solved, the desired visualization was among the top-5 results in 70% of the cases.

Ultimately, effective data visualization is all about turning raw data into insightful images. By significantly simplifying the process of visualization generation and automating more routine tasks, technologies like Viser are poised to take us a step closer to the goal of fully automated data visualization.