Data visualization playbook: Letting the data dictate the visualization

Data visualization is a great tool for finding patterns within data, and it can do more than serve as a graphical presentation. Used properly, a visualization can transform data into a highly effective presentation and even uncover patterns.

As a demonstration of this tool, consider transforming multiple data tables into a single data visualization without losing a single piece of information. The case study presented here walks through a real-world scenario from a former client that highlights the importance of allowing the data to dictate the design and selection of data visualizations.

Including data in tables

Each year, an environmental grant-making foundation collected data on how the funding was distributed across the various environmental issues in each geographic region. At the end of the year, the organization analyzed the data and created a report about the distribution of these funds.

To ensure the transparency of the yearly report, the organization included a table with the amount of funding distributed to each issue area across geographic regions. Tables are a great way to present information neatly in a straightforward manner, but this solution tends to be less effective as the amount of data increases.

In the past, the organization would save time by including all the tables in an appendix, which required splitting the information across 12 pages in its glossy, full-color, printed annual report. Not only was this presentation costly to print, but the end product also offered little value to the reader, because comparing funding amounts among the different issue areas required flipping back and forth across the 12 pages.

Simplifying the data

Converting data tables into a matrix can be a simple yet powerful way to present the same information more effectively than tables alone. For this scenario, a matrix can be created with one row per geographic region, one column per environmental issue and the grant amount in the cell where a region and an issue intersect.

This matrix includes the same information as the data tables, but the new format makes the data easier to view and compare across both geographic regions and environmental issues. As an added bonus, the matrix reduces the presentation from 12 pages to two, a cut of more than 80 percent in page count that can bring a comparable reduction in the cost of printing those pages.
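The conversion from per-region tables to a single matrix can be sketched in a few lines of Python. This is a minimal illustration, not the foundation's actual process: the region names, issue names and dollar amounts are hypothetical, and `build_matrix` and `format_matrix` are helper names invented for this example.

```python
from collections import defaultdict

# Hypothetical grant records: (region, issue, amount in USD).
# These values are illustrative only and do not come from the case study.
records = [
    ("Northeast", "Climate", 120_000),
    ("Northeast", "Water", 45_000),
    ("Southwest", "Climate", 80_000),
    ("Southwest", "Biodiversity", 60_000),
    ("Southwest", "Water", 15_000),
]


def build_matrix(records):
    """Pivot flat records into a region-by-issue matrix of summed amounts."""
    totals = defaultdict(int)
    for region, issue, amount in records:
        totals[(region, issue)] += amount
    regions = sorted({region for region, _, _ in records})
    issues = sorted({issue for _, issue, _ in records})
    # Cells with no grants are filled with 0 so every row has the same width.
    matrix = [[totals.get((r, i), 0) for i in issues] for r in regions]
    return regions, issues, matrix


def format_matrix(regions, issues, matrix):
    """Render the matrix as aligned plain text, one row per region."""
    width = max(len(name) for name in regions + issues) + 2
    lines = [" " * width + "".join(issue.rjust(width) for issue in issues)]
    for region, row in zip(regions, matrix):
        cells = "".join(f"{value:,}".rjust(width) for value in row)
        lines.append(region.ljust(width) + cells)
    return "\n".join(lines)


regions, issues, matrix = build_matrix(records)
print(format_matrix(regions, issues, matrix))
```

Every amount from the original records appears in exactly one cell, so nothing is lost in the conversion; the only change is that amounts for the same region and issue now sit side by side.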

While the benefits of converting data tables to a matrix are clear, the result hardly resembles what most data scientists would consider a data visualization. To bridge this gap and transform the matrix into a true data visualization, the matrix can be rendered as a heat map, in which shading that runs from light to dark represents grant amounts that run from low to high.

Adding this layer of shading instantly transforms a simple matrix into a functional heat map. Data scientists should consider these kinds of representations when they want to create highly effective data visualizations.
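The light-to-dark mapping can be sketched as follows. This is one possible approach under stated assumptions: the matrix values are hypothetical, the amounts are binned into five equal-width shade levels (other scales, such as logarithmic ones, may suit skewed data better), and `shade_levels` and `to_hex_gray` are helper names invented for this example.

```python
# Hypothetical region-by-issue matrix of grant amounts in USD (illustrative only).
matrix = [
    [0, 120_000, 45_000],
    [60_000, 80_000, 15_000],
]


def shade_levels(matrix, levels=5):
    """Map each amount to a shade from 0 (lightest) to levels - 1 (darkest).

    Uses equal-width bins between the smallest and largest amount.
    """
    values = [value for row in matrix for value in row]
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # All amounts are equal, so every cell gets the lightest shade.
        return [[0 for _ in row] for row in matrix]
    return [
        [min(levels - 1, int((value - low) / span * levels)) for value in row]
        for row in matrix
    ]


def to_hex_gray(level, levels=5, darkest=40):
    """Convert a shade level to a hex gray; level 0 is white."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    intensity = round(255 - level * (255 - darkest) / (levels - 1))
    return f"#{intensity:02x}{intensity:02x}{intensity:02x}"


shades = shade_levels(matrix)
colors = [[to_hex_gray(level) for level in row] for row in shades]
for row in colors:
    print(" ".join(row))
```

The resulting color codes could be used as cell backgrounds in whatever tool produces the report, with the grant amounts still printed in each cell so the heat map keeps all of the original information.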

Allowing the data to be your guide

As this case study shows, simple solutions for presenting data can quickly become problematic when the size of the data set increases. Highly effective data visualizations require investigating the characteristics of the data and exploring which visual forms are well suited to using those characteristics to get the message across to the people who will review and use the data.

Discover how the IBM advanced analytics portfolio can help you find patterns and derive insights by visually exploring data.