Introduction

Overview

Teaching: 30 min
Exercises: 0 min

Questions

What is visualization?

What are common fallacies in terms of visualization?

Objectives

First objective.

Visualization: An introduction

“A picture can say more than 1000 words” is the old adage. This also holds true for data: some of the large datasets we work with only reveal their secrets after careful visualization.

The question is, of course, how to actually visualize data in practical terms. The exact form of a visualization also influences how useful it is.

Some famous examples:

Depicting events: Minard's chart of Napoleon's Russian military campaign (1812-13), which shows how a visualization can make information clearer. (https://www.edwardtufte.com/tufte/minard)

Easing navigation: the London Tube map (https://en.wikipedia.org/wiki/File:TubeMapZ1_TFL.png)

Saving lives: Dr. John Snow's map tracing the source of a cholera outbreak in Soho, London (1854), which was used to find the cause of the epidemic (source: https://www1.udel.edu/johnmack/frec682/cholera/, https://www1.udel.edu/johnmack/frec682/cholera/snow_map.png).

Providing insights into scientific data: the climate change “hockey stick” graph (Michael E. Mann, CC-BY), https://upload.wikimedia.org/wikipedia/commons/0/0a/Mann_hockeystick.jpg

Visualizations

Let’s start by looking at the concept of ‘visualization’.

Defining Visualization

Cairo's (2016) definition: “A visualization is any kind of visual representation of information designed to enable communication, analysis, discovery, exploration, etc.”

Davis (2009) distinguishes the following types of visualization:

Statistical visualizations, e.g. Supreme Court Justices
Infographics, e.g. An internet minute
Maps, e.g. New York Times immigration explorer
Network visualizations, e.g. Social network analysis visualization
Artistic visualizations (“data as art”), e.g. “Forest of Numbers”

Hence, the word ‘visualization’ encompasses a wide range of possible diagrams. In this workshop, we will not look at all these different types of visualizations, but mainly focus on (statistical) charts.

What is a chart, then?

Cairo (2016) defines it as such: “A chart is a display in which data are encoded with symbols that have different shapes, colors, or proportions.”
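
To make this definition concrete, here is a minimal sketch in Python (our choice for illustration; this episode does not prescribe a tool). It encodes numbers as rows of symbols whose length reflects the value, which is essentially a text-only bar chart. The search terms and counts are made up for the example.

```python
# Hypothetical example data: search terms and how often they were used.
searches = {"climate": 12, "history": 7, "nursing": 3}

def text_bar_chart(data, symbol="#"):
    """Encode each value as a bar whose length is proportional to the value."""
    # Pad labels to the same width so that all bars start in the same column.
    width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        lines.append(f"{label.ljust(width)} | {symbol * value} {value}")
    return lines

for line in text_bar_chart(searches):
    print(line)
```

Even in this crude form, the longest bar stands out immediately, whereas the raw numbers have to be read and compared one by one.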

Common goals of visualization

Visualizations can be made for a wide range of purposes. They may reveal things that were not visible before, and can be used for exploratory data analysis. Potential goals are manifold:

Analyzing data, often exploratory: “Graphing data needs to be iterative because we often do not know what to expect of the data; a graph can help discover unknown aspects of the data, and once the unknown is known, we frequently find ourselves formulating new questions about the data.” (Cleveland 1985, as cited in Spence, 2001)
Disseminating results of analysis, for instance in a news article or scholarly paper
Decision making, taking actions based on evidence in the data
Conveying a message, for instance about social issues

While there are many other potential goals for visualization, we mainly focus on exploratory data analysis and dissemination in this lesson.

Qualities of visualizations

Cairo (2016) suggests a number of qualities of visualizations (which are often not met in practice!):

Functional: It should depict data accurately, but also be useful to people
Beautiful: A visualization should be ‘attractive’ to different audiences
Insightful: It should reveal evidence that we could have missed without the visualization
Enlightening: A visualization may “change our minds” (hopefully for the better…)
Truthful: A visualization should depict truthful and honest research

The last point is especially important: statistics may not be (entirely) correct, and this applies even more strongly to visualizations built on them.

Fallacies of visualization

Designing an understandable and reliable visualization is far from straightforward. On the one hand, there is the importance of the origins, quality and scope of the underlying data. It is essential to understand the whole picture, that is, how the data was generated and whether it is complete or only a sample.

For instance, in this lesson we will be working with a dataset containing popular searches. It is important to ask oneself what “popular” means in this case.

The popular searches dataset

This dataset, originating from Primo Analytics, contains up to 500 different query variations per month, as performed in a Norwegian university’s library discovery system. In the case of this university, this amounts to 5% of all searches done in the system.
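
As a rough illustration of what such a share means, the sketch below computes the fraction of all searches that a list of popular queries covers. All names and numbers are hypothetical, chosen only so that the result matches the 5% mentioned above; they are not taken from the actual dataset.

```python
def coverage(popular_counts, total_searches):
    """Return the share of all searches accounted for by the popular queries."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    return sum(popular_counts) / total_searches

# Hypothetical month: 500 popular query variations, each performed 10 times,
# out of 100,000 searches in total.
popular_counts = [10] * 500
share = coverage(popular_counts, 100_000)

print(f"Covered by the dataset: {share:.0%}")
print(f"Not in the dataset: {1 - share:.0%}")
```

Under these assumptions, the vast majority of searches never show up in the data, so “popular” describes only a small slice of what users actually searched for.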

Moreover, data processing is important, as each round of processing may influence what you see in the end (and this is even before starting to visualize data).

Cairo (2016) has summarized some of our inherent biases which have an effect on our judgement of visualizations:

Patternicity: “Detecting interesting patterns, regardless of whether or not they are real”
Storytelling: Trying to find cause-effect relationships for patterns we observe
Confirmation: Confirming our own beliefs (cognitive dissonance, confirmation bias)
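
Patternicity in particular can be demonstrated with a small simulation. The sketch below, in Python and using only made-up random numbers, generates a set of unrelated series and then looks for the strongest correlation between any two of them. The number of series, their length and the seed are arbitrary choices for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

random.seed(1)  # fixed seed so the run is repeatable

# 40 unrelated series of 10 random values each (pure noise, no real pattern).
series = [[random.random() for _ in range(10)] for _ in range(40)]

# Compare every pair of series and keep the strongest correlation found.
strongest = max(
    abs(pearson(series[i], series[j]))
    for i in range(len(series))
    for j in range(i + 1, len(series))
)
print(f"Strongest correlation among unrelated series: {strongest:.2f}")
```

With this many comparisons, a strong-looking correlation will usually turn up by chance alone. Plotted side by side, such a pair would look like a meaningful pattern even though none exists.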

We may also introduce issues, consciously or unconsciously, while creating visualizations, due to bias or mistakes in interpretation. See Flowing Data for a summary of common ‘visualization lies’.

While there is always some interpretation involved in both creating and reading visualizations, we can try to keep in mind some of these issues and try to prevent them.

Read more

Tufte (1983). The Visual Display of Quantitative Information.
Cairo (2016). The Truthful Art: Data, Charts, and Maps for Communication.
Visualization examples: informationisbeautiful.net/
Davis (2009). Article on visualization in a library context: inthelibrarywiththeleadpipe.org/2009/not-just-another-pretty-picture/

Key Points

Visualization comes in many forms and variations.

It is not straightforward to create reliable and engaging visualizations, and one has to keep in mind our cognitive biases.

When viewing visualizations made by others, always keep an eye open for ‘visualization lies’.


Visualization for Librarians
