Introduction

Overview

Teaching: 30 min
Exercises: 0 min
Questions
  • What is visualization?

  • What are common fallacies in terms of visualization?

Objectives
  • First objective.

Visualization: An introduction

“A picture says more than a thousand words,” as the old adage goes. This also holds for data: some of the large datasets we work with reveal their secrets only after careful visualization.

The question is, of course, how to visualize data in practice. The exact form of a visualization also influences how useful it is.

Some famous examples:

Example

Depicting events: Minard's map of Napoleon's Russian military campaign (1812-13). A visualization can make information clearer. (https://www.edwardtufte.com/tufte/minard)

Example

Easing navigation: the London Tube map (https://en.wikipedia.org/wiki/File:TubeMapZ1_TFL.png)

Example

Saving lives: Dr. John Snow's map tracing the source of a cholera outbreak in Soho, London (1854), which was used to find the cause of the epidemic (source: https://www1.udel.edu/johnmack/frec682/cholera/, https://www1.udel.edu/johnmack/frec682/cholera/snow_map.png).

Example

Providing insights into scientific data: the climate change “hockey stick” graph (Michael E. Mann, CC-BY) https://upload.wikimedia.org/wikipedia/commons/0/0a/Mann_hockeystick.jpg

Visualizations

Let’s start by looking at the concept of ‘visualization’.

Defining Visualization

Cairo (2016) defines it as follows: “A visualization is any kind of visual representation of information designed to enable communication, analysis, discovery, exploration, etc.”

Davis (2009) distinguishes the following types of visualization:

Hence, the word ‘visualization’ encompasses a wide range of possible diagrams. In this workshop, we will not look at all these different types of visualizations, but mainly focus on (statistical) charts.

What is a chart, then?

Cairo (2016) defines it as follows: “A chart is a display in which data are encoded with symbols that have different shapes, colors, or proportions.”
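To make the idea of encoding concrete, here is a minimal sketch of a text-only bar chart in which each value is encoded as the length (proportion) of a row of symbols. The function name `text_bar_chart` and the search counts are hypothetical and invented purely for illustration; they are not part of the lesson's dataset. The sketch assumes all values are positive.

```python
def text_bar_chart(counts, width=20, symbol="#"):
    """Encode each value as a bar whose length is proportional to it.

    Assumes every value in `counts` is a positive number.
    """
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        # The largest value gets a bar of `width` symbols; others scale down.
        bar_length = round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {symbol * bar_length} {value}")
    return "\n".join(lines)


# Hypothetical search counts, invented for illustration only.
example_counts = {"climate": 40, "python": 20, "history": 10}
chart = text_bar_chart(example_counts)
print(chart)
```

The same data could just as well be encoded with color or shape; bar length is only one of the options the definition mentions.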

Common goals of visualization

Visualizations can be made for a wide range of purposes. They may reveal things that were not visible before and can be used for exploratory data analysis. Potential goals are manifold:

While there are many other potential goals for visualization, we mainly focus on exploratory data analysis and dissemination in this lesson.

Qualities of visualizations

Cairo (2016) suggests a number of qualities of visualizations (which are often not met in practice!):

The last point is especially important: statistics may not be (entirely) correct, and this applies even more strongly to visualizations.

Fallacies of visualization

Designing an understandable and reliable visualization is far from straightforward. On the one hand, the origins, quality and scope of the underlying data matter. It is essential to understand the whole picture, that is, how the data was generated and whether it is complete or only a sample.

For instance, in this lesson we will be working with a dataset containing popular searches. It is important to ask oneself what “popular” means in this case.

This dataset, originating from Primo Analytics, contains up to 500 different query variations per month, as performed in a Norwegian university’s library discovery system. In the case of this university, this amounts to 5% of all searches done in the system.

Moreover, data processing is important, as each round of processing may influence what you see in the end (and this is even before starting to visualize data).
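As a small sketch of how a single processing step can change the picture, the example below counts the most frequent query before and after normalizing whitespace and letter case. The query strings and the helper `top_queries` are hypothetical and invented for illustration; they are not taken from the Primo Analytics dataset.

```python
from collections import Counter

# Hypothetical raw queries, invented for illustration only.
raw_queries = [
    "python", "python", "python",
    "Climate change", "Climate change",
    "climate change", "climate change",
    "climate change ",  # note the trailing space
]


def top_queries(queries, normalize=False, n=1):
    """Return the n most common queries, optionally after normalization."""
    if normalize:
        # Treat variants differing only in case or surrounding spaces as one query.
        queries = [query.strip().lower() for query in queries]
    return Counter(queries).most_common(n)


top_raw = top_queries(raw_queries)
top_normalized = top_queries(raw_queries, normalize=True)
print(top_raw)         # most popular query without any processing
print(top_normalized)  # most popular query after normalization
```

Without normalization, "python" ranks first with 3 searches; after normalization, the three spellings of "climate change" merge into one query with 5 searches and take the top spot. Neither result is wrong, but a chart built on one would tell a different story than a chart built on the other.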

Cairo (2016) has summarized some of our inherent biases which have an effect on our judgement of visualizations:

Issues can also be introduced (consciously or unconsciously) during the creation of visualizations, through bias or mistakes in interpretation. See Flowing Data for a summary of common ‘visualization lies’.

While there is always some interpretation involved in both creating and reading visualizations, we can keep these issues in mind and try to avoid them.

Read more

Key Points

  • Visualization comes in many forms and variations.

  • It is not straightforward to create reliable and engaging visualizations, and one has to keep in mind our cognitive biases.

  • When viewing visualizations made by others, always keep an eye open for ‘visualization lies’.