Do We Need a Science of Data Visualization Storytelling?

May 6, 2013 Paul M. Davis

Still from Alex Lundry's "Chart Wars" presentation.

At the Harvard Business Review, DataKind’s Jake Porway warns data storytellers against the temptation to use data visualization as a tool to mislead viewers. The most egregious examples of this practice are the notoriously biased infographics deployed by cable news networks, yet more subtle techniques are far more insidious, muddying the picture while retaining a patina of credibility.

In his essay, Porway cites “Chart Wars”, a presentation by data scientist Alex Lundry that demonstrated how Republicans and Democrats used subtle signifiers in their infographics to serve their respective agendas. Porway writes:

[Lundry] showed a Republican visualization of the House Democrats’ health plan — an infographic full of sinuous pipes and literal red tape, smattered with ugly unreadable fonts and unwelcoming 8-bit color palettes — next to a Democratic visualization of the same plan, which instead looked like an Easter basket, a perfectly designed and welcoming bundle of pastel circles.

You could call it dog whistle data viz — exploiting visual cues and misconceptions about the empirical value of a data set to subtly advance an agenda, or confirm existing preconceptions. The presumption that data holds some unassailable credibility makes such chicanery dangerous, Porway argues:

The most troubling part of all this is that “we the people” rarely have the skills to see how data is being twisted into each of these visualizations. We tend to treat data as “truth,” as if it is immutable and only has one perspective to present…If someone uses data in a visualization, we are inclined to believe it…We don’t see in the finished product the many transformations and manipulations of the data that were involved, along with their inherent social, political, and technological biases.

Porway recommends a number of strategies for data storytellers with integrity. He makes the case for providing multiple views of the data so viewers may draw comparisons, understand the context of the information, and see multiple perspectives. A good example of this, he says, is a New York Times interactive feature on the September 2012 jobs report, which offered multiple views that account for how the data might be interpreted differently given individuals’ political leanings, as well as access to the raw data.

The New York Times interactive feature on the September 2012 jobs report.

These are worthy prescriptions, but data visualization thrives upon experimentation and innovation. The sources and amount of information, as well as the types of stories we tell, are constantly changing and expanding. A single rubric or set of best practices isn’t sufficiently expansive to account for all of the potential techniques or use cases. In a recent blog post, media consultant Nick Diakopoulos argues for “a science of data-visualization storytelling.” He writes, “We need some direction. We need to know what makes a data story ‘work’. And what does a data story that ‘works’ even mean?”

As data visualization evolves, the critical dialogue around its uses is similarly maturing. While Diakopoulos praises such critiques, he makes the case for applying data science to the work of data scientists:

Critical blogs such as The Why Axis and Graphic Sociology have arisen to try to fill the gap of understanding what works and what doesn’t. And research on visualization rhetoric has tried to situate narrative data visualization in terms of the rhetorical techniques authors may use to convey their story. Useful as these efforts are in their thick description and critical analysis, and for increasing visual literacy, they don’t go far enough toward building predictive theories of how data-visualization stories are “read” by the audience at large.

Diakopoulos cites researcher Robert Kosara, who articulates the challenges inherent in such an effort, as well as why it matters: “there are no clearly defined metrics or evaluation methods,” Kosara states. “Developing these will require the definition of, and agreement on, goals: what do we expect stories to achieve, and how do we measure it?” Without such metrics and analysis, practitioners can’t decisively learn from successes or mistakes, or understand the value and impact of their work. Diakopoulos asks, “What does a data story that ‘works’ even mean?” How do practitioners measure viewers’ engagement, insight, and understanding?

It’s an open and difficult question, but Diakopoulos makes a handful of tentative suggestions for measuring impact. “We might for instance ask questions about how effectively a message is acquired by the audience,” he writes. “Did they learn it faster or better? Was it memorable, or did they forget it 5 minutes, 5 hours, or 5 weeks later?”

Beyond those basic metrics, he states, are questions to measure impact, effectiveness, and the ethical role of data stories. Does the work inspire personal insights or questions within the audience? Do they feel they could make informed decisions from the information provided? Do viewers consider some data stories and sources more credible than others, and what influences that perception?
