Friday, June 29, 2012

drawing attention with data labels

I am a firm believer that data is inherently interesting. When you find the right story to tell with the data, that is. Graphing applications, unfortunately, don't know our data or what stories to tell with it. So, while it's easy to put your data into a chart and feel like you're done, this is a disservice to both your mission and your data.

The lessons in this post are two-fold. We're going to focus on an anonymized example from a recent workshop I conducted and discuss teasing the story out of the data and producing a visual that better tells this story, using data labels to help draw our audience's attention to where we want it.

Here is the visual we'll begin with:

Ticket Volume Over Time
I'm going to implore you to resist the urge to scroll immediately downward and instead concentrate on the visual above for a moment. What story could we tell with this data?

It takes some time staring at the data in its current format, but if we do so, one thing to note is that the volumes processed and received are close to one another during the first part of the year, and we start to see separation in the latter part of the year, with the volume processed lagging the volume of tickets received. That's the beginning of a story.

When plotting multiple series over time, bars tend to get visually overwhelming quickly. Lines can often show trends over time in an easier-to-consume fashion, so let's start by seeing what this same data looks like in a line graph:

Note that in addition to changing the chart type, I've done a couple other things to make the visual above easier to interact with. The series are labeled directly, eliminating the work of going back and forth between a legend and the data to understand what I'm looking at. I also pushed the axis lines and labels to the background by making them small and grey, so they are there for reference, but don't compete visually for attention with my data.

In the above, I eliminated the data labels altogether. But bear with me a moment while I add them back:

Labeling every data point creates a cluttered visual, one of the issues with the original graph. But I think we can use them here in a way that will add value. The data labels act like added marks (a preattentive attribute) that draw our attention. The problem with labeling every point in this case, though, is that our attention isn't drawn anywhere except to the lines that were already drawing it in the first place. But take a look at what happens as we play with which points we label:

In this case, we're drawn more to the right side of the graph because of the additional visual pull of the data labels. But this still looks a little cluttered to me, so I'm going to remove a couple more labels:

Bingo. I'm drawn to the part of the graph where the lines really start to diverge, which forces my audience to focus visually on the part of the graphic that really tells the story. Now that I've got a visual that can be used to tell the story I want to tell, it's time to put the words down on this page to actually tell that story. This is when we need to take the context we know about the situation that the audience needs to know and figure out how to make our story compelling.
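The labels in the progression above were chosen by eye, but the same idea can be roughed out in code. Here is a minimal Python sketch that picks out only the months where tickets received outpace tickets processed by some margin. The monthly counts, the `points_to_label` helper, and the `min_gap` threshold are all hypothetical, made up for illustration rather than taken from the workshop data.

```python
# Hypothetical monthly ticket counts (Jan..Dec); NOT the workshop data.
received = [160, 184, 241, 149, 180, 161, 132, 202, 160, 139, 149, 177]
processed = [160, 184, 237, 148, 181, 150, 123, 156, 126, 104, 124, 140]


def points_to_label(received, processed, min_gap):
    """Return the indices where received exceeds processed by at least min_gap."""
    return [
        i
        for i, (r, p) in enumerate(zip(received, processed))
        if r - p >= min_gap
    ]


# Label only the months where the two lines have clearly separated.
for i in points_to_label(received, processed, min_gap=20):
    print(f"month {i + 1}: received={received[i]}, processed={processed[i]}")
```

The returned indices could then be handed to whatever charting tool you use, so that data labels are drawn only at those points. The threshold is a judgment call: the goal is not a rule for which points matter, but fewer labels concentrated where the story is.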

I don't actually know the context here, but I can use the data to start to make observations that will lead to questions that would help me figure it out. When I look at the data, I find it interesting that the incoming ticket volume was higher at some points earlier in the year, and yet we were able to keep up with it then, whereas in the latter part of the year we are falling short. This suggests that something changed. Perhaps there was attrition from the team that processes the tickets. Or perhaps a process or systems change took place that meant the sort of tickets coming in during the latter part of the year were more difficult and took longer to resolve than earlier in the year. Whatever the context is, we need to explain it. 

The final visual could look something like this:

Note that this isn't the only story we could have focused on to make a compelling argument for the conclusion: we need more resources. We could have instead focused on the growing ticket backlog over time, for example. One reason that I didn't do that here was lack of data: since I only had the data in the chart starting in January, I didn't have any indication of whether a backlog existed prior that would be important to consider. 
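For anyone who does want to explore the backlog angle, here is a small Python sketch of the arithmetic: the backlog at the end of each month is the starting backlog plus the running total of received minus processed. The `running_backlog` function and the sample numbers are hypothetical, and `starting_backlog` defaults to zero purely as an assumption, which is exactly the unknown noted above.

```python
from itertools import accumulate


def running_backlog(received, processed, starting_backlog=0):
    """Cumulative backlog at the end of each period.

    starting_backlog is an assumption: the chart data alone does not say
    whether a backlog existed before the first period.
    """
    gaps = [r - p for r, p in zip(received, processed)]
    # accumulate(..., initial=x) yields x first, so drop that leading value.
    return list(accumulate(gaps, initial=starting_backlog))[1:]


# Hypothetical three-month example, not the workshop data.
print(running_backlog([100, 120, 130], [100, 110, 100]))
print(running_backlog([100, 120, 130], [100, 110, 100], starting_backlog=25))
```

Changing `starting_backlog` shifts the whole series up or down without changing its shape, so the trend is still visible even when the true starting level is unknown.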

In almost every case, there are multiple stories you could tell or ways to tell the story (or show it visually) that will get the message across. But the compelling stories don't suddenly materialize when we plot our data for the first time. Rather, it takes spending time on this piece to ensure you aren't just showing data for the sake of showing data, but rather that it's for a specific purpose, with a fleshed-out story that your visual reinforces. That's storytelling with data.

If you're interested in the Excel file for the above progression, click here.

Sunday, June 17, 2012

the power of simple text

When we think of showing data, typically our brains go first to tables and graphs. But let's not forget about simple text. When you have just a number or two, showing the numbers themselves can be much more powerful than burying them in a table or graph: beyond being potentially misleading, putting just a couple of numbers into a table or graph causes them to lose some of their "oomph".

For a quick case in point, let's look at an example. Let's say we just surveyed our users on whether they'd like to see us make changes to our services. There are a number of ways we could visualize the responses to these questions. In Excel, we might end up with something like this:

Not surprisingly, I wouldn't recommend a pie here (or in general). But rather than the horizontal bar chart you see me so often replacing them with, in this situation I'd recommend skipping your graphing application altogether and opting for simple text. Perhaps something like the following:

Note how the pie chart underemphasizes the piece we want to focus on, since the 9% who responded yes are dwarfed by the 88% who don't see a need for change. When we use simple text on the other hand, the number 9% can be emphasized with preattentive attributes (color, size) to make it clear both that it's important and the focal point of our story.

Wednesday, June 6, 2012

visualizing everyday life

The data visualization in my life is primarily in the business world. At my day job: how do we ensure that people decisions at Google are data-driven? In my presentations and workshops: who is our audience, what do they need to know, and how do we craft a visual and story to do that?

But many take data visualization into the personal sphere as well: using visualization to better understand aspects of their world or their life. I encountered one such example recently, when a data viz course participant at Google shared an example he created:

"Hi all,  Here is silly little thing I cooked up over the weekend. My wife likes fresh tomatoes, of what are called heirloom varieties (not the big commercial ones) - 16 different ones each year in our garden. We used to have trouble selecting which ones to grow each time, for the last 4 years have kept pretty good records of them, so I wanted to see if there were any patterns.

This is my first such chart after taking the basic data viz class, where I had a chance to sit and think about how to make it look. I did violate the color palate guidelines a bit, to color code each tomato by type. But this makes the type of tomato stand out, as well as the pattern."

Neil goes on to say, "Interestingly enough, until I graphed it, I didn't know that we rarely have a yellow tomato invited back a second year. Our by year lists (stored on a wiki at home) tended to mask that information." I love the use of data viz for this sort of problem solving: what type of tomatoes should I plant this year? I think Neil's next challenge will be to identify and start recording and visualizing some success measures (e.g. plant yield, flavor) to really hone his future garden crops.

This reminded me of another food-related data viz I saw some time ago, where a woman had tracked everything she ate for a year, then created a number of visualizations based on the data. You can read about that and see the visuals in this Flowing Data post.

Food for thought (no pun intended!): what do you (or could you) visualize in your life?

Friday, June 1, 2012

telling multiple stories (part 2)

Last week, we looked at an example of telling multiple stories with the same data in a single visual. Today, I want to look at another example where we'll repeat the same visual, drawing attention to different parts of it to tell discrete stories.

Here is our base visual:

Most of the details have been hidden here to preserve the confidentiality of the data (note also that there was originally a y-axis on the second graph, which I've removed here to protect the confidentiality of the info shown). Let's imagine that we have a number of different categories and data for each on 1) what proportion of users take a certain action and 2) how satisfied users are with the outcome of that action in a few countries.

As in the example discussed last week, there are a number of comparisons we can make with this data: for a given category, we can see how it compares to the other categories or how the various countries compare within a given category; or we can look at how the metric varies across different categories for a given country.

In the above view, these comparisons are mostly equally easy (or difficult, depending on how you look at it). There aren't strong visual cues directing our attention (outside of proximity, which makes it easier to compare categories or countries within a given category, but harder to compare a given country across multiple categories).

But check out what happens when we add those visual cues. First, to tell a story about a given category:

We can also use this approach to tell a story about a given country:

I think this is a particularly powerful way to approach telling multiple stories with data when presenting to a live audience, because it really allows you to pull their attention to where you want it as you talk through the interesting findings. But I could imagine a similar approach in a written report as well. There is definite benefit to be gained by repeating the visual: the audience orients themselves with it once and the details stay the same, just the point of emphasis and context you build around it with the story changes.
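The "same visual, different emphasis" idea can be sketched as a tiny Python helper that assigns one highlight color to the series you want to talk about and a muted grey to everything else. The `emphasis_colors` function, the country names, and the color names are all hypothetical; the color strings happen to be ones matplotlib understands, but any palette would do.

```python
def emphasis_colors(labels, focus, highlight="tab:blue", muted="lightgrey"):
    """Return one color per label: highlight for the focus, muted for the rest."""
    return [highlight if label == focus else muted for label in labels]


# Hypothetical series names; the original details were hidden for confidentiality.
countries = ["Country A", "Country B", "Country C"]

# One list of colors per story, all applied to the same underlying chart.
for focus in countries:
    print(focus, emphasis_colors(countries, focus))
```

Each list of colors can be passed to the same chart, so the layout the audience has already learned never changes; only the point of emphasis does.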