Sunday, August 26, 2012

and the winner is...

A big thank you to everyone who participated in the data viz challenge earlier this month (and thanks for your patience in awaiting this recap). As you may recall, the challenge was to help a philanthropic organization communicate a bunch of data about their various affiliates. If you're interested in a refresher on the details, you can find the challenge post with the full description here.

In this post, in addition to announcing the winner, I'll show a quick recap and my reactions to each of the submissions.

Submission 1: Peter Osbourne
You can view Peter's full description of his thought process in the comments of the post linked above. His main point was that, depending on the story one wishes to tell, a summary metric like averages may do the trick. Below is a snapshot of his workbook (he added the columns after the yellow one; full workbook can be downloaded here). In his comments, he makes a great point about figuring out what the story is first and then determining what data you have that best supports it (vs. putting together data and then trying to form the story).

Submission 2: Jon Schwabish
Jon decided on an interactive Excel graphic (download available here), which allows you to toggle across the various affiliates to get relevant detail on each. I really like the simplicity of the visual design used here. Great use of preattentive attributes in the line graph to make the blue line stand out from the others.

Submission 3: Lubos Pribula
Lubos continued the interactive Excel dashboard trend (downloadable here). I like the use of color to visually tie the line graph to the tabular data below (though we should be careful about the red-green color combination, which can be difficult for those who are colorblind). I also like the embedded bar charts within the tables at the bottom, which allow you to quickly visually compare aggregate measures across the various affiliates.

Submission 4: Gautham
Gautham created a dashboard in Tableau (if you don't have Tableau, you can download Tableau Reader here; Gautham's dashboard can be downloaded here). This dashboard allows you to view a single affiliate at a time and see a visual of their total assets in bars and number of gifts and grants via lines. This is useful if you want to compare the number of gifts and grants, or get a sense of the trends over time for a specific affiliate.

Submission 5: Rupert Stechman
Rupert took an unconventional approach to his data viz and went old school with pen and paper (which I love!) and created a sort of heatmap showing net change in assets over time by affiliate. Here's what he came up with (his blog post is here):

AND THE WINNER IS... Submission 6: Jeff Shaffer
Jeff created both a Tableau dashboard (downloadable here) and an Excel dashboard (pictured below; downloadable here). He doesn't win because he submitted dashboards in multiple forms, but rather because his visual is the one the foundation said they could see themselves using.

Here's what the philanthropic organization said: "Thank you so much for trying to help us get a visual for our data. Your readers are much more skilled than I, and did some really interesting things with the data. I think Jeff Shaffer came closest to getting us something like what we need. His dashboard approach would be really useful in some instances."

Personally, I would have had a hard time choosing a winner (one reason I'm happy the philanthropic group made the decision for me!) - there are components I like from each of the visuals and I think each could work well, depending on what story you want to tell and who the audience is. This is a great reminder of how important those pieces are - it's really difficult to create the perfect visualization without a good understanding of what story we want to tell and who we want to tell it to. We should absolutely spend time up front establishing that (and coaching our colleagues and clients to do so) before we create the supporting visual.

9/4 UPDATE: Jeff graciously put together a "how to" for creating the dashboard above, which you can download here.

Cole's non-competing submission
And I of course couldn't help but build my own visualization of this data as well. I did not go the interactive dashboard route, because the description made it sound like it was important to understand the trends for a given affiliate while also being able to compare those to other affiliates (hard to do in a dashboard that focuses on one affiliate at a time, though a couple of the above submissions address this in different ways). Here's a snapshot of what I came up with (I just show 4 here, but this approach continues for each of the affiliates; the Excel file is downloadable here):

Thanks, all, for playing (and Jeff, my offer stands to have you write a guest blog post if you're interested!). Let me know if you think I should pose challenges like this again in the future!

Wednesday, August 22, 2012

how long it takes to get pregnant

I love when data viz and life intersect. This happened for me recently, when I came across the following visualization - it's from a post a couple of months ago on FlowingData.

How Long it Takes to Get Pregnant
Slightly modified from this post
The graph shows the odds of getting pregnant (y-axis) by the number of months one (or two, as would typically be the case here) tries to get pregnant. The different colored markers denote the age (I assume of the female) trying to conceive. This shows that 25-year-olds will nearly always get pregnant within a year of trying to conceive, and that this probability decreases the older you are.

How does this intersect life, you may ask? I had one empirical data point to add to the graph, denoted by the * at the (x, y) coordinate (5 months, 100%). Colored correctly, it would be somewhere between yellow and green.

For anyone who is still scratching their head to figure out what I'm talking about... 
I'm due in February!

Friday, August 10, 2012

evaluating word clouds

Word clouds created a bit of buzz when they first became popular a couple of years ago (or at least that's when I encountered them for the first time). Like the infographic, they have a bit of sex appeal that draws you in. As in the case of infographics, however, I often find that upon further evaluation they tend to be a letdown - full of fluff without much informative value.

While facilitating a workshop recently, I heard a horror story about someone who had tried to create a word cloud by hand (perhaps the scariest part of the story involved scaling text boxes one at a time). Lesson: in data viz (and in life), if you find yourself doing something tedious and repetitive like that, stop to reevaluate. At minimum, do a Google search. Even better if you can find a blog post or related article on the topic from someone who has encountered the same challenge before and identified an elegant solution.

In the case of word clouds, there are a number of applications you can use to generate them. Wordle is a popular free product, created by Jonathan Feinberg of IBM, that allows for quite a bit of customization of color, size, font, etc. (Note that if you upload your Wordle to the gallery, the data goes with it, though you can also opt for local-only word cloud generation.) Google Docs has a word cloud gadget within spreadsheets. There are a number of others, easily located via a Google search.

But before you start thinking about generating word clouds, let's continue our discussion on their efficacy. Their sexiness can draw you in. But is there value beyond that? I think it comes down to the use case. I've got one example for the negative and one for the affirmative.

Poor use of word clouds
First, let's take a look at an example from a Community Health Center. My understanding is that they employed a consultant to analyze some survey data from their clients. The consultant put together a report filled with pretty word clouds like this one:

Good service is... minutes? Part of the challenge in this case is that the connotation has been completely stripped away from the nouns, removing the sentiment behind the comments. Which is kind of the important part of the comments, in my opinion. But in reading the report, buried near the end of it, I found the following:

The consultants took the time to content-code the comments. These categories and their descriptions are much more useful for understanding what people value than the word cloud. With this info, we can direct action: we get an understanding of what's going well that we want to maintain, as well as potential areas for improvement. We could take this a step further by making the data visual, like this:

In this case, I think the simple bar chart is much more useful (in terms of both understanding the information and determining how to act on it) than the word cloud. Now let's look at a better use of word clouds.

Thoughtful use of word clouds
Caveat: this example came to me by way of the telephone game (I heard it from someone who heard it from someone), which means it's guaranteed that I don't have the details totally right. But I think this still serves well as an example of a good use of word clouds. The story goes: Apple stores obviously really value customer service. They use surveys to collect info about each store. Each day, they create a word cloud for each store based on customer comments. What they are looking for are 5 (I'm making that number up, I don't know what the real number is) specific words - things that are considered must-haves when it comes to customer service in their stores. It's when these [5] words don't show up prominently on the word cloud for a given store that a red flag is raised and some sort of action is taken.

This is what I would consider a thoughtful and actionable use of word clouds. If the required word doesn't appear, some sort of intervention happens.
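To make the idea concrete, here is a minimal Python sketch of that kind of check. It is only an illustration of the story as I heard it, not how Apple actually does it: the must-have words, the stopword list, and the "top n" cutoff for what counts as prominent are all assumptions I made up for the example.

```python
import re
from collections import Counter

# Hypothetical must-have words; the real list (and its length) is unknown.
MUST_HAVE = {"friendly", "helpful", "fast"}

# Tiny illustrative stopword list so filler words don't dominate the counts.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "to", "of", "it", "i"}


def top_words(comments, n=10):
    """Return the n most frequent non-stopword words across the comments."""
    words = []
    for comment in comments:
        for word in re.findall(r"[a-z']+", comment.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return [word for word, _ in Counter(words).most_common(n)]


def missing_must_haves(comments, must_have=MUST_HAVE, n=10):
    """Return must-have words that are NOT among the top n words (the red flags)."""
    prominent = set(top_words(comments, n))
    return sorted(set(must_have) - prominent)


if __name__ == "__main__":
    store_comments = [
        "The staff was friendly and helpful",
        "Friendly staff, fast service",
        "Helpful and friendly",
    ]
    # Any word printed here would trigger a follow-up for this store.
    print(missing_must_haves(store_comments, n=3))
```

A word cloud is the visual version of the same question: the biggest words are the most frequent ones, so a must-have word that is tiny or absent is the signal to act on.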

We can generalize this to the following: when you're considering using a word cloud, think about what you want your audience to know and what you want your audience to do. Then ask yourself if a word cloud will enable them to know and do those things.

And for goodness' sake, if you do use a word cloud - leverage some of the tools that exist - don't try to create it by hand!

Friday, August 3, 2012

data viz challenge

A participant from a past workshop recently reached out to me hoping for help visualizing some data she wants to communicate about affiliate community foundations to members of those organizations and her board. Rather than keep all of the fun for myself, I thought I would put this challenge out to you.

Here's the description of the problem faced:

Basically, we serve as a host or parent organization for several smaller, local foundations throughout our service area. What we would like to show is how the money-in (gifts) compare to the total assets and money-out (grants) in a given year, and also allow the audience to compare their foundation to the other affiliates.

The data is contained in several separate tables (attached), and includes not only the values in $ but also the number of gifts, funds and grants. This information is not critical to the story, but if there's a way to include it (possibly by merging a table and graph, as I saw on your site) there are some specific cases where having this information will be useful. I can see where it will be hard to include it though, without being cluttered.

I'm open to whatever suggestions you have. I attached a pencil drawing of what I was picturing, but I'm not dead set on it being a certain way. I thought the format would be good because it allows you to visualize the relative size of the incoming to the outgoing for a given year, but also see how it changes over time, and how it compares to the other organizations, fairly easily. If there's a way to include the data labels in the bars or something, that would be good.

One major obstacle may be that the values vary so much (from something like $28 to almost $5 million). I don't know if this will make it impossible to do visually. It will be fine to drop the cents off the values.

[8/7 update: see comment below for additional context.]

Here's the sketch provided to start to get the creative juices flowing:

You can access the full data here.

Your challenge is this: come up with a straightforward way to visualize this data that will allow for the main desired comparisons outlined above by Wednesday, August 15th. When complete, you can leave a comment with a link to your visual, or email it to me directly (along with any comments you'd like me to post with it), and I'll put it into Dropbox and create a comment for you with the link.

What will you win? A chance to help a philanthropic organization better communicate with data. Eternal notoriety. Oh, and I'll invite the creator of my favorite to write a guest blog post. Ready... set... GO!