Monday, May 30, 2011

storytelling with data: don't eat farmed fish

I spotted the following video at one of my favorite blogs, FlowingData. I consider it a great example of storytelling with data: straightforward, no extraneous information, attention-grabbing without being overly flashy, moderate use of stats to back up key points.

Eating Fish from Nigel Upchurch on Vimeo.

Personally, I avoid farmed fish on the basis of their health profile. The video calls out other fish, but farmed salmon are also often fed grains, rendering the fish fattier yet with a lower level of healthy omega-3 fats than their wild counterparts (farming also introduces things like antibiotics and pesticides that I'd rather not consume). As the video points out, there are sustainability concerns as well.

Saturday, May 28, 2011

information discovery: education & income by religion

A couple of weeks ago, the New York Times ran an article, Is Your Religion Your Financial Destiny?, which included the following infographic.

As my friend Dave, who shared the article with me, pointed out, the color isn't necessarily informative, but it is an interesting way of carrying the gridlines across the page and certainly makes the visual eye-catching. I find the data it shows very interesting.

Along the y-axis on the left, we see the percent of the population with household income above $75K. The x-axis across the top shows the percent who graduate from college. The dots plotted on the graph denote the various religions. You can quickly see that, in general, the higher the percent graduating from college, the higher the percent with income above $75K, which makes sense: one would expect a positive correlation between education and income.

As my eyes scroll over the graphic, taking in the information, a couple of things make an impression. First, the wide spread: less than 10% of Jehovah's Witnesses graduate from college (many don't finish high school, according to my mother's empirical evidence from the small sample she knows from her neighborhood) and their incomes reflect this. On the other end of the spectrum, over 70% of Hindus have a college degree. Also interesting: the places where the dots across the page do not follow a monotonically increasing line. For example, a greater proportion of Buddhists graduate from college than Presbyterians, yet a smaller proportion of Buddhist households have income over $75K. The same phenomenon exists between Reform Jews and Hindus. The Times refers to these anomalies as "less affluent than they are educated" and points to cultural influences and possibly discrimination as the root cause.
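If you wanted to find these "less affluent than they are educated" pairs programmatically rather than by eye, a minimal sketch might look like the following. The function name `find_inversions` and the group values are my own placeholders for illustration; they are not numbers from the Times graphic.

```python
def find_inversions(groups):
    """Return (a, b) pairs where group a has a higher college-graduation
    percentage than group b but a lower percentage of high-income households.

    `groups` maps a name to a (pct_college_grads, pct_high_income) tuple.
    """
    inversions = []
    for name_a, (edu_a, inc_a) in groups.items():
        for name_b, (edu_b, inc_b) in groups.items():
            # a is more educated than b, yet less affluent
            if edu_a > edu_b and inc_a < inc_b:
                inversions.append((name_a, name_b))
    return inversions


# Placeholder values for illustration only (NOT from the Times data).
sample = {
    "Group A": (20, 15),
    "Group B": (40, 35),
    "Group C": (50, 30),
    "Group D": (70, 45),
}

print(find_inversions(sample))
```

Each pair returned marks a spot where the dots break the monotonically increasing pattern. With the placeholder values, Group C is the only group that is more educated but less affluent than another (Group B).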

This isn't a case where the infographic is meant to highlight a single takeaway or recommendation; rather, it invites the audience to explore and draw their own conclusions: a tool for information discovery. Happy exploring!

Tuesday, May 24, 2011

secondary y-axis

The question of the secondary y-axis comes up every time I teach a data visualization course. As you've probably deduced by now, my general bias is for ease of interpretation. The challenge with adding a second y-axis is that it's not always clear which data series belongs to which axis. That said, I think there are ways to do this that get over this hurdle.

Earlier today, I was reading an article from the latest McKinsey Quarterly shared by a colleague (thanks Andrew, if you're reading this!). The topic: big data. Of particular interest to me, given that I work in an analytical field, was the forecast shortage of analytical talent to make use of the growing world of data, something McKinsey raised in the article as a competitive advantage. In addition to interesting content, there were a number of graphs included in the article, some good, some not so good. It was one of these less-than-stellar views that acted as the impetus for this post: the following.

I think this visual is mostly pretty good. The takeaway is described clearly and the graph reinforces it visually.

But it could be a little better.

My one gripe, as you can perhaps anticipate, is with the blue circles showing Persons Unemployed across the bottom. Because the blue of the circles is darker than the bars, your eyes are drawn there by a visual cue that says "pay attention to this, it's the most important", whereas I'm not sure that's the case. Based on the takeaway at the top, the decrease in spending and lack of increase in unemployment seem to be of roughly equal importance. So let's make them look that way in the visual. Also, with the numbers embedded in the circles, you have to read them and think about what they mean, which takes more effort than if this were shown visually.

My thought when I looked at this was that both of these issues could be solved by use of a secondary y-axis. Here's my makeover:

As mentioned, the main challenge with a secondary y-axis is making it clear which series belongs to which axis. One way around this is to not show the second y-axis, but rather label the series directly. In this case, I actually didn't show either axis.
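For readers who build charts in code, here is a rough sketch of the same idea using matplotlib's `twinx()`: two series on separate y-scales, both axes hidden, and each series labeled directly. The function name `dual_axis_chart`, the series names, the colors, and the numbers are all my own placeholders; my makeover wasn't produced with this code and the values are not McKinsey's.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove for interactive use
import matplotlib.pyplot as plt


def dual_axis_chart(periods, spending, unemployed):
    """Bars on the primary y-axis, a line on a hidden secondary y-axis,
    with each series labeled directly instead of via axis or legend."""
    fig, ax_left = plt.subplots(figsize=(7, 4))
    ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

    x = list(range(len(periods)))
    ax_left.bar(x, spending, color="#9ecae1")
    ax_right.plot(x, unemployed, color="#636363", marker="o")
    ax_right.set_ylim(0, max(unemployed) * 1.5)  # keep the line in view

    # Direct labels: the reader never has to ask which axis a series uses.
    ax_left.text(x[0], spending[0], "Spending",
                 ha="center", va="bottom", color="#3182bd")
    ax_right.text(x[-1], unemployed[-1], "  Unemployed",
                  ha="left", va="center", color="#636363")

    ax_left.set_xticks(x)
    ax_left.set_xticklabels(periods)

    # Hide both y-axes and the surrounding frame.
    for ax in (ax_left, ax_right):
        ax.yaxis.set_visible(False)
        for side in ("left", "right", "top"):
            ax.spines[side].set_visible(False)
    return fig, ax_left, ax_right


# Placeholder data for illustration only.
fig, ax_left, ax_right = dual_axis_chart(
    ["Q1", "Q2", "Q3"], [10, 8, 6], [5, 5, 5]
)
fig.savefig("dual_axis.png", bbox_inches="tight")
```

With both axes hidden, you would probably also want to label the data points themselves so the reader can still recover magnitudes.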

What's your view on the secondary y-axis? Should it be embraced? Verboten? Leave a comment with your thoughts.

Saturday, May 7, 2011

CEP chart redesign

A few weeks ago, the Center for Effective Philanthropy (CEP) asked for some data visualization help via a post to their blog. Around the same time, they reached out to me after hearing about my recent presentation at the Grant Managers Network Conference. 

The CEP provides foundations and other philanthropic organizations with comparative data to understand how they perform across various aspects compared to other foundations. Their challenge is presenting this rich comparative data back to their audience in a straightforward fashion. Here is an example of the visual they've been using:


In response to feedback they have received that the chart is difficult to understand, the CEP decided to revisit the design. Their requirements for the visual are as follows; the chart must:
  • Be flexible enough to display segmentation of overall data and trend data;
  • Simultaneously display both an absolute scale and relative results; and
  • Display comparative context so that one funder can consider its relative results compared to the database of others' results.

This was a challenge I was excited to be enlisted to help solve. I had two reactions to the current CEP visual:
  1. It's complicated. A lot of information is being presented on this graph. Because it looks complex, it may turn off some of the audience. Also, there are many different comparisons that can be made with the visual, which can be overwhelming. My recommendation: reduce the amount of information in the visual; prioritize the key pieces.
  2. There's a lot to remember. On average, people can keep about 4 pieces of information in their short-term memory at a given time. With so many different shapes and colors, the reader constantly has to refer back and forth between the graph and the legend on the right to decipher the graph. My recommendation: label the graph directly to reduce the interpretation burden.

My first approach was to preserve the fundamental design of the chart, but streamline it by eliminating items that don't add informative value and reducing the number of comparisons in the main visual. I ended up with the following.


I wasn't particularly happy with this makeover. Though the visual has been simplified (and second-order comparisons moved to the table below the graph), it still looks complicated. I'm afraid we may lose some of the audience with this visual: like the original, it is intimidating. I decided to try a completely different approach. Here's what I came up with:


Bar charts are easy for people to read. They are common, which means no learning curve for the audience to get at the information. People's eyes can easily see the difference between the end points of the bars, making comparisons between values easy.

I shared this redesign with the CEP. While they liked the straightforward approach, they worried that it underemphasized the comparative data, which is the core of their value proposition. Based on this feedback, I tweaked the redesign to the following.


This design shifts the emphasis and focus to the comparative values by drawing the audience's eye with color. The vertical lines at the cohort funders and all funders median values make for an easy comparison between a given foundation's trend over time and programs and the CEP's comparative data.
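As a rough sketch of how this kind of view could be built in code, the following uses matplotlib's `barh` for the muted bars and `axvline` for the colored benchmark lines. The function name `bars_with_benchmarks`, the labels, the colors, and all of the values are hypothetical placeholders, not CEP data, and this is not how the redesign itself was produced.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove for interactive use
import matplotlib.pyplot as plt


def bars_with_benchmarks(labels, values, benchmarks):
    """Horizontal gray bars with colored vertical lines at benchmark values.

    `benchmarks` maps a label to an (x_value, color) tuple.
    """
    fig, ax = plt.subplots(figsize=(7, 3))
    ax.barh(labels, values, color="#bdbdbd")  # muted bars recede visually
    ax.invert_yaxis()  # first label at the top

    lines = []
    for name, (x, color) in benchmarks.items():
        # Color is reserved for the comparative values to draw the eye.
        lines.append(ax.axvline(x, color=color, linewidth=2))
        # Label each line directly, just above the plot area.
        ax.text(x, 1.02, name, color=color, ha="center", va="bottom",
                transform=ax.get_xaxis_transform())
    return fig, ax, lines


# Placeholder values for illustration only (NOT CEP data).
fig, ax, lines = bars_with_benchmarks(
    ["2008", "2009", "2010"],
    [5.1, 5.4, 5.6],
    {"Cohort median": (5.3, "tab:orange"),
     "All funders median": (5.0, "tab:blue")},
)
fig.savefig("benchmarks.png", bbox_inches="tight")
```

If the two medians sit close together, their labels will overlap; staggering the label heights is one way around that.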

The CEP had an advisory committee meeting yesterday, where they reviewed my redesigns as well as some alternate views that they created in house. I'm very interested to hear what approach they will take!

Leave a comment if you have feedback on the redesign, or ideas on alternate ways of visualizing this data.

Tuesday, May 3, 2011

cool real time data capture and display

Today's NYT online features an interesting interactive visual on the primary US news topic of the moment - the death of Osama Bin Laden. The Times poses two questions: How much of a turning point in the war on terror will Bin Laden's death represent? and What is your emotional response?

Readers can indicate how they feel about these questions along a scale ranging from insignificant to significant (y-axis) and from negative to positive (x-axis). In addition to picking a position, the audience can input comments, which pop up on the visual as new content is added.

I love that this doesn't pick a position, but instead lets users generate the content. At the time of this blog post, the upper right quadrant (positive, significant) is the most densely populated (and I imagine this will continue to be the case). A quick read through some of the comments (possible by mousing over the cells) shows what is often true: the outliers are as interesting as the predominant trend, in some cases more so.

What other ways could we use interactive real time visuals like this? Leave a comment with your thoughts.