Friday, July 24, 2015

align against a common baseline


I've fallen behind on reading and posting about data visualization lately (my focus has been elsewhere). But I found myself with a few spare minutes yesterday afternoon and decided it was time to change that.

The first article in my Feedly was by FiveThirtyEight and the graph that appeared with it caused me to click for details. Here's the graph that caught my attention:

I like FiveThirtyEight's general approach when it comes to data visualization: straightforward and clutter-free, with emphasis on the story. My view is that the graph should never be what makes the data interesting; rather, it's the story that makes the data interesting. They seem to subscribe to this view as well.

In this case, the story is called out clearly at the top: Being Arrested Is Deadlier For African-Americans.

The accompanying visual is fine. But I think it can be made better by adhering to one recommendation I find myself often voicing to workshop participants: Think about what you want your audience to be able to easily compare. Put those things as physically close together as you can and align them along a common baseline.

With the current view, it's easiest to compare deaths for Whites by cause and, separately, deaths for African-Americans by cause. Yes, we can see (and read) that the yellow bars on the right are bigger than the red bars on the left (the point called out in the title), but note the bouncing back and forth your eyes do when comparing the bars across the two graphs. It's also hard to judge how much longer the yellow bars are vs. the red ones. Sure, we have the numbers there to help, but this means we have to do some mental math to decipher the differences. Why go through this work, when we can restructure the visual to avoid it?

To make it easier to compare deaths by cause for African-Americans vs. Whites, we can align both series along a common baseline. Here's what that looks like:

I made a few additional minor changes in this remake. The causes in the original graphs weren't sorted in decreasing order for either Whites or African-Americans (not sure why), so I reordered them by decreasing value for African-Americans (there should always be logic in the way you order your data). Where there was space, I pulled the data labels into the bars to reduce the visual clutter. I moved the subtitle into the x-axis label so that the words are right next to the data they describe. I didn't like the bold colors in the original visual, so I stripped color out of my remake entirely. (If you do want to use color here, I'd suggest different shades of the same color: red and yellow together are both so bright that it's hard to focus on one or the other.)
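For readers who build charts in code rather than Excel, the two ideas here (a deliberate sort order and a shared baseline for the things being compared) can be sketched in a few lines of plain Python. This is a minimal text-only sketch, not the method used for the remake. The category names, group names, and values are placeholders invented for illustration (they are not the figures from the article), and `order_by` and `render_paired_bars` are hypothetical helper names.

```python
# Placeholder data: names and values are invented for illustration only.
data = {
    "Cause A": {"Group 1": 12, "Group 2": 30},
    "Cause B": {"Group 1": 45, "Group 2": 60},
    "Cause C": {"Group 1": 20, "Group 2": 8},
}


def order_by(data, series):
    """Return category names sorted by one series, largest value first."""
    return sorted(data, key=lambda cat: data[cat][series], reverse=True)


def render_paired_bars(data, sort_series, width=30):
    """Render each category's bars on adjacent rows that share one baseline."""
    groups = list(next(iter(data.values())))
    max_value = max(v for row in data.values() for v in row.values())
    label_width = max(len(g) for g in groups)
    lines = []
    for cat in order_by(data, sort_series):
        lines.append(cat)
        for g in groups:
            value = data[cat][g]
            # Scale every bar against the same maximum so lengths are comparable.
            bar = "#" * round(value / max_value * width)
            # Padding the group label puts every "|" (the baseline) in one column.
            lines.append(f"  {g:<{label_width}} |{bar} {value}")
    return "\n".join(lines)


chart = render_paired_bars(data, sort_series="Group 2")
print(chart)
```

Because the two bars for each category sit on adjacent rows and start at the same column, the eye can compare their lengths directly instead of jumping between two separate charts.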

Another potential alternative with this data would be to use a slopegraph. Or so I thought. But I quickly abandoned this approach: there are too many criss-crossing lines at the lower values to allow space to label the data effectively. The following is what it looked like (note I didn't spend any time on the formatting or labeling once I realized this approach wouldn't work; if you're interested in seeing a completed example of a slopegraph in practice, check this out).

Also, while I love the idea of slopegraphs for group comparisons, in practice I've had mixed responses. Slopegraphs can be a little less intuitive than bars for data like this. It's also important to note that the slopegraph makes it easier to focus on the difference (via the lines connecting the various points), whereas bars make it easier to focus on absolute values. In this case, even if the data values had been such that the slopegraph would have worked, I think I still would prefer the bars when it comes to supporting the story that overall deaths per 100,000 arrests are higher for African-Americans compared to Whites.
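To make the difference-versus-absolute-value distinction a bit more concrete, here's a small Python sketch of the quantity each chart type puts front and center. The values are placeholders invented for illustration, and `slope_view` and `bar_view` are hypothetical names, not part of any charting library.

```python
# Placeholder values, invented for illustration only: (left group, right group).
rates = {
    "Cause A": (12, 30),
    "Cause B": (45, 60),
    "Cause C": (20, 8),
}


def slope_view(rates):
    """What a slopegraph draws attention to: the change from left to right."""
    return {cat: right - left for cat, (left, right) in rates.items()}


def bar_view(rates):
    """What bars draw attention to: each value's own length from the baseline."""
    return {cat: {"left": left, "right": right} for cat, (left, right) in rates.items()}


print(slope_view(rates))
print(bar_view(rates))
```

The underlying numbers are identical in both views; what changes is which quantity the reader picks up first.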

Meta-point: align the things you want your audience to compare along a common baseline!

To download the Excel file containing the graphs above, click here.


  1. Thanks Cole! I continue to appreciate your ability to make the simple even simpler to represent the data

  2. Nice. I like how you combined the two into one. It really brings out the contrast into focus.