Friday, July 24, 2015

align against a common baseline

Registration for upcoming storytelling with data public workshops in NYC and Los Angeles is currently open here. Stay tuned for details on fall sessions to be scheduled in Seattle and SF.

I've been failing when it comes to keeping up with reading and posting on data visualization-related stuff lately (my focus has been elsewhere). But I found myself with a few spare minutes yesterday afternoon and decided it was time to change that.

The first article in my Feedly was by FiveThirtyEight and the graph that appeared with it caused me to click for details. Here's the graph that caught my attention:


I like FiveThirtyEight's general approach when it comes to data visualization: straightforward and clutter free, with emphasis on the story. My view is that the graph should never be what makes the data interesting; rather, it's the story that makes the data interesting. They seem to subscribe to this view as well.

In this case, the story is called out clearly at the top: Being Arrested Is Deadlier For African-Americans.

The accompanying visual is fine. But I think it can be made better by adhering to one recommendation I find myself often voicing to workshop participants: Think about what you want your audience to be able to easily compare. Put those things as physically close together as you can and align them along a common baseline.

With the current view, it's easiest to compare deaths for Whites by cause and, separately, deaths for African-Americans by cause. Yes, we can see (and read) that the yellow bars on the right are bigger than the red bars on the left (the point called out in the title), but note the bouncing back and forth your eyes do when comparing the bars across the two graphs. It's also hard to judge how much longer the yellow bars are vs. the red ones. Sure, we have the numbers there to help, but this means we have to do some mental math to decipher the differences. Why go through this work, when we can restructure the visual to avoid it?

To make it easier to compare deaths by cause for African-Americans vs. Whites, we can align both series along a common baseline. Here's what that looks like:


I made a few additional minor changes in this remake. The original graphs weren't monotonically decreasing in order of either White or African-American death cases (not sure why), so I changed the ordering of the data here so it would be, ordering the causes of death from most to fewest deaths for African-Americans (there should always be logic in the way you order your data). Where there was space, I pulled the data labels into the bars to reduce the visual clutter. I pulled the subtitle instead into the x-axis label so that the words are right next to the data they describe. I didn't like the bold colors in the original visual, so I stripped color out of my remake entirely. (If you do want to use color here, I'd suggest different shades of the same color: red and yellow together are both so bright that it's hard to focus on one or the other.)
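If you work outside of Excel, the same two ideas (order the categories with some logic, then start every bar from one shared baseline) can be sketched in a few lines of Python. This is only a rough illustration, not how the remake above was built: the cause names and values are hypothetical placeholders rather than the figures from the FiveThirtyEight graph, and the function names are my own.

```python
# Hypothetical placeholder values, NOT the figures from the original graph.
# Each entry maps a cause to (first_group_rate, second_group_rate).
PLACEHOLDER_RATES = {
    "Cause A": (4.0, 6.0),
    "Cause B": (9.0, 12.0),
    "Cause C": (2.0, 1.5),
}


def order_by_second_group(rates):
    """Return (cause, a, b) rows sorted by the second group's value, largest first."""
    return sorted(
        ((cause, a, b) for cause, (a, b) in rates.items()),
        key=lambda row: row[2],
        reverse=True,
    )


def render_paired_bars(rates, group_names=("Group A", "Group B"), scale=2):
    """Draw both groups' bars for each cause directly under one another.

    Every bar starts in the same text column (the common baseline), so
    lengths can be compared by eye without doing mental math on the labels.
    """
    label_width = max(len(name) for name in group_names)
    lines = []
    for cause, a, b in order_by_second_group(rates):
        lines.append(cause)
        for name, value in zip(group_names, (a, b)):
            bar = "#" * round(value * scale)  # bar length proportional to value
            lines.append(f"  {name:<{label_width}} |{bar} {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_paired_bars(PLACEHOLDER_RATES))
```

Because the two bars for each cause sit right next to each other and share a starting column, the comparison you care about (one group vs. the other, cause by cause) is the easy one to make.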

Another potential alternative with this data would be to use a slopegraph. Or so I thought. But I quickly abandoned this approach: there are too many criss-crossing lines at the lower values to allow space to label the data effectively. The following is what it looked like (note I didn't spend any time on the formatting or labeling once I realized this approach wouldn't work; if you're interested in seeing a completed example of a slopegraph in practice, check this out).



Also, while I love the idea of slopegraphs for group comparisons, in practice I've had mixed responses. Slopegraphs can be a little less intuitive than bars for data like this. It's also important to note that the slopegraph makes it easier to focus on the difference (via the lines connecting the various points), whereas bars make it easier to focus on absolute values. In this case, even if the data values had been such that the slopegraph would have worked, I think I still would prefer the bars when it comes to supporting the story that overall deaths per 100,000 arrests are higher for African-Americans compared to Whites.

Meta-point: align the things you want your audience to compare along a common baseline!

To download the Excel file containing the graphs above, click here.

Wednesday, June 3, 2015

audience, audience, audience

I sometimes feel a little like a broken record when I talk about communicating with data. My latest oft-repeated word is audience. We must keep our audience in mind throughout the design process and in general, try to make things easy on them. I spent a little time on this topic in a webinar for TechChange yesterday and thought I'd turn some of my notes into a quick blog post, which is what you'll find below.

When it comes to audience, I often have workshop participants do an exercise where I encourage them to identify a specific person they are communicating to. While it isn't always the reality, designing with a specific person in mind can keep us from falling into the "mixed audience" trap. If you are communicating to a "mixed audience," it's easy to treat them as a glob and not recognize that the mixed group is made up of individuals. In fact, it's surprisingly easy to make a data visualization (or the broader communication in which a data visualization sits) without ever pausing to think about the person on the other end of it. When it comes to communicating with data, my view is that we should not design for ourselves or our work or project. Rather, we should design for our audience. Always.


One benefit of identifying a specific audience is that doing so allows you to reflect on who they are and what drives them. What do they care about? What motivates them? What keeps them up at night? This is helpful for structuring your overall message in a way they will be receptive to. If you can identify what motivates your audience, you can think about how to frame what you need them to know or do in terms of those motivating factors, improving your odds for successful communication.

Beyond that, there are important things to know about how your audience sees that you can use to your advantage when creating visuals. These are the lessons I've more traditionally focused on in my workshops and here on this blog. Identify and eliminate clutter or things that aren't adding informative value. Leverage preattentive attributes like color, size, and placement on the page to signal to your audience where to look and create a visual hierarchy of information. (I already sound like a broken record on many of these topics, so won't repeat more of that here today!)

In the Q&A portion of any workshop or presentation, my broken record player of audience, audience, audience tends to run on repeat. Considering our audience can help us answer many of the design questions we face: What colors will work well? When does enough information become too much information? Will an image or video be appropriate? When do I need to add more context or explain in greater detail? When you find yourself facing questions like these, pause to consider your audience. Who are they and what will work best for them?

Meta-lesson: keep your audience in mind throughout the design process; designing with them in mind will set you up for success when communicating with data!

Friday, May 29, 2015

dogfood with data

My husband and I were watching TV one evening last week. One commercial caught my attention. It was a commercial for Eukanuba dog food.

I do not have a dog.

Still, there was something about the combination of music and video and text with a bit of data that left an impression. As a side note, I find it very interesting that because dogs have shorter lives, life-long studies are possible in a much shorter timeframe than for humans.

When I was searching for the commercial today (more than a week since seeing it), I did not remember the specific stats. But I did remember the message: their study showed dogs treated well on a diet of Eukanuba live longer.

I often get asked about the inclusion of pictures and videos when it comes to presentations in general. For me, the thing to think about is whether that picture or video will help you make your point and help that point stick with your audience.

Along those lines, I find this commercial to be an excellent example of storytelling with data. Enjoy!