Data Structure Heatmaps With Excel

I put up a casual post about a data visualization project a while back. Apparently a lot of people are interested in (or fighting with?) Excel for heatmaps, because it keeps getting a LOT of traffic.  Maybe I should capture that traffic and create a heatmap?  Humor.. 

The same project for the client-who-must-not-be-named continues.  They LOVE heat maps.  Every data set, they want a heat map. I've said over and over that it's only relevant when we're making a consistent single comparison, but OK, you wanna heatmap...

I was looking at a very high-level summary, showing pricing across 8 product lines, with 30 varying tiers, for 6 different regions.   (mumble mumble 30 x 6 = 180).  You wanna heatmap, we gotta heatmap (Select Range - Conditional Formatting - Color Scales). I suggest clicking that to see a larger version.

There are really two heatmaps here - the pricing in C:H, and a mirrored % set in J:O.  Just look at the macro-level color pattern - identical, right?

It bugged me a bit, though, because the range is just nuts - from $73.30 to $1,767.24.   Lots of stuff in the yellow middle.  It bugged me because the groups aren't necessarily interchangable, where the tiers might be.

So I tried something a little more elaborate: 

The punchline here, in terms of visual communication, is that this is the same data. If you can, look back and forth at the small versions of these pictures. It's a really different pattern. Wow, those Region 3 & Region 4 guys are good (green). What the heck is going on with Group E in Region 2, and Group F in Region 5. And so on, and so on.

My contention is, there's no way in heck to spot those comparisons by looking at the numbers. It's just not going to happen. Your brain isn't going to hang on to the relationships between 180 different numbers.

OK - quickly - HOW this was done with Conditional Formatting rules. First one - the whole range is a single comparison ("Rule"):

To make the second one - group-by-group - required selecting just the range for a group (e.g. C3:H3 for the first group), and applying conditional formatting for that range. Lather, rinse, repeat.

You can see the list of separate rules here:

Now, I suppose we could have an interesting argument about whether this is a good approach.  The color-coding between Groups is not absolutely consistent. $300 may come up RED in one group, and GREEN in another, and you aren't going to spot the really high prices for the set as a whole, or have a sense for which region or group is relatively low-priced or relatively high-priced. 

Well...yeah, that's why we did both heat maps, but let's leave that aside for a moment. 

The visual-communications question decision/question should be, does one of these color-coding approaches reveal more meaning in the data?  I think the second one does.

It's pretty simple - axiomatic, even.  Because we used additional structure in the data (the Group blocks), we revealed more about the data.  The one-big-block heatmap didn't access that structure, so of course it had nothing to say (visually) about that dimension of the data. 

FWIW I wouldn't have used the color-coding for the different groups (Columns A:B).  It detracts from the meaningful color-coding of the data.  Others' presentation decisions went into that choice. Oh well.

One non-obvious note - spending a little time getting consistent-sized columns, and finessing the gridlines, helps make this a more accurate mechanism for communication. 


PS auto-correct wants "heatmap" to be "heat map."  I don't buy it. Thoughts?