Storytelling with Data


by Cole Nussbaumer Knaflic


  Your audience (without other visual cues) will typically look at your visual starting at the top left and zigzagging in “z” shapes. This means they will encounter the top of your graph first. If the biggest category is the most important, think about putting that first and ordering the rest of the categories in decreasing numerical order. Or if the smallest is most important, put that at the top and order by ascending data values.
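  As a minimal sketch of this ordering advice, the snippet below sorts categories by value before they are handed to a charting tool. The function name `order_categories` and the sample counts are hypothetical and are not taken from the book.

```python
def order_categories(data, largest_first=True):
    """Return (label, value) pairs sorted by value.

    largest_first=True puts the biggest category at the top of the list;
    pass False when the smallest category is the one that matters most.
    """
    return sorted(data.items(), key=lambda item: item[1], reverse=largest_first)


# Hypothetical category counts, used only for illustration.
responses = {"Email": 42, "Phone": 17, "Chat": 63, "In person": 8}

for label, value in order_categories(responses):
    print(f"{label:<10} {value}")
```

  Sorting the data itself, rather than relying on a chart's default order, keeps the most important category at the top left, where the reader is likely to look first.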

  For a specific example about the logical ordering of data, check out case study 3 in Chapter 9.

  Stacked horizontal bar chart

  Similar to the stacked vertical bar chart, stacked horizontal bar charts can be used to show the totals across different categories but also give a sense of the subcomponent pieces. They can be structured to show either absolute values or sum to 100%.

  I find this latter approach can work well for visualizing portions of a whole on a scale from negative to positive, because you get a consistent baseline on both the far left and the far right, allowing for easy comparison of the left-most pieces as well as the right-most pieces. For example, this approach can work well for visualizing survey data collected along a Likert scale (a scale commonly used in surveys that typically ranges from Strongly Disagree to Strongly Agree), as shown in Figure 2.19.

  Figure 2.19 100% stacked horizontal bar chart
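  The step from absolute values to a chart that sums to 100% is a per-row normalization. The sketch below assumes raw response counts ordered from Strongly Disagree to Strongly Agree. The survey items and counts are invented for illustration and are not the data behind Figure 2.19.

```python
LIKERT = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]


def to_percent_shares(counts):
    """Convert raw counts for one bar into percentages of that bar's total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize a row with no responses")
    return [100 * count / total for count in counts]


# Hypothetical counts per survey item, ordered as in LIKERT.
survey = {
    "Survey item A": [5, 10, 20, 45, 20],
    "Survey item B": [10, 15, 25, 30, 20],
}

for item, counts in survey.items():
    shares = to_percent_shares(counts)
    print(item, " | ".join(f"{share:.0f}%" for share in shares))
```

  Because every row sums to 100, the left-most segments share a baseline at 0% and the right-most segments share one at 100%, which is what makes both ends easy to compare.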

  Area

  I avoid most area graphs. Human eyes don’t do a great job of attributing quantitative value to two-dimensional space, which can render area graphs harder to read than some of the other types of visual displays we’ve discussed. I make one exception: when I need to visualize numbers of vastly different magnitudes. A square has both height and width, whereas a bar has only height or width, and this second dimension lets you show such numbers more compactly than is possible with a single dimension, as shown in Figure 2.20.

  Figure 2.20 Square area graph
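  One way to see why the second dimension is more compact: if the area of each square is proportional to its value, the side length grows only with the square root of the value. The sketch below illustrates that arithmetic. The function `square_sides`, the `max_side` parameter, and the amounts are assumptions made for the example.

```python
import math


def square_sides(values, max_side=10.0):
    """Scale side lengths so that each square's AREA is proportional to its value."""
    largest = max(values.values())
    return {
        label: max_side * math.sqrt(value / largest)
        for label, value in values.items()
    }


# Hypothetical quantities spanning two orders of magnitude.
amounts = {"Small": 100, "Medium": 2_500, "Large": 10_000}

for label, side in square_sides(amounts).items():
    print(f"{label:<7} side = {side:.1f}")
```

  Here a 100-fold difference in value becomes a 10-fold difference in side length, whereas a bar encoding the same values would need to be 100 times longer.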

  Other types of graphs

  What I’ve covered up to this point are the types of graphs I find myself commonly using. This is certainly not an exhaustive list. However, they should meet the majority of your everyday needs. Mastering the basics is imperative before exploring novel types of data visualization.

  There are many other types of graphs out there. When it comes to selecting a graph, first and foremost, choose a graph type that will enable you to clearly get your message across to your audience. With less familiar types of visuals, you will likely need to take extra care in making them accessible and understandable.

  Infographics

  Infographic is a term that is frequently misused. An infographic is simply a graphical representation of information or data. Visuals labeled as infographics run the gamut from fluffy to informative. On the inadequate end of the spectrum, they often include elements like garish, oversized numbers and cartoonish graphics. These designs have a certain visual appeal and can seduce the reader. On second glance, however, they appear shallow and leave a discerning audience dissatisfied. Here, the description of “information graphic,” though often used, is not appropriate. On the other end of the spectrum are infographics that live up to their name and actually inform. There are many good examples in the area of data journalism (for example, the New York Times and National Geographic).

  There are critical questions information designers must be able to answer before they begin the design process. These are the same questions we’ve discussed when it comes to understanding the context for storytelling with data. Who is your audience? What do you need them to know or do? It is only after the answers to these questions can be succinctly articulated that an effective method of display that will best aid the message can be chosen. Good data visualization—infographic or otherwise—is not simply a collection of facts on a given topic; good data visualization tells a story.

  To be avoided

  We’ve discussed the visuals that I use most commonly to communicate data in a business setting. There are also some specific graph types and elements that you should avoid: pie charts, donut charts, 3D, and secondary y-axes. Let’s discuss each of these.

  Pie charts are evil

  I have a well-documented disdain for pie charts. In short, they are evil. To understand how I arrived at this conclusion, let’s look at an example.

  The pie chart shown in Figure 2.21 (based on a real example) shows market share across four suppliers: A, B, C, and D. If I asked you to make a simple observation—which supplier is the largest based on this visual—what would you say?

  Figure 2.21 Pie chart

  Most people will agree that “Supplier B,” rendered in medium blue at the bottom right, appears to be the largest. If you had to estimate what proportion supplier B makes up of the overall market, what percent might you estimate?

  35%?

  40%?

  Perhaps you can tell by my leading questioning that something fishy is going on here. Take a look at what happens when we add the numbers to the pie segments, as shown in Figure 2.22.

  Figure 2.22 Pie chart with labeled segments

  “Supplier B”—which looks largest, at 31%—is actually smaller than “Supplier A” above it, which looks smaller.

  Let’s discuss a couple of issues that pose a challenge for accurately interpreting this data. The first thing that catches your eye (and suspicion, if you’re a discerning chart reader) is the 3D and strange perspective that’s been applied to the graph, tilting the pie and making the pieces at the top appear farther away and thus smaller than they actually are, while the pieces at the bottom appear closer and thus bigger than they actually are. We’ll talk more about 3D soon, but for now I’ll articulate a relevant data visualization rule: don’t use 3D! It does nothing good, and can actually do a whole lot of harm, as we see here with the way it skews the visual perception of the numbers.

  Even when we strip away the 3D and flatten the pie, interpretation challenges remain. The human eye isn’t good at ascribing quantitative value to two-dimensional space. Said more simply: pie charts are hard for people to read. When segments are close in size, it’s difficult (if not impossible) to tell which is bigger. When they aren’t close in size, the best you can do is determine that one is bigger than the other, but you can’t judge by how much. To get over this, you can add data labels as has been done here. But I’d still argue the visual isn’t worth the space it takes up.

  What should you do instead? One approach is to replace the pie chart with a horizontal bar chart, as illustrated in Figure 2.23, organized from greatest to least or vice versa (unless there is some natural ordering to the categories that makes sense to leverage, as mentioned earlier). Remember, with bar charts, our eyes compare the end points. Because they are aligned at a common baseline, it is easy to assess relative size. This makes it straightforward to see not only which segment is the largest, for example, but also how incrementally larger it is than the other segments.

  Figure 2.23 An alternative to the pie chart
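  To make the bar-chart alternative concrete, here is a rough text-only sketch that sorts the shares from greatest to least, draws each bar from a common baseline, and appends a total line to preserve the sense of a whole. Only Supplier B's 31% comes from the text. The other three shares are invented so that the total is 100%, and the function `text_bar_chart` is hypothetical.

```python
def text_bar_chart(shares, width=40):
    """Render percentage shares as left-aligned text bars, largest first."""
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    scale = width / max(shares.values())  # the longest bar fills `width` characters
    lines = []
    for label, value in ordered:
        bar = "#" * round(value * scale)
        lines.append(f"{label:<12}|{bar} {value}%")
    # A total line keeps the parts-of-a-whole idea that the pie conveyed.
    lines.append(f"{'Total':<12}|{sum(shares.values())}%")
    return "\n".join(lines)


# Supplier B (31%) is from the text; the other values are assumptions.
market_share = {"Supplier A": 34, "Supplier B": 31, "Supplier C": 9, "Supplier D": 26}

print(text_bar_chart(market_share))
```

  Because every bar starts at the same baseline, the small gap between the two largest suppliers shows up as a visible difference in bar length instead of a hard-to-judge difference in angle.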

  One might argue that you lose something in the transition from pie to bar. The unique thing you get with a pie chart is the concept of there being a whole and, thus, parts of a whole. But if the visual is difficult to read, is it worth it? In Figure 2.23, I’ve tried to address this by showing that the pieces sum to 100%. It isn’t a perfect solution, but something to consider. For more alternatives to pie charts, check out case study 5 in Chapter 9.

  If you find yourself using a pie chart, pause and ask yourself: why? If you’re able to answer this question, you’ve probably put enough thought into it to use the pie chart, but it certainly shouldn’t be the first type of graph that you reach for, given some of the difficulties in visual interpretation we’ve discussed here.

  While we’re on the topic of pie charts, let’s look quickly at another “dessert visual” to avoid: the donut chart.

  With pies, we are asking our audience to compare angles and areas. With a donut chart, we are asking our audience to compare one arc length to another arc length (for example, in Figure 2.24, the length of arc A compared to arc B). How confident do you feel in your eyes’ ability to ascribe quantitative value to an arc length?

  Figure 2.24 Donut chart

  Not very? That’s what I thought. Don’t use donut charts.

  Never use 3D

  One of the golden rules of data visualization goes like this: never use 3D. Repeat after me: never use 3D. The only exception is if you are actually plotting a third dimension (and even then, things get really tricky really quickly, so take care when doing this)—and you should never use 3D to plot a single dimension. As we saw in the pie chart example previously, 3D skews our numbers, making them difficult or impossible to interpret or compare.

  Adding 3D to graphs introduces unnecessary chart elements like side and floor panels. Even worse than these distractions, graphing applications do some pretty strange things when it comes to plotting values in 3D. For example, in a 3D bar chart, you might think that your graphing application plots the front of the bar or perhaps the back of the bar. Unfortunately, it’s often even less straightforward than that. In Excel, for example, the bar height is determined by an invisible tangent plane intersecting the corresponding height on the y-axis. This gives rise to graphs like the one shown in Figure 2.25.

  Figure 2.25 3D column chart

  Judging by Figure 2.25, how many issues were there in January and February? I’ve plotted a single issue for each of these months. However, the way I read the chart, if I compare the bar height to the gridlines and follow it leftward to the y-axis, I’d estimate visually a value of maybe 0.8. This is simply bad data visualization. Don’t use 3D.

  Secondary y-axis: generally not a good idea

  Sometimes it’s useful to be able to plot data that is in entirely different units against the same x-axis. This often gives rise to the secondary y-axis: another vertical axis on the right-hand side of the graph. Consider the example shown in Figure 2.26.

  Figure 2.26 Secondary y-axis

  When interpreting Figure 2.26, it takes some time and reading to understand which data should be read against which axis. Because of this, you should avoid the use of a secondary or right-hand y-axis. Instead, think about whether one of the following approaches will meet your needs:

  Don’t show the second y-axis. Instead, label the data points that belong on this axis directly.

  Pull the graphs apart vertically and have a separate y-axis for each (both along the left) but leverage the same x-axis across both.

  Figure 2.27 illustrates these options.

  Figure 2.27 Strategies for avoiding a secondary y-axis
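  The two strategies can be roughly sketched with matplotlib, assuming it is installed. This is not how Figure 2.27 was produced. The function names and the quarterly numbers are hypothetical, and only the axis titles “Revenue” and “# of Sales Employees” come from the text. The sketch stacks two panels that share one x-axis, each with its own left-hand y-axis, and also labels the second series directly.

```python
def validate_series(periods, *series):
    """Raise ValueError unless every series has exactly one value per period."""
    for values in series:
        if len(values) != len(periods):
            raise ValueError("each series needs one value per period")


def plot_split_panels(periods, revenue, headcount, path="split_panels.png"):
    """Draw revenue and headcount in two stacked panels that share an x-axis."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    validate_series(periods, revenue, headcount)
    positions = list(range(len(periods)))

    # Two rows, one column, with the x-axis shared across both panels.
    fig, (ax_rev, ax_emp) = plt.subplots(2, 1, sharex=True, figsize=(6, 5))
    ax_rev.bar(positions, revenue)
    ax_rev.set_ylabel("Revenue")

    ax_emp.plot(positions, headcount, marker="o")
    ax_emp.set_ylabel("# of Sales Employees")
    # Direct labels on each point, in the spirit of the first strategy.
    for x, y in zip(positions, headcount):
        ax_emp.annotate(str(y), (x, y), textcoords="offset points",
                        xytext=(0, 6), ha="center")

    ax_emp.set_xticks(positions)
    ax_emp.set_xticklabels(periods)
    fig.savefig(path)
    plt.close(fig)
    return path


# Hypothetical quarterly data, for illustration only.
periods = ["Q1", "Q2", "Q3", "Q4"]
revenue = [120, 135, 150, 170]
headcount = [8, 9, 11, 12]
# plot_split_panels(periods, revenue, headcount)  # writes split_panels.png
```

  Neither panel needs a right-hand axis, so the reader never has to work out which series belongs to which scale.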

  A third potential option not shown here is to link the axis to the data to be read against it through the use of color. For example, in the original graph depicted in Figure 2.26, I could write the left y-axis title “Revenue” in blue and keep the revenue bars blue while at the same time writing the right y-axis title “# of Sales Employees” in orange and making the line graph orange to tie these together visually. I don’t recommend this approach because color can typically be used more strategically. We’ll spend a lot more time discussing color in Chapter 4.

  It is also worth noting that when you display two datasets against the same axis, it can imply a relationship that may or may not exist. This is something to be aware of when determining whether this is an appropriate approach in the first place.

  When you’re facing a secondary y-axis challenge and considering which alternative shown in Figure 2.27 will better meet your needs, think about the level of specificity you need. Alternative 1, where each data point is labeled explicitly, puts more attention on the specific numbers. Alternative 2, where the axes are shown at the left, puts more focus on the overarching trends. In general, avoid a secondary y-axis and instead employ one of these alternate approaches.

  In closing

  In this chapter, we’ve explored the types of visual displays I find myself using most. There will be use cases for other types of visuals, but what we’ve covered here should meet the majority of everyday needs.

  In many cases, there isn’t a single correct visual display; rather, often there are different types of visuals that could meet a given need. Drawing from the previous chapter on context, most important is to have that need clearly articulated: What do you need your audience to know? Then choose a visual display that will enable you to make this clear.

  If you’re wondering “What is the right graph for my situation?”, the answer is always the same: whatever will be easiest for your audience to read. There is an easy way to test this, which is to create your visual and show it to a friend or colleague. Have them articulate the following as they process the information: where they focus, what they see, what observations they make, what questions they have. This will help you assess whether your visual is hitting the mark, or in the case where it isn’t, help you know where to concentrate your changes.

  You now know the second lesson of storytelling with data: how to choose an appropriate visual display.

  chapter 3

  clutter is your enemy!

  Picture a blank page or a blank screen: every single element you add to that page or screen takes up cognitive load on the part of your audience—in other words, takes them brain power to process. Therefore, we want to take a discerning look at the visual elements that we allow into our communications. In general, identify anything that isn’t adding informative value—or isn’t adding enough informative value to make up for its presence—and remove those things. Identifying and eliminating such clutter is the focus of this chapter.

  Cognitive load

  You have felt the burden of cognitive load before. Perhaps you were sitting in a conference room as the person leading the meeting was flipping through their projected slides and they paused on one that looked overwhelmingly busy and complicated. Yikes, did you say “ugh” out loud, or was that just in your head? Or maybe you were reading through a report or the newspaper, and a graph caught your eye just long enough for you to think, “this looks interesting but I have no idea what I’m meant to get out of it”—and rather than spend more time to decipher it, you turned the page.

  In both of these instances, what you’ve experienced is excessive or extraneous cognitive load.

  We experience cognitive load anytime we take in information. Cognitive load can be thought of as the mental effort that’s required to learn new information. When we ask a computer to do work, we are relying on the computer’s processing power. When we ask our audience to do work, we are leveraging their mental processing power. This is cognitive load. Humans’ brains have a finite amount of this mental processing power. As designers of information, we want to be smart about how we use our audience’s brain power. The preceding examples point to extraneous cognitive load: processing that takes up mental resources but doesn’t help the audience understand the information. This is something we want to avoid.

  The data-ink or signal-to-noise ratio

  A number of concepts have been introduced over time to explain, and offer guidance for reducing, the cognitive load we push to our audience through our visual communications. In his book The Visual Display of Quantitative Information, Edward Tufte refers to maximizing the data-ink ratio, saying “the larger the share of a graphic’s ink devoted to data, the better (other relevant matters being equal).” This can also be referred to as maximizing the signal-to-noise ratio (see Nancy Duarte’s book Resonate), where the signal is the information we want to communicate and the noise is everything that either doesn’t add to, or in some cases detracts from, the message we are trying to impart to our audience.
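  Tufte defines the data-ink ratio as the data-ink divided by the total ink used to print the graphic. The sketch below works through that arithmetic. Measuring “ink” in pixel counts is an assumption made for the example, and the numbers are invented.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that is devoted to data (between 0 and 1)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same chart before and after decluttering.
before = data_ink_ratio(data_ink=1_200, total_ink=4_000)
after = data_ink_ratio(data_ink=1_200, total_ink=1_600)

print(f"before: {before:.2f}, after: {after:.2f}")
```

  The data-ink is the same in both cases. Removing non-data elements such as heavy borders or gridlines shrinks the denominator, which is what raises the ratio.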

  What matters most when it comes to our visual communications is the perceived cognitive load on the part of our audience: how hard they believe they are going to have to work to get the information out of your communication. This is a decision they likely reach without giving it much (if any) conscious thought, and yet it can make the difference between getting your message across or not.


  In general, think about minimizing the perceived cognitive load (to the extent that is reasonable and still allows you to get the information across) for your audience.

  Clutter

  One culprit that can contribute to excessive or extraneous cognitive load is something I refer to simply as clutter. These are visual elements that take up space but don’t increase understanding. We’ll take a more specific look at exactly what elements can be considered clutter soon, but in the meantime I want to talk generally about why clutter is a bad thing.

  There is a simple reason we should aim to reduce clutter: because it makes our visuals appear more complicated than necessary.

  Perhaps without our explicitly recognizing it, the presence of clutter in our visual communications can make for a less-than-ideal, or worse, uncomfortable user experience for our audience. This is the “ugh” moment I referred to at the beginning of this chapter. Clutter can make something feel more complicated than it actually is. When our visuals feel complicated, we run the risk of our audience deciding they don’t want to take the time to understand what we’re showing, at which point we’ve lost our ability to communicate with them. This is not a good thing.

 
