Make your charts more like maps

Pecha Kucha is Japanese for “chit-chat.” I’ve talked about the presentation style on my personal blog.

In a Pecha Kucha, the speaker is restricted to just 20 slides, and only 20 seconds per slide. It’s a 6 minute, 40 second burst of information. As for the one embedded below, here’s some context: Think of the various maps in your life. Mine come to mind easily. I see one every time I go home.

I have a map of the world on the wall of my apartment, with pins showing places I’ve visited and others for where I’d like to visit next, once this Covid nightmare ends. As you can imagine, it drives decisions and actions.

Another map is one we all have. It’s the small map in our phones, powered by GPS. That map drives decisions and actions every day.

But if you’re responsible for helping your bosses or clients understand data, what is it about your data dashboards or charts that leads them to take action? In my experience, truly actionable charts are few and far between!

All too often our charts do not drive action

Why is that? In this Pecha Kucha, I explore that question, and show you how you can drive more consensus and action with your data visualizations by choosing and customizing charts to be more map-like. I pull two examples from my decades of experience as a marketing analytics consultant.

The data is all made up, but reflects real life business challenges, and how making your charts more like maps can help to bring your bosses or clients to decisions. And isn’t that what data-driven decision making is all about?

Enjoy.

How to lie with data visualizations, Part 2

This is a follow-up to my post from last month. In that Part 1, I wrote about how shifting the ranges for heat maps and starting bar charts at numbers greater than zero can deceive. In this installment, I want to introduce you to the greatest source of lies in data visualization: Cognitive biases of their authors.

I started my earlier post with an example torn from the headlines. I will again below.

If you follow the latest news in the U.S. about Covid, you may have seen this graphic shared over the weekend, included as evidence for emergency approval of convalescent plasma as a treatment for the infection. The evidence was provided by FDA Commissioner Stephen Hahn. His chief graphic is shown in the above image.

The graphic appears to show a 37% reduction in deaths for Covid patients receiving infused plasma from donors who have survived the infection. Famously, Tom Hanks and his wife donated their plasma to help the research being conducted world-wide.

Rounding down to 35% in a press conference, Hahn said, “A 35 percent improvement in survival is a pretty substantial clinical benefit.”

It is, but that isn’t what the data shows.

After an outcry from what seemed like just about everyone in Covid research whose name was not Stephen Hahn, he apologized. The drop in mortality was from roughly 11% to 7%, which is an absolute drop of somewhere around 4 percentage points, not 37%. The larger figure is only the relative reduction. It was also from research on a small sample size, weirdly without a control group, among many other concerning factors.
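To make the arithmetic concrete, here is a minimal Python sketch using the rounded mortality rates quoted above (roughly 11% and 7%). The function and variable names are mine, chosen for illustration only, and the inputs are the rounded figures rather than the study's exact numbers.

```python
def risk_reduction(baseline_rate, treated_rate):
    """Return (absolute, relative) reduction between two mortality rates.

    Rates are fractions, e.g. 0.11 for 11%.
    """
    absolute = baseline_rate - treated_rate   # difference in percentage points
    relative = absolute / baseline_rate       # share of the baseline risk removed
    return absolute, relative


# Rounded rates from the text: roughly 11% falling to roughly 7%.
absolute, relative = risk_reduction(0.11, 0.07)
print(f"Absolute reduction: {absolute * 100:.1f} percentage points")
print(f"Relative reduction: {relative * 100:.0f}%")
```

With these rounded inputs the absolute reduction comes out at about 4 percentage points, while the relative reduction lands in the mid-30s. Same data, two very different-sounding headlines.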

In an administration that is arguably more politicized than any in recent history, the impulse is to say this U.S. government official lied. Strictly speaking, he most certainly did — in both his words and his supporting data visualization. But the motivation could be less nefarious — or at least, more common.

Cognitive Biases Can Make Us Lie to Ourselves

Our brains are not the perfect instruments we delude ourselves into believing. Consider the Free Brian Williams episode of Malcolm Gladwell’s podcast Revisionist History about the fallibility of human memory. Or more relevant to this instance of (self-)deception, consider this roughly 10-minute excerpt of a talk I co-presented in 2018 at Adobe Summit in Las Vegas. In that talk I explain why clear, data-focused hypotheses are an important protection a data scientist has against cherry-picking or distorting data to pronounce a test a winner.

In that talk I explained that we as marketers exploit consumer cognitive biases every day, including through the use of A/B testing tools like Adobe Target. That’s a good thing, from a marketing perspective. I also maintain that there is strong evidence all of us, regardless of our training and discipline, probably have biases that literally pre-date us as humans. These biases can cause us to lie to ourselves, and through those self-deceptions, lie to our clients.

Everyone wants to report successes. But science is hard, and replication of test results is a persistent problem. In addition to clear hypotheses, we can also conduct an exercise called preregistration to save us from ourselves.

The Killer of Truth Is Calling From INSIDE THE HOUSE!

In my first installment I talked about how easy it is to distort numbers by using bad or lazy visualizations. In this second and last installment, I want to remind you that a far more pernicious murderer of the truth is your own best intentions. I’m willing to give Mr. Hahn the benefit of the doubt that he wasn’t so much lying as practicing wishful thinking, which blinded him to many clear contraindicators of his conclusion — including the fact that randomized trials of convalescent plasma at scale are dead simple to conduct. They’re indeed being done elsewhere in the U.S., such as at UCLA — where Tom Hanks donated his plasma — and in other countries around the world. Yet no country has rung the bell on such a decisive victory over this horrible virus.

Let his humiliation be a lesson to us all.

The Domestication of Artificial Intelligence

A version of this post originally appeared in my personal blog.


I’ve thought and read a lot about artificial intelligence (AI). Particularly, its potential threat to us, its human creators. I’m not much for doomsday theories, but I admit I was inclined to fear the worst. To put things at their most melodramatic, I worried we might be unwittingly creating our own eventual slave masters. But after further reading and thinking, I’ve reconsidered. Yes. A.I. will be everywhere in our future. But not as sinister job-killers and overlords. No, they will be extensions of us in a way I can only compare with that most beloved of domesticated creatures: The dog. For you to follow my logic, you’ll need to remember two facts:

  1. Our advancement as a species from hunter-gatherers to complex civilizations would not be possible without domesticated plants and animals
  2. Our collective fear of technology is often wildly unfounded

Bear with me, but you’ll also probably need to recall these definitions:

  • Domestication: Taking existing plants or animals and breeding them to serve us. Two examples are the selection of the most helpful plants and turning them into crops. Michael Pollan’s early book, The Botany of Desire: A Plant’s-Eye View of the World, will bring you a long way to seeing this process in action. As for animals, you may think of dogs as being mere pets, but early in our evolution as humans we bred the wolf to help us hunt for meat, and to protect us from predators. Before domestication, we pre-humans hunted in packs, and so did the wolves … never the twain shall meet. After this domestication, we ensured the more docile canines a better life, under the protection of our species and its burgeoning technologies (see definition below), and they delivered the goods for us by helping us thrive in hostile conditions. It was a symbiosis that turned our two packs into a single unit. No wonder the domesticated dog adores us so, and that we consider them man(kind)’s best friend.
  • Technology: Did you know the pencil was once considered technology? So was the alphabet. You may think of them merely as tools, but technology is any tool that is new. And our attitudes toward anything new always start with fear. Douglas Adams put it this way: “I’ve come up with a set of rules that describe our reactions to technologies: 1.) Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works. 2.) Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it. 3.) Anything invented after you’re thirty-five is against the natural order of things.” Fear of technology not surprisingly spawned the first science fiction: Mary Shelley’s Frankenstein; or, The Modern Prometheus, a literal fever dream about a scientist’s hubris and the destruction it wrought upon himself and the world. This fear has a name: Moral panic. And it has created some pretty far-fetched urban myths.

In a Wall Street Journal piece, Women And Children First: Technology And Moral Panic, Genevieve Bell listed a few of these vintage myths. The first is about the advent of the electric light: “If you electrify homes you will make women and children … vulnerable. Predators will be able to tell if they are home because the light will be on, and you will be able to see them. So electricity is going to make women vulnerable … and children will be visible too and it will be predators, who seem to be lurking everywhere, who will attack.” And consider this even bigger hoot: “There was some wonderful stuff about [railway trains] too in the U.S., that women’s bodies were not designed to go at 50 miles an hour. Our uteruses would fly out of our bodies as they were accelerated to that speed.” Sounds messy. I don’t have to tell you about our modern moral panic surrounding A.I. Except there is a bit of reverse sexism going on, because this time it is male workers who are more the victims. Their work — whether purely intellectual or journeyman labor — will be eliminated. We’ll all be out on the street, presumably to be mowed down by self-driving cars and trucks.

The Chicken Littles had me for a while

So what changed? In the same week I read two thought-provoking articles. One was in The New Yorker, The Mind-Expanding Ideas of Andy Clark. Its subtitle says it all: The tools we use to help us think — from language to smartphones — may be part of thought itself. This long piece describes Clark’s attempt to better understand what consciousness is, and what are its boundaries. In other words, where do we as thinking humans end and the world we perceive begin? He comes to recognize that there is a reason we perceive the world based on our five senses. Our brains are built to keep us alive and able to reproduce. Nothing more. All the bonus tracks in our brain’s Greatest Hits playlist … Making art, considering the cosmos, perceiving a future and a past … these are all artifacts of a consciousness that moves our limbs through space.

To some people, perception — the transmitting of all the sensory noise from the world — seemed the natural boundary between world and mind. Clark had already questioned this boundary with his theory of the extended mind. Then, in the early aughts, he heard about a theory of perception that seemed to him to describe how the mind, even as conventionally understood, did not stay passively distant from the world but reached out into it. It was called predictive processing.

Predictive processing starts with our bodies. For instance, we don’t move our arm when it’s at rest. We imagine it moving — predict its movement — and when our arm gets the memo it responds. Or not. If we are paralyzed, or that arm is currently in the jaws of a bear, it sends the bad news back to our brains. And so it goes. In a similar way we project this feedback loop out into the world. But we are limited by our own sense of it. Domestication of canines was such a game-changer because we suddenly had assistants with different senses and perceptions. Together humans and dogs became a Dynamic Duo … A prehistoric Batman and Robin. But Robin always knew who was the alpha in this relationship. Right now there is another domestication taking place. It’s not of a plant or an animal, but of a complicated digital application. If that seems a stretch … If grouping these three together — plants, animals and applications — strikes you as odd, keep in mind that domesticating all of them means altering digital information.

All Life Is Digital

Plants and animals have DNA, or deoxyribonucleic acid. They are alive because they have genetic material. And guess what? It’s all digital. DNA encoding uses 4 bases: G, C, T, and A. These are four concrete values that are expressed in the complex combinations that make us both living, and able to pass along our “usness” to new generations. We’re definitely more complicated than the “currently” binary underpinnings of A.I. But as we’ve seen, A.I. is really showing us humans up in some important ways. They’re killing us humans at chess. And Jeopardy. So: Will A.I. become conscious and take us over? Clark would say consciousness is beyond A.I.’s reach, because as impressive as its abilities to move through the world and perceive it are, even dogs have more of an advantage in the consciousness department. He would be backed up by no less than Nobel Prize in Economics winner Daniel Kahneman, of Thinking, Fast and Slow fame. I got to hear him speak on this subject live, at a New Yorker TechFest, and I was impressed and relieved by how sanguine he was about the future of A.I. Here’s where I need to bring in the other article, a much briefer one, from The Economist. Robots Can Assemble IKEA Furniture sounds pretty ominous. It’s a modern trope that assembling IKEA furniture is an unmanning intellectual test. But the article spoke more about A.I.’s limitations than its looming existential threats. First, it took the robots a comparatively long time to achieve the task at hand. In the companion piece to that article we read that …

Machines excel at the sorts of abstract, cognitive tasks that, to people, signify intelligence—complex board games, say, or differential calculus. But they struggle with physical jobs, such as navigating a cluttered room, which are so simple that they hardly seem to count as intelligence at all. The IKEAbots are a case in point. It took a pair of them, pre-programmed by humans, more than 20 minutes to assemble a chair that a person could knock together in a fraction of the time.

Their struggles brought me back to how our consciousness gradually materialized to our prehistoric ancestors. It arrived not in spite of our sensory experience of the world, but specifically because of it. If you doubt that just consider my natural and clear way just now of describing the arrival of consciousness: I said it materialized. You understood this as a metaphor associated with our perception of the material world. This word and others to describe concepts play on our ability to feel things. Need another example? This is called a goddamn web page. What’s a page? What’s a web? They’re both things we can touch and experience with our carefully evolved senses. And without these metaphors these paragraphs would not make sense. Yes, our ancestors needed the necessary but not sufficient help of things like cooking, which enabled us to take in enough calories to grow and maintain our complex neural network, and the domestication of animals and plants that led us to agriculture and an escape from the limitations of nomadic hunter-gatherer tribes (I strongly recommend Guns, Germs and Steel: The Fates of Human Societies for more on this), but … To gain consciousness, we also needed to feel things. And what do we call people who don’t feel feelings? Robots. “Soulless machines.” Without evolving to feel, should A.I. nonetheless take over the world, it’s unlikely they will be assembling their own IKEA chairs with alacrity. They’ll make us do it for them. Because our predictive processing makes this type of task annoying but manageable. We can even do it faster over time.

It’s All About The Feels

But worry not. Our enslavement won’t happen because — and I’m feeling pretty hubristic myself as I write this — we’re the feelers, the dreamers, the artists. Not A.I. Before we domesticated dogs, we were limited in where in the world we could roam, and the game we could hunt. After dogs, we progressed. We prospered. Dogs didn’t put us out of jobs, if you will, they took the jobs they were better at in our service. Inevitably, we found other ways to use our time, including becoming creatures who are closer to the humans we would recognize on the street today, or staring back in the mirror. We are domesticating A.I. Never forget that. And repeat after me: We have nothing to fear but moral panic itself.

How to lie with data visualizations, Part 1

The last time our world experienced a virus of Covid-19 lethality, our public discourse was not that different than today’s. Of course I’m talking about the 1918 Spanish Flu. Even its name is a lie. It was called that in U.S. newspapers not because Spain was the origin of the virus but because Spain was a neutral party in World War I, which was blazing at the time.

How’s that, you say? While other countries embroiled in the conflict didn’t want to spook their citizens back home by honestly assessing the toll the deadly flu was taking on their troops, Spain was more forthright. More than most countries, Spain even took precautions such as strict quarantining and social distancing.

Following the dictum that no good deed goes unpunished, Spain got to “own” the flu appellation in the U.S. press because of those actions, in a xenophobic slight not unlike the way French fries — as beloved during the time of our multi-country occupation of Iraq as they are today — became known for a time as “freedom fries.” Why? France did not choose to join our little alliance. And we all know how successful that nation-building adventure was.

It makes you wonder when we will start paying attention to what our European neighbors are thinking. They occasionally may be onto something.

Hey, we’ve got this!

So what has this got to do with data visualizations? Just that powerful systems often try to deceive as a way to hold onto power. This can involve the systems disclosing data they possess to their constituents in a way that assures the inattentive Hey, we’ve got this! I’m going to show you an example from my recent work, but first, let me show you the data visualization pants-on-fire deception that got me typing in my basement instead of enjoying this sunny Saturday afternoon …

This is a set of charts — one from July 2, and the other from today, July 18. In those two weeks, can you spot the 50% increase in cases per thousand? (I verified today’s number on the Georgia Department of Health website just now, and confirmed the July 2 screencap is legit as well.)

Made more infuriating when you realize human lives are at stake, the graphic is right below this statement: “The charts below presents [sic] the number of newly confirmed COVID-19 cases over time. This chart is meant to aid understanding whether the outbreak is growing, leveling off, or declining and can help to guide the COVID-19 response.”

Okay, quiz time: Can you spot the increase in this before-and-after map?

Map before

If you haven’t caught it yet, I’ll give you the explanation that @andishehnouraee provided: “[Georgia Governor Brian] Kemp’s health department keeps changing the numbers on the map’s color legend to keep counties from getting darker blue or red. 2,961 cases was Red on July 2. Now a county needs 3,769 cases to show red. The result: an infographic that hides data instead of showing it.”

I find this indefensible. I will say other graphics on the same page definitely show a spike. But if I’m a Georgian and I look for my county on the only map on the page, how am I to know that my community has half again more confirmed cases than it did two weeks ago? This data visualization “shell game” may persuade inattentive citizens of a given county to not social distance, or wear a mask in public, and thereby cause further infection in an actual, honest-to-goodness public health crisis.

[Some Redditors have said these maps were designed to show county-to-county differences, but I’m not buying it. When you choose a color scale, either keep the numbers the same for the colors or don’t use numbers at all, and show percentages. The typical citizen isn’t going to spend more than a few seconds looking at the map, and will get the wrong story from this one.]
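Here is a rough Python sketch of why a moving legend hides growth. Only the two "red" cutoffs (2,961 on July 2 and 3,769 on July 18) come from the quote above. The lower cutoffs, the color names, and the 3,500-case county are made up for illustration, and this is not how the Georgia map is actually built.

```python
from bisect import bisect_right

# Five color buckets, lightest to darkest (names are illustrative).
COLORS = ["lightest", "light", "medium", "dark", "red"]


def color_for(cases, cutoffs):
    """Return the bucket for a case count, given four ascending lower cutoffs."""
    return COLORS[bisect_right(cutoffs, cases)]


# Only the last cutoff in each legend is from the post; the rest are hypothetical.
legend_july_2 = [100, 1000, 2000, 2961]
legend_july_18 = [100, 1000, 2000, 3769]

county_cases_july_18 = 3500  # hypothetical county

# Honest approach: keep the July 2 legend fixed, so growth changes the color.
print("fixed legend:  ", color_for(county_cases_july_18, legend_july_2))
# Shifting approach: re-cut the legend, so the same county looks calmer.
print("shifted legend:", color_for(county_cases_july_18, legend_july_18))
```

With the legend held fixed, the hypothetical county turns red. With the re-cut legend, it stays one shade below red even though its case count is well past the old threshold.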

How to deceive using shifting scales, with Excel as your accomplice!

The above is an example of: If you don’t like the data, change the base numbers. Another, more common way is to not clearly show your bar charts or line charts starting from zero where the axes meet. Let’s say I am trying to improve my abysmal running pace, and share my progress with the world (These are real numbers, but give me a break … I’m an old nerd, not an elite runner!):

Look at this glorious chart! I can hear you exclaiming: What progress you’ve made, Jeff! But is it really that impressive? Of course not! When I popped these numbers into Excel and hit Insert Bar Chart, Excel did some editorializing. It started my Y-axis at 9.8 minutes per mile. And in doing so, it made it look to the inattentive as though I’ve halved my time since March!

Let me repeat: Excel did this handicapping of my pathetic times automatically.

To get a real world view of my running — a world which includes rare athletes who have completed marathons at far less than half my personal best — here is the same graphic when the bars start at zero minutes per mile:

Not nearly the ego-boost, but it’s honest!
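If you want to put a number on how much a truncated axis exaggerates, a small Python sketch like this one works. The 9.8 axis start is the value Excel chose in my chart. The two pace values are hypothetical stand-ins, not my actual times, and the function name is just for illustration.

```python
def apparent_ratio(first, last, axis_start=0.0):
    """Ratio of the drawn bar heights (last / first) when the axis begins at axis_start."""
    return (last - axis_start) / (first - axis_start)


# Hypothetical paces in minutes per mile (illustrative only).
march_pace, latest_pace = 10.6, 10.2

truncated = apparent_ratio(march_pace, latest_pace, axis_start=9.8)
honest = apparent_ratio(march_pace, latest_pace)  # bars start at zero

print(f"Axis starting at 9.8: latest bar is {truncated:.0%} the height of March's")
print(f"Axis starting at 0:   latest bar is {honest:.0%} the height of March's")
```

With these made-up paces, the truncated chart draws the latest bar at about half the height of the first one, while the zero-based chart shows it at roughly 96%. The data has not changed, only the baseline.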

Scale breaks to the rescue

What if you just don’t have the room for all of that athletic plodding? In other words, what if my sad pace just wouldn’t fit on the slide, the bars being too tall? Yes, that’s a real thing, as I’ve pointed out in data visualization lectures:

Sometimes you want to take your audience all the way to the treetops, where the trunks are invisible but you can see which are the tallest of the majestic redwoods.

There’s this thing called a scale break (shown below with a made-up data set):

Now you can see a chart that tells an honest story without messing up the scale of your slide. And you can focus your audience’s attention on the data that matters.

Watch this blog for a follow-up, with more tips on how to lie with data visualizations!

Build a Google Analytics campaign spreadsheet that also crafts the links!

Dorcas Alexander wrote on the Luna Metrics blog recently about an important and often-overlooked topic: Organizing the campaign information you can gather in Google Analytics. I’m following up here with a way to document your campaigns. This method also solves the problem of constructing the special URLs used to create those campaigns in the first place.

If that seems a little opaque to you, read on. I suggest you start with this excerpt of Dorcas’ post:

It’s so easy to tag your campaigns for Google Analytics that you can quickly fill your reports with a mishmash of labels and end up with campaign tag soup! But what’s the best way to get organized? Even if you know what medium and source mean, it’s not always obvious how you should fit campaign info into those slots. And what about the extra slots we get for campaign tags like campaign and content and term?

The post goes on to list four simple steps for preventing confusion. The fourth discusses documenting your work, and recommends doing so in a Google Docs spreadsheet that can be shared among all content or analytics team members. It adds, “Another good thing about using a spreadsheet is that a formula can pull all your labels together into a campaign-tagged URL.”

That’s a great idea, but how exactly can this be done?

Here’s my how-to, an addendum to that Luna Metrics post.

Above is the Google Spreadsheet I created for a former client (I needed to stop working with them when I joined Accenture). I’ve replaced the live information they were using with some of my own, to protect confidentiality. I’ll assume you already know how to set up a free Google Docs account, which includes the use of their cloud-based Excel competitor, named Spreadsheet.

  1. Create six columns: Output URL, Target URL, Formula, Campaign, Source and Medium. But wait!, you say. Where is that third column? It’s the Formula column, and is hidden here. I hid it because, a.) It looks identical to Output URL when you have live data in there, so it was redundant, and b.) I prefer to keep it hidden because each cell of that column contains the same formula — one that you definitely don’t want to accidentally change or delete. If I were setting up the system in Excel, I’d make those cells protected.
  2. Before “hiding” column C, place this formula in it: =((((((((B2&IF(ISERROR(FIND(CHAR(63),B2,1)),"?","&"))&"utm_campaign=")&D2)&"&utm_source=")&E2)&"&utm_medium=")&F2)) This formula checks whether the target URL (in cell B2) already contains a question mark. If it finds one, it adds an ampersand instead of a second question mark. If it finds no question mark, it adds one. After that it builds a trailing URL string that will be familiar to those who roll their own URLs, or use Google’s URL Builder. Once you’re done you’re safe to highlight the column and hide it.
  3. In the Output URL column, place a far smaller formula: =C2 Yes, that’s all. Just display the contents of the hidden cell C2 in the visible cell A2.
  4. Populate the Target URL cell in that row with the web address of the landing page you want to tag with campaign information.
  5. Finally, fill in the Campaign field, along with the Source and Medium fields. These are the unique name of the campaign you wish to credit that visit to, along with the web site or social app it came from (e.g., Twitter, or Jason Falls’ Social Media Explorer blog), and the general medium (e.g., social, or web).

That’s it! In the Output URL column you’ll find the link. Copy it, and paste it wherever you are setting up a hyperlink on another site or digital channel. For example, that top line shows the URL I used when I was Tweeting about my recent blog post extolling the new release of an Excellent Analytics upgrade.

In the columns to the right of those I’ve shown, you can make notes about when it was used, why, and how you promoted the link. All of this can be helpful when you pull the campaign, source and medium statistics for analysis.
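If you would rather generate these links outside a spreadsheet, the same logic can be sketched in a few lines of Python. Treat this as a rough equivalent of the hidden Formula column, not an official Google tool. The function name and example URLs are hypothetical, and the call to quote_plus is an extra step the spreadsheet formula does not perform.

```python
from urllib.parse import quote_plus


def build_campaign_url(target_url, campaign, source, medium):
    """Append utm_campaign, utm_source and utm_medium to target_url."""
    # Same test as FIND(CHAR(63), B2) in the sheet: is there already a "?"
    separator = "&" if "?" in target_url else "?"
    # quote_plus escapes spaces and other characters that are unsafe in a
    # query string. The spreadsheet formula skips this step.
    return (
        f"{target_url}{separator}"
        f"utm_campaign={quote_plus(campaign)}"
        f"&utm_source={quote_plus(source)}"
        f"&utm_medium={quote_plus(medium)}"
    )


# Hypothetical landing pages and labels, for illustration only.
print(build_campaign_url("https://example.com/post", "spring launch", "twitter", "social"))
print(build_campaign_url("https://example.com/post?id=7", "spring launch", "twitter", "social"))
```

The first call gets a question mark before the campaign tags, and the second, which already has a query string, gets an ampersand. That mirrors the two branches of the IF in the formula.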

I hope this helps. Let me know what improvements you might have experienced in how to catalog your campaign information.