Statistics as Fake-News

Variations on this graphic are widely shared in the wake of terror events, as a way of reassuring people that we have more important risks to worry about than terrorism. Public reassurance is genuinely valuable, but the relative risk implied is massively wrong when it comes to planning appropriate actions: allocating resources and time to addressing the risk.

Here is the Politico version.

So what’s wrong?

Sure, all the circles represent the size of a risk. But they are quite different risks. The probability distributions of the “event” inside each coloured circle are all different. To the untrained eye they may look like “normal” (Gaussian) distributions, but in important ways they are almost certainly not. Similarly, the distribution of the negative consequences of the event, to individuals and/or to populations, may look typically random and be presumed normal, but again in very important respects it is not.

We are fooled by randomness, by the normal look of a seemingly random distribution, and exploiting that look in a graphic of comparative size compounds the error. It simply reifies and reinforces the statistical error in easy-to-share (but wrong) memes. They are comparing pommes with pomegranates.

There are many classes of statistical risk distribution. (See Taleb’s Real World Risk Institute for more. Fooled by Randomness is one of Taleb’s book titles, and I’ve written previously about Taleb.)

The rallying cry is “fat tails”. At the thin end of low-probability event distributions, the Gaussian tails, the actual shapes vary massively and, with low-probability events, this is the part of the distribution that matters most. Think about that. The 99.99% that looks normal, maybe a skewed normal, tells us nothing about the risk of the low-probability event. NOTHING we didn’t already know: that the vast majority of us are unlikely to die in a terror event. Who knew? But of course we are massively affected by other consequences of such events, as well as by the effects and effectiveness of actions to counter them. This statistic tells us nothing.
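To make the tail point concrete, here is a minimal sketch of my own (not taken from Taleb or from the graphic). It compares the chance of exceeding a threshold k under a standard normal distribution with the same chance under Student's t with 3 degrees of freedom, which I have picked purely as a convenient stand-in for a fat-tailed distribution. The function names are hypothetical and neither distribution is fitted to any real terrorism or accident data.

```python
import math


def normal_tail(k):
    """P(X > k) for a standard normal (thin-tailed) variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def student_t3_tail(k):
    """P(X > k) for Student's t with 3 degrees of freedom.

    Assumption: t(3) is only an illustrative fat-tailed distribution.
    This uses the closed-form CDF that exists for the 3-degree case.
    """
    x = k / math.sqrt(3)
    return 0.5 - (x / (1 + x * x) + math.atan(x)) / math.pi


if __name__ == "__main__":
    # Thresholds are in units of each distribution's own scale parameter.
    for k in (1, 2, 4, 6):
        thin = normal_tail(k)
        fat = student_t3_tail(k)
        print(f"k={k}: normal={thin:.3e}  t3={fat:.3e}  ratio={fat / thin:.1f}")
```

Near the middle the two give tail probabilities of the same order, so a summary built from the bulk of the data cannot tell them apart. Far out, the fat-tailed probability is larger by orders of magnitude, and that is exactly the region where rare catastrophic events live.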

The technical detail (the comparative differences between the different classes of distribution) lies in differences of effect on the individual and the population, different populations of different participants, and imbalances between high downside risk and upside cost-benefits of avoidance. Fat tails and/or heavy tails simply hide the true cost-benefit risks unless we address these. And to understand these you need to be an expert, not simply someone who can recognise the reassuringly misleading size of a coloured dot. If you need to share information in simple graphics, ensure the classes of thing being compared are visually apparent.

It’s public misinformation. Rhetorical bullshit designed to attract funding to somebody’s pet project. It’s memetic. Catchy colourful graphics designed as click-bait to sell media links irrespective of the quality of the information content or the headline.

Lies, damn lies and statistics. Fake-news I think we call it these days.

(Hat tip to Terry Waites for the Twitter conversation that prompted this post.)

[Jan 2018 addition: Your lawnmower is not trying to kill you. This paper came out of Taleb calling BS on Royal Stats Soc rewarding a paper on terrorism risk assessment, that got attention thanks to the Kardashian social-media connection. It’s a classic long-fat-tail error. Taleb endorses this corrective paper. The point is if you hear it via social media without informed dialogue, even if it appears to come from the highest statistical authority in the land, it’s probably wrong.]
