Why I no longer recommend “How to lie with statistics”

a man with a broom sweeping numbers under a carpet
Irving Geis’ cartoon for the front cover of the book’s 9th edition

How to Lie with Statistics by Darrell Huff is one of the bestselling books about statistics ever published, and one I used to recommend. Its bright, readable style seemed to make it an accessible introduction to statistics, including what I believed to be a healthy scepticism about statistics in general.

Huff sold vast quantities of this book

The book was first published in 1954. I encountered it as a paperback in the 1970s. Somehow I lost my Pelican paperback – I can picture the white cover with a line drawing by Mel Calman. If I lent it to you, you can keep it, because although I enjoyed the Mel Calman illustrations, I decided to treat myself to the hardcover 9th edition that is also copiously illustrated, by Irving Geis. It’s an easy book to find for pennies plus postage as it’s been a bestseller for decades and sold well over a million copies.

Huff was an apologist for the tobacco industry

Sadly, I learned quite recently that Geis’ cover illustration of someone sweeping data under a carpet was all too apt.

Huff, who died in 2001, was an apologist for the tobacco industry. In the 1950s and 1960s he was paid to ridicule the idea of cigarette-linked disease before Congress. He went on to write How to Lie with Smoking Statistics, intended to be a sequel to his bestseller – but it was never published. Alex Reinhart, Professor of Statistics and Data Science at Carnegie Mellon University, tells the story of Huff’s smoking book in this open access paper published by the Royal Statistical Society:

Reinhart does a detailed takedown of Huff’s book, including pointing out that amongst many mistakes, Huff falls into the statistical trap of confusing the probability of obtaining the data given the hypothesis with the probability of the hypothesis:

… the most fundamental error in statistics: the fallacy of the transposed conditional. The probability of 1/64 is the probability of obtaining this result while assuming the treatment has no value; it cannot be the probability the treatment has no value. A hypothesis test cannot give the probability that Huff desires – only Bayes’ rule can, with a suitable prior belief. (Reinhart, 2014)
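To see why the distinction matters, here is a minimal sketch in Python of what Bayes’ rule does with that 1/64. Only the 1/64 comes from the quotation. The probability of the result if the treatment does work (1/2) and the three prior beliefs are numbers I have made up purely for illustration, and the function name is my own.

```python
from fractions import Fraction


def posterior_no_value(p_data_if_no_value, p_data_if_works, prior_no_value):
    """P(treatment has no value | data) by Bayes' rule, for two rival hypotheses."""
    prior_works = 1 - prior_no_value
    numerator = p_data_if_no_value * prior_no_value
    evidence = numerator + p_data_if_works * prior_works
    return numerator / evidence


# From the quotation: probability of the result assuming the treatment has no value.
P_DATA_IF_NO_VALUE = Fraction(1, 64)
# Hypothetical: probability of the same result if the treatment does work.
P_DATA_IF_WORKS = Fraction(1, 2)

# Hypothetical prior beliefs that the treatment has no value.
for prior in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
    post = posterior_no_value(P_DATA_IF_NO_VALUE, P_DATA_IF_WORKS, prior)
    print(f"prior {float(prior):.2f} -> posterior {post} (about {float(post):.3f})")
```

With these made-up inputs, an even prior gives a posterior of 1/33, about 3%, while a very sceptical prior of 99/100 gives 99/131, about 76%. Neither is 1/64, which is the point of the quotation: the test result alone cannot tell you how likely it is that the treatment has no value.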

Huff creates doubt about statistics

Given that Huff’s defence of tobacco was never published, why stop reading the original quirky bestseller?

It’s true that it does help to have a healthy scepticism about statistics. When you see a statistic quoted, ask where it came from, what accuracy seems sensible, and what the person publishing it might have to gain from it.

But solid statistics are also immensely useful. That’s the reason why almost every government pays good money to have a national statistical institute. Here in the UK, we have the Office for National Statistics. In the USA, there are several – the best-known being the Bureau of Labor Statistics and the Census Bureau. One of my favourites is the Central Statistical Office of Trinidad and Tobago – I particularly like their page about agricultural production, especially the statistics on pawpaws:

Less fancifully, there are the statistics that dominated our lives from the start of the COVID-19 pandemic. For those of us who have believed the medical statisticians and their work on the effectiveness of vaccination, the undermining of statistics that was started by Huff (and others, to be fair) has become painfully real as anti-vax campaigners borrow from the tactics once used by the tobacco industry. Tim Harford, economist, wrote a longer piece for the Financial Times about Huff and the way his arguments have fuelled misinformation:

Chuck out your Huff

If you feel the need for some instruction in the basic ideas of statistics, such as the difference between a mean, median and mode, you no longer have to give money to Darrell Huff’s inheritors or to used-book dealers. There are plenty of online resources available, such as this free, no-advertising introduction by BBC Bitesize: