How do you feel about statistics?

For a long time, I was a stats refusenik. Years ago at university, I took a class in mathematical statistics, and it never made any sense to me. My sporadic attempts to overcome my fears and learn to understand *p*-values, t-tests, and *χ²*, or chi-squared, had never come to anything.

But recently, I’ve been dealing with my fears and overcoming my resistance to learning statistics. I’m writing a book on surveys, and what’s the point of getting lovely, large-sample data if you can’t run a few statistical tests on it? So I knuckled down, bought a variety of introductory statistics books and tackled them. Plus, I took some college courses for good measure. And I’m pleased to report that all of my effort is sort of working. Although I’m definitely still a statistics newbie, I no longer automatically skip the statistical section in every academic paper I read. So that’s progress.

Which is all fine for me, but what has it got to do with you? Well, maybe I’m not the only one who has felt daunted by the vast choice of books on statistics and made a few false starts at reading them. So I thought it was worth picking out the ones that have helped me the most so far and sharing them with you, in the hope that you’ll also find them worthwhile.

*Disclaimer – I’m not trying to provide a list of statistics books that is either comprehensive or definitive. If you’re a statistics expert or have found a better way into the subject, great! Please add a comment, and I’ll be delighted to follow up on it.*

## The basics: getting familiar with a mean

Most introduction-to-statistics books seem to start with mean, median, and mode: the three most frequently encountered measures of central tendency and the three simplest descriptive statistics. If you are comfortable with all of these terms, skip ahead to the next section.

If you have already started feeling a touch alienated and confused, start with Darrell Huff’s *How to Lie with Statistics* (published in 1954 and widely available, in print or second hand, from about $5).

I chose this book as my Survey Book of the Month for March 2011 because I’ve loved it ever since I bought my first copy as a teenager. It’s a witty guide to the basics such as:

- the difference between a mean and a mode
- the concept that correlation does not equal causation
- how to work out whether a graph is misleading.

Huff’s examples are redolent of the 1950s. If you’d prefer a more modern text or would like to go into similar topics in more depth, *Misused Statistics* by Herbert F. Spirer, Louise Spirer, and A. J. Jaffe, from 1998, covers much of the same ground, but it’s illustrated with many real examples the authors gleaned from serious newspapers and official documents.
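To see why the choice between these measures matters, here is a minimal sketch using Python's built-in `statistics` module. The task times are made up for illustration and do not come from any of the books; the point is only that a single slow participant drags the mean well away from the median and the mode.

```python
import statistics

# Hypothetical task-completion times in seconds for seven participants.
# One participant got badly stuck, which produces the outlier of 95.
task_times = [12, 15, 15, 18, 20, 22, 95]

mean_time = statistics.mean(task_times)      # sum of the values / how many there are
median_time = statistics.median(task_times)  # middle value once the list is sorted
mode_time = statistics.mode(task_times)      # value that occurs most often

print(f"mean   = {mean_time:.1f}")
print(f"median = {median_time}")
print(f"mode   = {mode_time}")
```

With these invented numbers, the mean comes out at about 28.1 seconds, while the median is 18 and the mode is 15, so "the average time" could honestly be reported three different ways.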

## The immediately useful: basic UX statistics

The next step up is to use some statistics, perhaps any statistics, in our user experience work.

Tom Tullis and Bill Albert’s *Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics* has 10 pages that take you through the basics you’ll need in user experience:

- basic descriptive statistics
- comparing means
- correlations
- chi-squared tests

They describe typical UX problems and explain how to use the appropriate Excel 2007 functions, which were current when the book came out in 2008. From there, it’s not too hard to work out what to do in newer versions of Excel.
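If you would rather see the arithmetic than trust a spreadsheet function, the last item on that list, a chi-squared test, can be sketched in a few lines of Python. This is not taken from Tullis and Albert: the counts are hypothetical, the test is a plain chi-squared test of independence on a 2×2 table with no continuity correction, and the p-value shortcut used here only holds for one degree of freedom.

```python
import math

# Hypothetical counts: rows are designs, columns are (completed, failed).
observed = [
    [30, 20],  # design A: 30 of 50 participants completed the task
    [18, 32],  # design B: 18 of 50 participants completed the task
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count we would expect in this cell if design made no difference.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (count - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom, and for 1 degree of freedom the
# chi-squared tail probability reduces to erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_squared / 2))

print(f"chi-squared = {chi_squared:.2f}, p = {p_value:.3f}")
```

For these made-up counts, chi-squared is roughly 5.77 and the p-value is a little under 0.02. For larger tables the shortcut in the last step no longer applies, and a statistics package or Excel's own functions would be the safer route.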

*Bonus – The rest of the book is well worth reading. It has lots of ideas for UX metrics that you can collect and a selection of case studies that show how the ideas work in practice.*

## To pass your statistics exam: many options

If you need to take a statistics class in college, there’s a simple answer to the problem of which book to choose, isn’t there? Yes, it’s whatever your instructor recommends. Or even better, whatever your examiner recommends.

I’ve amassed quite a large pile of getting-started-in-statistics books that are aimed squarely at this market—for example, Dawn Griffiths’s *Head First Statistics*. Depending on your point of view, you might describe it as a lively, fun book that introduces statistics concepts using amusing examples or as relentlessly and unbearably winsome. Either way, the book’s emphasis is squarely on helping you become familiar with statistical formulas in order to pass an examination.

Other options in the same category:

*Statistics for People Who (Think They) Hate Statistics*, by Neil J. Salkind

*Statistics for Dummies*, *Statistics for Dummies II*, and *Statistics Workbook for Dummies*, by Deborah Rumsey

*The Cartoon Guide to Statistics*, by Larry Gonick and Woollcott Smith, originally published in 1993 and reprinted in 2000

From the point of view of a UX professional who just wants to understand enough about statistics to use the formulas and tests on practical problems, I found that they all missed the point for me. There was too much emphasis on the details of the formulas, not enough emphasis on when to pick which test or formula to use, and little or no acknowledgement that I’m going to use a computer for most of them.

Salkind is an exception to that last point. My edition of his book gives instructions for using SPSS throughout – which is fine if you’re a student and your college gives you SPSS access, but not much help for the lone UX professional without SPSS or the $5000 necessary to purchase it. While checking my references for this article, I found out that he’s also written a version of this book that refers to Excel. I wish I’d known that before purchasing his book.

## To get familiar with the tests without formulas

Fortunately, I found one statistics book aimed at college students that firmly relegates all of the formulas to the back, where they’re out of sight: Frances Clegg’s *Simple Statistics: A Course Book for the Social Sciences.*

Clegg covers mean, median, and mode in the usual way, but before getting into normal distributions, she takes you into rank and sign tests. This appealed to me because these tests do not require you to make as many assumptions about the data, and in particular, do not require the data to come from a normal distribution.

Figure 1—The normal distribution reinterpreted as a plushie toy

The *normal distribution* is the bell-curve distribution, which has many convenient mathematical properties and is rather common in nature – for example, heights of students. Unfortunately, a lot of UX data doesn’t happen to follow a normal distribution – for example, think of the long neck, or Zipf, distribution that you get for search log data.

If I lost you at normal distribution, I apologise. Ignore what I’m saying, and think about reading Clegg instead. Like Huff, the book has a lightness of tone that appeals to me.

Because the formulas are at the back, you don’t encounter them until you’ve understood why you might need one and how to use it. And even when you do turn to them, you’ll find that Clegg describes every formula in terms of a series of simple steps before you get to the full-on mathematical symbols. If you don’t want to grapple with Excel’s statistical functions, you could possibly just use Excel as a sort of calculator to help you through the individual steps as she lays them out.
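In the same step-by-step spirit, here is a minimal sketch of a sign test, one of the tests that does not need normally distributed data. It is an illustration rather than Clegg's own presentation: the function name `sign_test` and the task times are hypothetical, and the p-value is the exact two-sided one worked out from the binomial distribution.

```python
from math import comb


def sign_test(before, after):
    """Exact two-sided sign test for paired observations."""
    # Step 1: work out the difference for each pair.
    differences = [b - a for b, a in zip(before, after)]
    # Step 2: discard ties, because they carry no sign.
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    # Step 3: count the positive signs and take the smaller of the two counts.
    positives = sum(1 for d in nonzero if d > 0)
    smaller = min(positives, n - positives)
    # Step 4: if plus and minus were equally likely, how often would the
    # smaller count be this small or smaller? Double it for a two-sided test.
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical task times in seconds for the same ten people on two designs.
old_design = [42, 55, 38, 61, 47, 50, 66, 39, 58, 44]
new_design = [35, 49, 38, 52, 40, 53, 60, 31, 51, 40]

p_value = sign_test(old_design, new_design)
print(f"p = {p_value:.3f}")
```

With these invented times, eight people were faster on the new design, one was slower, and one tied, which gives a p-value of about 0.039. Only the direction of each difference is used, not its size, which is why the test makes so few assumptions about the data.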

## To understand how to use statistical ideas

If you’ve learned about the basic tests and concepts, but still don’t feel confident in using that knowledge, try Andrew Vickers’s *What Is a P-Value Anyway?*

Vickers starts each chapter with a brief story that has a statistical spin, then discusses it in statistical terms. He’s got a witty turn of phrase and a wry sense of humor. I became a fan of this book after reading this in the opening remarks:

“Statistics textbooks can be long, boring, and expensive. With this in mind, I proposed to my editor that I write a book that was short, boring, and expensive. He considered it, but eventually decided I needed to come up with something better.”

I think he did, indeed, come up with something better. The book is definitely short at only just over 200 pages, and each little chapter presents a useful insight into the practical meaning and application of a statistical idea. For example, here is the opening to Chapter 14, “The Probability of a Dry Toothbrush: What Is a P-Value Anyway?”:

“I have a party trick: when I tell someone what I do, and they say, ‘statistician, eh? I took statistics in college,’ I ask them to define the p-value. (I know what you’re thinking—not much of a trick. I’m working on some other stuff.) The point is, I have yet to meet anyone who has got anywhere close to the right answer. This is pretty odd because the p-value is such a key idea in statistics. Imagine if a literature graduate didn’t know whether Shakespeare wrote plays or novels, or someone who’d taken an economics course couldn’t describe the relationship between supply and demand. So, if you do nothing else, please try to remember the following sentence: ‘The p-value is the probability that the data would be at least as extreme as those observed, if the null hypothesis were true.’ Though I’d prefer that you also understood it – about which, teeth brushing.”

Then he goes on to explain the p-value through stories of putting his son to bed and an experiment about scaring teenagers out of the criminal justice system.

Make sure you tackle the questions he poses in the discussion section at the end of each chapter and read the answers. I found the few moments this took to be well worthwhile, because the answers frequently clarified and helped me to grasp the points he’d made in a chapter.

I toyed with the idea of putting this book ahead of Clegg and the other textbooks, but I think it assumes that you have had some level of exposure to t-tests, p-values, regression, and the other basic statistical ideas.
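The sentence Vickers asks you to remember can be made concrete with a small simulation. The sketch below is not from his book. It assumes a hypothetical preference test in which 16 of 20 participants chose design A, takes "no real preference" as the null hypothesis, and simply counts how often a world with no preference produces a split at least as lopsided as the one observed. The function name, the number of runs, and the seed are all arbitrary choices.

```python
import random


def simulated_p_value(observed_count, participants=20, runs=50_000, seed=1):
    """Estimate a two-sided p-value by simulating the null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    expected = participants / 2
    observed_gap = abs(observed_count - expected)
    at_least_as_extreme = 0
    for _ in range(runs):
        # One imaginary study in which each participant picks A or B at random.
        chose_a = sum(rng.random() < 0.5 for _ in range(participants))
        if abs(chose_a - expected) >= observed_gap:
            at_least_as_extreme += 1
    return at_least_as_extreme / runs


p_value = simulated_p_value(16)
print(f"estimated p = {p_value:.4f}")
```

The estimate should land close to the exact binomial answer of about 0.012. Read it the way Vickers defines it: if there were truly no preference, only around 1 study in 100 would show a split this extreme in either direction.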

## To get more ideas about how to apply statistics

Although I find Clegg’s book is the most approachable of the getting-started texts, and Vickers was definitely the one that started to give me some belief that I might really understand all of this some day, I want to suggest another book to complement them: John L. Phillips’s *How to Think About Statistics*, 6th Edition, published in 1999.

Clegg uses general examples; Vickers, mainly medical examples, with some excursions into other areas. The aspect of Phillips’s book that makes it well worth recommending is that, at the end of most chapters, he’s got “Sample Applications” from fields that overlap better with user experience: education, political science, psychology, social work, and sociology. He presents each sample application as a small problem you can think through. For example:

“Social work: You are the new director of a community fund-raising organization. The member agencies have widely varying needs, but you suspect that recently the board has been shirking its duty to investigate those needs. Specifically, you suspect that recent allocations have not been sufficiently differentiated. How might you document your case?”

The idea is that you think about the problem and how you would tackle it, then compare your answer to the author’s. I found these examples rather helpful in terms of starting to figure out how some of the various statistical techniques might work for me in user experience. In my mind, this example had some parallels with the problem of deciding how to fund different features in a long-term development project.

Another bonus for this book: it’s widely available second hand, so that makes it easier to recommend as a top-up.

## To learn about a wider range of statistics

Once you’ve got a firm grip on the normal distribution, regression, confidence intervals, t-tests, and chi-squared, the next step is to learn about how to do some of this in practice and learn about more of your options. For practical applications of statistics, I recommend Sarah Boslaugh and Paul Andrew Watters’s *Statistics in a Nutshell: A Desktop Quick Reference*.

I think the title is a misnomer – especially for the final chapters, which present “Specialized Techniques” that briefly introduce each topic, but often include a disclaimer along these lines: I can’t really tell you all you need to know about this – try reading XX textbook instead. That doesn’t seem like a quick reference to me. It seems like a recipe for quite a lot of extra hard work. Having done this bit of nit-picking, I think the book definitely opens up some topics that are very useful to know about – such as odds ratios and time series. These don’t appear in any of the books I’ve mentioned previously.

From the UX professional’s point of view, the most immediately useful chapter is the one on data preparation: “Data Management for Statistical Analysis.” The authors clearly understand that most of us don’t get our data neatly packaged for us by a kindly professor, but instead have to grapple with turning a lot of disparate stuff into some sort of coherent data set from which we need to draw some insights.

If you happen to be in a position to make choices about the tools you use, you’ll find it useful to read the discussions in the appendix about the various merits of statistical packages such as SPSS and R.

## To challenge the notion of statistical significance

By this point, you should have become thoroughly familiar with the ideas of statistical significance in general and p-values in particular. These underlie pretty much all the statistical tests that you’ll encounter in the introductory texts. The idea is to calculate some sort of test statistic, then use that to discover how likely results like the observed ones would be if only chance were at work. For example, “significant p < 0.05” means that, if the null hypothesis were true, results at least as extreme as those observed would turn up with a probability of less than 5% – in other words, less often than 1 time in 20.

The better books – such as Clegg, Phillips, Vickers, and Boslaugh and Watters – all include some discussion of effect size. The practical point is that a result can achieve statistical significance, but fail to be important – and similarly, a result can be important, but not be statistically significant.

We see this all the time in user experience – particularly in usability testing, where we so frequently watch user after user become bemused by some obvious disaster in our designs. We rush away, ready to make all sorts of changes based on our conviction that we’ve seen a major effect – only to have a tricky discussion about whether our sample was large enough to be statistically significant.
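The gap between significant and important shows up clearly with two invented scenarios. The sketch below uses a standard two-proportion z-test built on the normal approximation, which is an assumption on my part rather than a method taken from any of the books, and which is admittedly rough for a sample as small as eight people per group. The function name and all of the counts are hypothetical.

```python
import math


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pool the two groups, as the null hypothesis says they share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Huge A/B test: conversion moves from 50% to 51%, a tiny effect.
p_large_sample = two_proportion_p_value(50_000, 100_000, 51_000, 100_000)

# Small usability test: completion moves from 3 of 8 to 6 of 8, a big effect.
p_small_sample = two_proportion_p_value(3, 8, 6, 8)

print(f"large sample, 1-point difference:  p = {p_large_sample:.6f}")
print(f"small sample, 37.5-point difference: p = {p_small_sample:.2f}")
```

The one-point difference comes out far below 0.001, while the difference of 37.5 percentage points comes out at roughly 0.13. The first result is statistically significant and possibly trivial; the second is not significant and possibly very important, which is exactly the tricky discussion described above.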

A book that delves into statistical significance, effect sizes, and the power of a test – that is, its ability to detect an effect – in great detail is Stephen T. Ziliak and Deirdre N. McCloskey’s *The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives.*

It’s fair to say that the reviews of this book on Amazon are mixed, and I’d agree with that. The authors are economists, and some of the writing is hard going. One of the reviews recommends that people read Siu L. Chow’s *Statistical Significance: Rationale, Validity, and Utility* instead – and I might be happy to recommend that book as well, but it was published in 1997, is out of print, and not that easy to find, so I haven’t read it myself. If you happen to know of a book that tackles this topic, is easier to find than Chow’s book, and is easier to read than Ziliak and McCloskey’s, please let me know.

## To learn how to critique a statistical argument

If you’re not completely exhausted by all of this, and you’re still feeling ready for a final challenge, read Robert P. Abelson’s *Statistics As Principled Argument*.

Abelson discusses statistics from the point of view of a professor who is critiquing the statistical arguments that colleagues, students, and other researchers have put forward. I’ll be honest with you: my first attempt at reading this book ended rather quickly because I was too unfamiliar with concepts such as a one-tailed test to grasp his arguments. Recently, a bit of light has begun to dawn for me, and I have enjoyed dipping into it again.

## The gap in the story: got this data, what to do now?

When reading Abelson, one of the problems for me was dealing with passages like the following, in which he talks about the difference between Liberal and Conservative styles of presenting statistical results and presents the investigator’s dilemma:

“Should I pronounce my results significant according to liberal criteria, risking scepticism by critical readers, or should I play it safe with conservative procedures and have nothing much to say?”

The challenge for me here: I don’t yet have *any* results! I still face this problem: *here is some data, what do I do with it? Which of the tests and techniques that I’ve learned about apply in this case?*

What I have learned is that this is the wrong way to go about taking a statistical approach. What you have to do is decide on the appropriate tests that can help you to answer your research question, then go get the data that allows you to run those tests.

## Summary – if you have time for only one book

Depending on your needs, here are my final recommendations:

- To get started with basic statistical ideas such as the mean and median – read Darrell Huff’s *How to Lie with Statistics*, 1954.
- To get a brief, but useful introduction to using statistics in your user experience work – get Tom Tullis and Bill Albert’s *Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics*, 2008.
- To get familiar with the most frequently taught statistical tests and ideas, without having to learn formulas – read Frances Clegg’s *Simple Statistics: A Course Book for the Social Sciences*, 1990.
- To understand how to use statistical ideas – go get Andrew Vickers’s *What Is a P-Value Anyway?*, 2010.
- To get more ideas about how to apply statistics – add in John L. Phillips’s *How to Think About Statistics*, 6th Edition, 1999.
- To learn about a wider range of statistics – choose Sarah Boslaugh and Paul Andrew Watters’s *Statistics in a Nutshell: A Desktop Quick Reference*, 2008.
- To challenge the notion of statistical significance – skim and skip through Stephen T. Ziliak and Deirdre N. McCloskey’s *The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives*, 2008.
- To learn how to critique a statistical argument – tackle Robert P. Abelson’s *Statistics As Principled Argument*, 1995.

And if you have time for only one of these books, I’d opt for Andrew Vickers’s *What Is a P-Value Anyway?* It’s short, engagingly written, and realistic. Plus, it was the one book that gave me the most confidence that I’d actually understood what I was reading.

But irrespective of what I say – if you aim to pass a statistics exam, pick the textbook that your instructor or exam board recommends.

Acknowledgements: Thanks to Nicole and Shannon of Nausicaa Distribution for their permission to use the image of their plushie. Main image of Statistics books at WHO by ‘Swiveler’, Creative Commons.

*This article first appeared in UX Matters, February 6 2012.*