How to get yourself started in statistics

How do you feel about statistics?

For a long time, I was a stats refusenik. When I was doing my first degree back in the 1970s, I took a class in mathematical statistics but it never made any sense to me. My sporadic attempts to overcome my fears and learn to understand p-values, t-tests, or chi-squared never came to anything.

But recently, I’ve been dealing with my fears and overcoming my resistance to learning statistics. I’m writing a book on surveys, and what’s the point of getting lovely, large-sample data if you can’t run a few statistical tests on it? So I knuckled down, bought a variety of introductory statistics books and tackled them. Plus, I took some college courses for good measure. And I’m pleased to report that all of my effort is sort of working. Although I’m definitely still a statistics newbie, I no longer automatically skip the statistical section in every academic paper I read. So that’s progress.

Which is all fine for me, but what has it got to do with you? Well, maybe I’m not the only one who has felt daunted by the vast choice of books on statistics and made a few false starts at reading them. So I thought it was worth picking out the ones that have helped me the most so far and sharing them with you, in the hope that you’ll also find them worthwhile.

Disclaimer – I’m not trying to provide a list of statistics books that is either comprehensive or definitive. If you’re a statistics expert or have found a better way into the subject, great! Please add a comment, and I’ll be delighted to follow up on it.

Note – December 2021: I’ve updated this article to include some other books that I found while I was writing my book Surveys That Work: a practical guide for designing and running better surveys. Also, I’m no longer able to recommend Darrell Huff’s book.

The basics: getting familiar with a mean

Most introduction-to-statistics books seem to start with mean, median, and mode: the three most frequently encountered measures of central tendency and the three simplest descriptive statistics. If you are comfortable with all of these terms, skip ahead to the next section.
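As a quick illustration of how the three measures can disagree, here is a minimal Python sketch using the standard library’s statistics module. The task times are made up for the example and are not real data.

```python
import statistics

# Hypothetical task-completion times in seconds for nine usability-test participants.
task_times = [30, 32, 32, 35, 38, 40, 41, 45, 120]

mean_time = statistics.mean(task_times)      # arithmetic average, pulled up by the 120 s outlier
median_time = statistics.median(task_times)  # middle value once the times are sorted
mode_time = statistics.mode(task_times)      # most frequently occurring value

print(f"mean={mean_time:.1f} median={median_time} mode={mode_time}")
# prints: mean=45.9 median=38 mode=32
```

One slow participant is enough to drag the mean well above the median, which is why it helps to look at more than one measure.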

There are plenty of online resources available, such as this free, no-advertising introduction by BBC Bitesize:

For those of us who prefer an actual book, try The Tiger That Isn’t: Seeing Through a World of Numbers by Michael Blastland and Andrew Dilnot (Profile Books, 2008), written by the people who originally devised the popular BBC programme ‘More or Less’. It has the same style: sharp stories, a lightness of touch, and solid information. It’s also widely available second-hand.

If you’d prefer a more modern text or would like to go into similar topics in more depth, this one covers much of the same ground, but it’s illustrated with many real examples the authors gleaned from serious newspapers and official documents.

The immediately useful: basic UX statistics

The next step up is to use some statistics, perhaps any statistics, in our user experience work. Tom Tullis and Bill Albert’s Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics (Elsevier/Morgan Kaufmann 2008) has 10 pages that take you through the basics you’ll need in user experience:

  • basic descriptive statistics
  • comparing means
  • correlations
  • chi-squared tests

They describe typical UX problems and explain how to use the appropriate Excel functions. Because the book was published in 2008, the functions are those of Excel 2007, but from there it’s not too hard to work out what to do in newer versions of Excel.
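The same kind of chi-squared comparison can also be sketched outside Excel. The Python below is a minimal illustration and not the method from Tullis and Albert’s book: the success counts for two designs are hypothetical, the function name chi_squared_2x2 is my own, and it applies no continuity correction.

```python
from math import erfc, sqrt

def chi_squared_2x2(table):
    """Chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied,
    and the p-value formula below is valid only for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if design and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are designs A and B, columns are (succeeded, failed).
results = [[18, 12], [26, 4]]
statistic, p_value = chi_squared_2x2(results)
print(f"chi-squared={statistic:.2f}, p={p_value:.3f}")
```

With counts this small, a statistics package would normally also offer a corrected or exact version of the test, so treat the sketch as a way to see the arithmetic rather than as a recommended procedure.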

Or if you have a fairly good grasp of statistics and want to use Excel as your statistical tool, then Statistical Analysis: Microsoft Excel 2010 by Conrad Carlberg is one you may want to look at. I especially enjoyed Carlberg’s reminder that ‘Reality is messy’.

Bonus – The rest of Tullis and Albert’s book is well worth reading. It has lots of ideas for UX metrics that you can collect and a selection of case studies that show how the ideas work in practice.

To pass your statistics exam: many options

If you need to take a statistics class in college, there’s a simple answer to the problem of which book to choose, isn’t there? Yes, it’s whatever your instructor recommends. Or even better, whatever your examiner recommends.

I’ve amassed quite a large pile of getting-started-in-statistics books that are aimed squarely at this market. From the point of view of a UX professional who just wants to understand enough about statistics to use the formulas and tests on practical problems, I found that they all missed the point for me. There was too much emphasis on the details of the formulas, not enough emphasis on when to pick which test or formula to use, and little or no acknowledgement that I’m going to use a computer for most of the calculations.

Some of the ones I tried, but didn’t find particularly helpful, include:

If you want one that is mostly about using a statistics package, then Neil Salkind publishes a series of books that are tightly integrated with statistical software. I made the mistake of buying one that uses SPSS throughout, which is fine if you’re a student and your college gives you SPSS access, but not much help for the lone UX professional without SPSS or the $5000 necessary to purchase it. While checking my references for this article, I found out that he’s also written a version of this book that refers to Excel. There’s also one that uses R, the most popular statistics program in use now. I wish I’d known that before purchasing his book.

To get familiar with the tests without formulas

I found one statistics book aimed at college students that firmly relegates all of the formulas to the back, where they’re out of sight, and has some fun cartoons: Frances Clegg’s Simple Statistics: A Course Book for the Social Sciences (1990).

Figure 1—The normal distribution reinterpreted as a plushie toy

Clegg covers mean, median, and mode in the usual way, but before getting into normal distributions, she takes you into rank and sign tests. This appealed to me because these tests do not require you to make as many assumptions about the data, and in particular, do not require the data to come from a normal distribution. The normal distribution is the bell-curve distribution, which has many convenient mathematical properties and is rather common in nature – for example, heights of students. Unfortunately, a lot of UX data doesn’t happen to follow a normal distribution – for example, think of the long neck, or Zipf, distribution that you get for search log data.
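To show how little a sign test asks of the data, here is a minimal Python sketch. It is my own illustration rather than Clegg’s worked procedure: the paired satisfaction ratings are invented, and the function name sign_test is hypothetical. The test only looks at whether each participant’s rating went up or down, so it makes no assumption about a normal distribution.

```python
from math import comb

def sign_test(before, after):
    """Two-sided sign test for paired samples; tied pairs are dropped.

    Returns (number of increases, number of non-tied pairs, p_value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)
    # Probability of k or fewer of the rarer sign if ups and downs were
    # equally likely (Binomial(n, 0.5)), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)

# Hypothetical 1-to-5 satisfaction ratings from ten participants, before and after a redesign.
before = [3, 4, 2, 5, 3, 4, 2, 3, 4, 3]
after = [4, 5, 4, 5, 4, 5, 3, 2, 5, 4]

positives, n, p_value = sign_test(before, after)
print(f"{positives} of {n} non-tied participants rated the redesign higher, p={p_value:.4f}")
```

The size of each change is ignored, which is the price of making fewer assumptions: the test is easy to justify but less sensitive than one that uses the actual values.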

If I lost you at normal distribution, I apologise. Ignore what I’m saying, and think about reading Clegg instead.

Because the formulas are at the back, you don’t encounter them until you’ve understood why you might need one and how to use it. And even when you do turn to them, you’ll find that Clegg describes every formula in terms of a series of simple steps before you get to the full-on mathematical symbols. If you don’t want to grapple with Excel’s statistical functions, you could possibly just use Excel as a sort of calculator to help you through the individual steps as she lays them out.

To understand how to use statistical ideas

If you’ve learned about the basic tests and concepts, but still don’t feel confident in using that knowledge, try Andrew Vickers’s What Is a P-Value Anyway? (2010).

Vickers starts each chapter with a brief story that has a statistical spin, then discusses it in statistical terms. He’s got a witty turn of phrase and a wry sense of humor. I became a fan of this book after reading this in the opening remarks:

“Statistics textbooks can be long, boring, and expensive. With this in mind, I proposed to my editor that I write a book that was short, boring, and expensive. He considered it, but eventually decided I needed to come up with something better.” (Vickers, 2010)

I think he did, indeed, come up with something better. The book is definitely short at only just over 200 pages, and each little chapter presents a useful insight into the practical meaning and application of a statistical idea. For example, here is the opening to Chapter 14, “The Probability of a Dry Toothbrush: What Is a P-Value Anyway?”:

“I have a party trick: when I tell someone what I do, and they say, ‘statistician, eh? I took statistics in college,’ I ask them to define the p-value. (I know what you’re thinking—not much of a trick. I’m working on some other stuff.) The point is, I have yet to meet anyone who has got anywhere close to the right answer. This is pretty odd because the p-value is such a key idea in statistics. Imagine if a literature graduate didn’t know whether Shakespeare wrote plays or novels, or someone who’d taken an economics course couldn’t describe the relationship between supply and demand. So, if you do nothing else, please try to remember the following sentence: ‘The p-value is the probability that the data would be at least as extreme as those observed, if the null hypothesis were true.’ Though I’d prefer that you also understood it – about which, teeth brushing.” (Vickers, 2010)

Then he goes on to explain the p-value through stories of putting his son to bed and an experiment about scaring teenagers out of the criminal justice system.
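The definition in that quotation can be turned into a small calculation. The Python sketch below is my own illustration, not an example from the book: it assumes a hypothetical preference test in which 16 of 20 users chose design B, with a null hypothesis that users have no preference. It computes the p-value exactly and then checks it by simulating the null hypothesis many times. The function names and the figures are assumptions for the example.

```python
import random
from math import comb

N_USERS = 20   # hypothetical number of participants
OBSERVED = 16  # hypothetical number who preferred design B

def exact_p_value(n, observed):
    """Probability, under a 50/50 null, of a count at least as far from n/2 as the observed one."""
    expected = n / 2
    distance = abs(observed - expected)
    extreme = [k for k in range(n + 1) if abs(k - expected) >= distance]
    return sum(comb(n, k) for k in extreme) / 2 ** n

def simulated_p_value(n, observed, trials=20_000, seed=1):
    """Estimate the same probability by repeatedly simulating n users with no real preference."""
    rng = random.Random(seed)
    expected = n / 2
    distance = abs(observed - expected)
    hits = 0
    for _ in range(trials):
        count = sum(rng.random() < 0.5 for _ in range(n))
        if abs(count - expected) >= distance:
            hits += 1
    return hits / trials

exact = exact_p_value(N_USERS, OBSERVED)
simulated = simulated_p_value(N_USERS, OBSERVED)
print(f"exact p={exact:.4f}, simulated p={simulated:.4f}")
```

Both numbers answer the same question: if users really had no preference, how often would a split at least this lopsided turn up? Neither number is the probability that the null hypothesis is true.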

Make sure you tackle the questions he poses in the discussion section at the end of each chapter and read the answers. I found the few moments this took to be well worthwhile, because the answers frequently clarified and helped me to grasp the points he’d made in a chapter.

I toyed with the idea of putting this book ahead of Clegg and the other textbooks, but I think it assumes that you have had some level of exposure to t-tests, p-values, regression, and the other basic statistical ideas.

Another book that helped me to understand more about the ‘why’ of statistics is:

As with the Vickers book, I felt that I was able to get more out of this once I already had a relatively good grasp of basic descriptive statistics and a beginner’s acquaintance with inferential statistics.

To get more ideas about how to apply statistics

Although I find Clegg’s book is the most approachable of the getting-started texts, and Vickers was definitely the one that started to give me some belief that I might really understand all of this some day, I want to suggest another book to complement them: John L. Phillips’s How to Think About Statistics, 6th Edition (1999).

Clegg uses general examples; Vickers, mainly medical examples, with some excursions into other areas. The aspect of Phillips’s book that makes it well worth recommending is that, at the end of most chapters, he’s got “Sample Applications” from fields that overlap better with user experience: education, political science, psychology, social work, and sociology. He presents each sample application as a small problem you can think through. For example:

“Social work: You are the new director of a community fund-raising organization. The member agencies have widely varying needs, but you suspect that recently the board has been shirking its duty to investigate those needs. Specifically, you suspect that recent allocations have not been sufficiently differentiated. How might you document your case?”

The idea is that you think about the problem and how you would tackle it, then compare your answer to the author’s. I found these examples rather helpful in terms of starting to figure out how some of the various statistical techniques might work for me in user experience. In my mind, this example had some parallels with the problem of deciding how to fund different features in a long-term development project.

Another bonus for this book: it’s widely available second-hand, which makes it easier to recommend as a top-up.

To learn about a wider range of statistics

Once you’ve got a firm grip on the normal distribution, regression, confidence intervals, t-tests, and chi-squared, the next step is to learn about how to do some of this in practice and learn about more of your options. For practical applications of statistics, I recommend Sarah Boslaugh and Paul Andrew Watters’s Statistics in a Nutshell: A Desktop Quick Reference (2008).

I think the title is a misnomer – especially for the final chapters, which present “Specialized Techniques” that briefly introduce each topic, but often include a disclaimer along these lines: I can’t really tell you all you need to know about this – try reading XX textbook instead. That doesn’t seem like a quick reference to me. It seems like a recipe for quite a lot of extra hard work. Having done this bit of nit-picking, I think the book definitely opens up some topics that are very useful to know about – such as odds ratios and time series. These don’t appear in any of the books I’ve mentioned previously.

From the UX professional’s point of view, the most immediately useful chapter is the one on data preparation: “Data Management for Statistical Analysis.” The authors clearly understand that most of us don’t get our data neatly packaged for us by a kindly professor, but instead have to grapple with turning a lot of disparate stuff into some sort of coherent data set from which we need to draw some insights.

If you happen to be in a position to make choices about the tools you use, you’ll find it useful to read the discussions in the appendix about the various merits of statistical packages such as SPSS and R.

To challenge the notion of statistical significance

By this point, you should have become thoroughly familiar with the ideas of statistical significance in general and p-values in particular. These underlie pretty much all the statistical tests that you’ll encounter in the introductory texts. The idea is to calculate some sort of test statistic, then use that to work out how likely it is that results at least as extreme as the ones observed would turn up purely by chance. For example, “significant p < 0.05” means that, if the null hypothesis were true and only chance were at work, the probability of seeing a result at least as extreme as the observed one would be less than 5% – in other words, less than a 1-in-20 chance.

The better books – such as Clegg, Phillips, Vickers, and Boslaugh and Watters – all include some discussion of effect size. The practical point is that a result can achieve statistical significance, but fail to be important – and similarly, a result can be important, but not be statistically significant.
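To make the distinction concrete, here is a rough Python sketch with invented summary figures. It assumes two equal-sized groups with a common standard deviation and uses a normal approximation for the p-value (a t-test would be the more usual choice for the small sample). Cohen’s d stands in as one common effect-size measure, and the function name compare_groups is my own.

```python
from math import sqrt
from statistics import NormalDist

def compare_groups(mean_a, mean_b, sd, n_per_group):
    """Return (Cohen's d, two-sided p-value) for two equal-sized groups with a shared SD.

    The p-value comes from a z-test, which is only an approximation for small samples.
    """
    difference = abs(mean_a - mean_b)
    effect_size = difference / sd                 # Cohen's d: difference in SD units
    standard_error = sd * sqrt(2 / n_per_group)   # SE of the difference between two means
    z = difference / standard_error
    p_value = 2 * (1 - NormalDist().cdf(z))
    return effect_size, p_value

# Hypothetical: 20,000 users per group, task times differ by half a second.
d_large_n, p_large_n = compare_groups(30.0, 30.5, sd=10, n_per_group=20_000)

# Hypothetical: 5 users per group, task times differ by 15 seconds.
d_small_n, p_small_n = compare_groups(60.0, 45.0, sd=20, n_per_group=5)

print(f"large sample: d={d_large_n:.2f}, p={p_large_n:.2g}")
print(f"small sample: d={d_small_n:.2f}, p={p_small_n:.2g}")
```

In the first case the difference is statistically significant but tiny, and in the second it is large but falls short of significance because so few people took part.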

We see this all the time in user experience – particularly in usability testing, where we so frequently watch user after user become bemused by some obvious disaster in our designs. We rush away, ready to make all sorts of changes based on our conviction that we’ve seen a major effect – only to have a tricky discussion about whether our sample was large enough for the result to be statistically significant.

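A book that delves into statistical significance, effect sizes, and the power of a test – that is, its ability to detect an effect – in great detail is Stephen T. Ziliak and Deirdre N. McCloskey’s The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives (2008).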

It’s fair to say that the reviews of this book on Amazon are mixed, and I’d agree with that. The authors are economists, and some of the writing is hard going. One of the reviews recommends that people read Siu L. Chow’s Statistical Significance: Rationale, Validity, and Utility instead – and I might be happy to recommend that book as well, but it was published in 1997, is out of print, and not that easy to find, so I haven’t read it myself. If you happen to know of a book that tackles this topic, is easier to find than Chow’s book, and is easier to read than Ziliak and McCloskey’s, please let me know.

To learn how to critique a statistical argument

If you’re not completely exhausted by all of this, and you’re still feeling ready for a final challenge, read Robert P. Abelson’s Statistics as Principled Argument (1995).

Abelson discusses statistics from the point of view of a professor who is critiquing the statistical arguments that colleagues, students, and other researchers have put forward. I’ll be honest with you: my first attempt at reading this book ended rather quickly because I was too unfamiliar with concepts such as a one-tailed test to grasp his arguments. Recently, a bit of light has begun to dawn for me, and I have enjoyed dipping into it again.

The gap in the story: got this data, what to do now?

When reading Abelson, one of the problems for me was dealing with passages like the following, in which he talks about the difference between liberal and conservative styles of presenting statistical results and presents the investigator’s dilemma:

“Should I pronounce my results significant according to liberal criteria, risking scepticism by critical readers, or should I play it safe with conservative procedures and have nothing much to say?” (Abelson, 1995)

The challenge for me here: I don’t yet have any results! I still face this problem: here is some data, what do I do with it? Which of the tests and techniques that I’ve learned about apply in this case?

What I have learned is that this is the wrong way to go about taking a statistical approach. What you have to do is decide on the appropriate tests that can help you to answer your research question, then go get the data that allows you to run those tests.

Summary – if you have time for only one book

Depending on your needs, here are my final recommendations:

  • To get started with basic statistical ideas such as the mean and median – read The Tiger That Isn’t: Seeing Through a World of Numbers by Michael Blastland and Andrew Dilnot, 2008, Profile Books.
  • To get a brief, but useful introduction to using statistics in your user experience work – get Tom Tullis and Bill Albert’s Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics, 2008.
  • To get familiar with the most frequently taught statistical tests and ideas, without having to learn formulas – read Frances Clegg’s Simple Statistics: A Course Book for the Social Sciences, 1990.
  • To understand how to use statistical ideas – go get Andrew Vickers’s What Is a P-Value Anyway?, 2010.
  • To get more ideas about how to apply statistics – add in John L. Phillips’s How to Think About Statistics, 6th Edition, published in 1999.
  • To learn about a wider range of statistics – choose Sarah Boslaugh and Paul Andrew Watters’s Statistics in a Nutshell: A Desktop Quick Reference, 2008.
  • To challenge the notion of statistical significance – skim and skip through Stephen T. Ziliak and Deirdre N. McCloskey’s The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, 2008.
  • To learn how to critique a statistical argument – tackle Robert P. Abelson’s Statistics as Principled Argument, 1995.

And if you have time for only one of these books, I’d opt for Andrew Vickers’s What Is a P-Value Anyway? It’s short, engagingly written, and realistic. Plus, it was the one book that gave me the most confidence that I’d actually understood what I was reading.

But irrespective of what I say – if you aim to pass a statistics exam, pick the textbook that your instructor or exam board recommends.

Acknowledgements: Thanks to Nicole and Shannon of Nausicaa Distribution for their permission to use the image of their plushie. Main image of Statistics books at WHO by ‘Swiveler’, Creative Commons.

This article first appeared in UX Matters, February 6 2012.