Which is better: an open question or a closed one? Should you include a 'don't know' option in your closed questions? Is there a 'right' order for asking questions?
If topics like these concern you, then you'll want to read Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context by Howard Schuman and Stanley Presser (1981, reprinted in 1996).
Schuman and Presser give us a shortcut into the research literature
Although this book hasn't been updated since it first appeared in 1981, it continues to be much-cited. Why? Because the authors conducted a series of experiments on different ways of asking questions, and report on all of them in this one convenient volume. They also reviewed a swathe of the relevant literature. So it's a sort of shortcut into the research on question wording from the 1950s to the 1970s, an era when much research was done that is still relevant today, but the papers are often hard to get hold of.
In UX, we often suffer from reports of exactly one experiment, in a limited context, with a small, unrepresentative group of participants, whose results are then offered up as 'fact' as if they applied to everyone. If you, too, find that sort of over-large claim highly irritating, then you'll enjoy reading this book. It's full of examples where the authors tried to probe and replicate. Often, they found that an early compelling result didn't actually replicate as they had hoped – which sometimes means they are less than conclusive in their recommendations, but far more realistic.
A book to borrow rather than buy
I have to admit it's not exactly a zippy read. If you're a regular reader of the type of academic papers that quote a lot of 'p' values, then you'll probably rattle along. But even so, you'll need to exert some imagination. The examples are obviously all from an earlier era, and many of them explore political problems that are no longer part of our everyday concerns.
So I'm going to pull out some of the key findings for you here.
The order of the questions is important
The first topic they tackle in depth is question order. There are some famous experiments that manipulated question order, such as one that used these two items:
- Do you think the United States should let Communist newspaper reporters from other countries come in here and send back to their papers the news as they see it? (the "Communist" item)
- Do you think a Communist country like Russia should let American newspaper reporters come in and send back to America the news as they see it? (the "American" item)
(I told you these examples are often from eras when concerns were different.) These items were first used in an experiment in 1948, then replicated by the authors. In both cases, asking the Communist item first got a much lower level of 'yes' answers than asking the American item first.
This is a "context order effect", also known as a "context effect". Each question is affected by the context within which it is asked, and that context includes the previous question.
The problem with context order effects is that although they undoubtedly exist, they are tricky and slippery. The authors tried various experiments to pin them down, but failed: they certainly replicated some effects, but not others; they found effects that were larger than expected, and others that were smaller. They found no straightforward explanation for what might be going on.
As the authors put it in their summing up of the chapter:
"[Context effects] can be very large [and] are difficult to predict".
The bottom line: question order is important. If you want to run the same survey again and plan to compare the results, make sure that you keep the question order the same each time.
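If you want to go further and test for an order effect in your own survey, the classic approach is a split-ballot experiment: randomly show half of your respondents one order and the other half the reverse, then compare the answers. Here is a minimal sketch in Python of how that comparison might look; the response counts are invented for illustration, not Schuman and Presser's data.

    # Analysing a split-ballot question-order experiment.
    # The counts below are invented for illustration only.
    from scipy.stats import chi2_contingency

    # "Yes"/"no" answers to the Communist item, by which item came first
    communist_first = {"yes": 120, "no": 280}  # hypothetical counts
    american_first = {"yes": 220, "no": 180}   # hypothetical counts

    table = [
        [communist_first["yes"], communist_first["no"]],
        [american_first["yes"], american_first["no"]],
    ]
    chi2, p_value, dof, expected = chi2_contingency(table)

    for label, counts in (("Communist item first", communist_first),
                          ("American item first", american_first)):
        n = counts["yes"] + counts["no"]
        print(f"{label}: {counts['yes'] / n:.0%} answered yes (n={n})")
    print(f"chi-squared = {chi2:.1f}, p = {p_value:.4f}")

If the p-value is small, the two orders are genuinely producing different answers, and you should pick one order and stick with it.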
Open questions elicit a wider range of answers, but are not as open as they seem
Closed questions are ones where the respondent has to pick from a range of specific answers, sometimes including 'don't know' and 'prefer not to answer'. Open questions have an open space for the answers, and respondents can choose to provide as short or long an answer as they wish.
The chapter on open versus closed questions reports on experiments that compared the number and range of answers that each type of question can elicit. Broadly, an open question will collect a much wider selection of answers, including some that you would never have guessed you'd get.
Unfortunately, open questions also pose problems for analysis, because you've got to read the answers and try to put them into categories yourself, and in doing that there's a risk of misinterpreting the respondent's original intention.
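To make that risk concrete, here is a deliberately crude sketch of that coding step, assuming a hand-built keyword scheme (the categories, keywords, and answers are all invented). Notice how an answer the scheme never anticipated falls straight through into a catch-all bucket; real coding schemes are developed iteratively and usually checked by a second coder.

    # A deliberately crude sketch of coding open answers into categories.
    # The keyword scheme and the answers are invented for illustration.
    CATEGORIES = {
        "price": ["price", "cost", "expensive", "cheap"],
        "speed": ["slow", "fast", "wait", "load"],
        "trust": ["privacy", "secure", "trust", "scam"],
    }

    def code_answer(answer: str) -> str:
        text = answer.lower()
        for category, keywords in CATEGORIES.items():
            if any(keyword in text for keyword in keywords):
                return category
        return "uncategorised"  # the answers you never guessed you'd get

    answers = [
        "It costs too much for what you get",
        "Pages take ages to load",
        "I don't trust it with my card details",
        "The font makes my eyes hurt",  # nothing in the scheme fits this
    ]
    for answer in answers:
        print(f"{code_answer(answer):>13}: {answer}")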
But closed questions have their own problems, as I'm sure you'll recognise if you've had the experience of trying to respond to a survey where the survey author continually forced you to choose from answers that don't resemble the one you want to give.
Here's how the authors sum up the issues:
"Inadvertent phrasing of the open question itself can constrain responses in unintended ways … we can see no way to discover subtle constraints of this kind except by including systematic open-closed comparisons when an investigator begins development of a new question on values and problems".
Their recommendation about how to get the balance of open and closed questions right? Iteration! In other words:
- Explore your questions in interviews with users,
- Test the questions,
- Check the results and make changes,
- Test again, and repeat until it all settles down.
And here is another point from the book that is well worth thinking about:
"since our results fail to provide strong support for the superiority of open questions, the implication may seem to be that after sufficient pilot work an investigator can rely exclusively on closed items. But we think that total elimination of open questions from survey research would be a serious mistake…. Open 'why' questions can also be especially valuable as follow-ups to important closed questions, providing insight into why people answer the way they do… They are needed as well where rapidly shifting events can affect answers, or indeed over time to avoid missing new emergent categories. And of course in some situations the set of meaningful alternatives is too large or complex to present to respondents".
If you offer the option of 'don't know', some people will take it
The chapter on "The Assessment of No Opinion" reports on experiments that compared a question without a 'don't know' filter and the equivalent question with one.
For example, here are the unfiltered and filtered ways of asking a question:
- "In general, do you think the courts in this area deal too harshly or not harshly enough with criminals?"
- "In general, do you think the courts in this area deal too harshly or not harshly enough with criminals, or don't you have enough information about the courts to say?"
The unfiltered question got 6.8% 'don't know' answers; the filtered version got 29.0% 'not enough information to say' answers.
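If you run the same kind of filtered/unfiltered comparison on your own questionnaire, a two-proportion test will tell you whether the gap is bigger than chance. Here is a minimal sketch; the sample sizes are invented, and only the two percentages come from the book.

    # Comparing "don't know" rates between two question forms.
    # Sample sizes are hypothetical; the percentages are from the book.
    from statsmodels.stats.proportion import proportions_ztest

    n_unfiltered, n_filtered = 500, 500          # hypothetical sample sizes
    dk_unfiltered = round(n_unfiltered * 0.068)  # 6.8% "don't know"
    dk_filtered = round(n_filtered * 0.290)      # 29.0% "not enough info"

    z_stat, p_value = proportions_ztest(
        count=[dk_unfiltered, dk_filtered],
        nobs=[n_unfiltered, n_filtered],
    )
    print(f"unfiltered: {dk_unfiltered / n_unfiltered:.1%} don't know")
    print(f"filtered:   {dk_filtered / n_filtered:.1%} don't know")
    print(f"z = {z_stat:.1f}, p = {p_value:.2g}")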
So including a 'don't know' filter is very likely to increase the proportion of respondents who answer with 'don't know'. Why? Because a respondent might opt for 'don't know' for all sorts of reasons, including:
- Have thought about it and have yet to make up my mind
- Haven't thought about it
- Have an opinion but don't want to reveal it to you
- Sort of remember having an opinion but can't be bothered to recall it.
Does this matter? Schuman and Presser's results show us that yes, it does matter.
If your respondents genuinely don't have an answer, then forcing them to choose an answer will produce unreliable results. But if they might really have an answer but don't want to make the effort of finding it, then offering them a 'don't know' option will lead to under-reporting of the true answers.
The bottom line: If you have the resources to test your questionnaire thoroughly through all its phases (preliminary investigative interviews, cognitive interviewing, usability testing, and pilot testing), then you will almost certainly have accurate questions that your respondents have answers for, and you won't need a 'don't know' option. Otherwise: keep it in. A bit of accurate under-reporting is better than a pile of random unreliability.
Sometimes 'no opinion' is a valid opinion… and sometimes it isn't
Some years before this book was written, a famous series of experiments asked Americans about fictitious legislation such as the "Metallic Metals Act", and found that many people were entirely happy to volunteer an opinion on it even though it didn't exist.
Schuman and Presser did not like the idea of tricking their respondents, and opted to ask about real legislation that they anticipated few people would really know about: the Agricultural Trade Act of 1978 and the Monetary Control Bill of 1979. As they put it:
"Respondents make an educated (though often wrong) guess as to what the obscure acts represent, then answer reasonably in their own terms about the constructed object".
This issue has not gone away. In a 2011 survey in UK local government, one team found that their respondents were enthusiastic about 'free schools'. Well, wouldn't you be?
But in fact the question was not asking whether parents should pay for their children's education. A 'free school' in this context refers to a particular new way of setting up a school, and concerns its governance, not its charges to parents.
Back to the book. After some considerable experimenting, the authors conclude that:
"a substantial minority of the public – in the neighborhood of 30% – will provide an opinion on a proposed law that they know nothing about *if* the question is asked without an explicit 'don't know' option".
The bottom line: You may be collecting opinions from informed people. Or you may not. Don't base important decisions on data collected from people who didn't know what you were talking about, but gave you an opinion anyway (maybe because you didn't offer them a 'don't know' option).
Balance your questions for more accuracy
Polling bias occurs when respondents are asked a question that implies a particular answer.
There was a wonderful exhibition of polling bias in the British TV show "Yes, Prime Minister". Sir Humphrey, the senior civil servant, demonstrates how you can get people to agree or disagree with a policy – in this case, compulsory national service in the military – by a careful selection of a series of questions. (This clip from King of the Paupers includes the relevant section. Some people have had trouble viewing the clip, so there is a transcript at the end of this post.)
Schuman and Presser's next chapter, "Balance and imbalance in questions", explores questions of the form "Some people think A, others think B, what is your view?", using the examples of questions on gun control and abortion. The idea of adding both arguments is to reduce the possibility of building bias into the question.
They found that if a question clearly implies the possibility of a negative, then adding an opposing argument to make that explicit doesn't make much difference.
The bottom line: if you are writing questions about attitudes, try writing each question in both directions, i.e. positive and negative. Think about whether you are pushing people in one direction or the other. Aim to be neutral.
The tendency to agree ("acquiescence bias") is not as strong as sometimes claimed
Polling bias is an extreme form of another question-response problem, "acquiescence bias".
This is the tendency to agree. We saw it operating in Sir Humphrey's humorous series of questions on the TV show, and it is one of the arguments for swapping the order of some statements when asking people about a series of aspects of something, e.g. in the System Usability Scale.
Schuman and Presser call this chapter "The Acquiescence Quagmire", because despite lots of literature on the topic going back to Likert himself, they found that the effects of acquiescence bias are much less clear-cut than they expected.
For example, they mention a study by Lenski and Leggett from 1960, which looked at these two questions:
- It is hardly fair to bring children into the world, the way things look for the future.
- Children born today have a wonderful future to look forward to.
Although the original study claimed that contradictory answers on these two questions were evidence of acquiescence bias, Schuman and Presser point out that it is quite possible to disagree with both statements. They experimented with the question
"Which in your opinion is more to blame for crime and lawlessness in this country: individuals or social conditions?"
and found that what is happening is not at all obvious.
The wording of the question is really crucial, and there are other complicating factors, such as the level of education of respondents and, for some types of question in face-to-face interviews, the race of the respondent compared to the race of the interviewer.
The bottom line: if you need to explore levels of agreement with opposite opinions, then the biggest mistake you can make is to assume that your two opposite opinions are actually the full set.
Your respondents may care nothing about your issue – or much more
The authors open their chapter "Passionate Attitudes: Intensity, Centrality, and Committed Action" with a quote from "A preface to democratic theory" by R. A. Dahl (1956):
"What if the minority prefers its alternative much more passionately than the majority prefers a contrary alternative?"
This was an issue that I've run into a few times, for example some years ago when I was working on a survey of user experience professionals on behalf of what was then called the Usability Professionals' Association (UPA). Some members wanted UPA to introduce a usability certification, and we did indeed find that a majority of our respondents were in favour – but there was an important minority that was deeply against.
Schuman and Presser offer these three definitions to help us think through the issues:
- Intensity is the subjective strength of the attitude
- Centrality is the subjective importance of the issue to the respondent
- Committed action is doing something about it, e.g. writing to your senator.
Their examples include investigation of attitudes in the US towards gun control. They found that people who were against gun control were good at "committed action", so had a greater impact even though there were fewer of them.
The precise questions that Schuman and Presser were investigating were big national political matters, and hardly the stuff of our everyday practical concerns in user experience.
The underlying issues, however, are very much part of what we have to grapple with. Remember Google Buzz? Most users were happy with it; a very vocal minority was enraged by its privacy policies. Their "committed action", the intensity of their attitudes, and the centrality of the issue for them, combined to undermine the credibility of the product; Google announced that they were closing it down in October 2011.
Despite that sad story, there is often a big gap between what people say their attitude is and how much they're prepared to act on it. As Schuman and Presser point out:
"people find it easier to say that they feel extremely strong about an issue than that they would regard it as one of the most important issues on which to decide their vote"
Attitudes can be crystallised or wobbly
An attitude is "crystallised" if it exists before you ask about it, and it is stable. Asking the same question another time will get the same answer. Schuman and Presser don't have a particular term for the opposite of crystallised, so let's say the opposite is "wobbly".
Wrongly, we tend to treat all attitudes as crystallised. Schuman and Presser found that people are quite good at saying how strongly they feel about a topic. If people don't care much, their attitude is much more likely to be wobbly.
One aspect they investigated: whether people with more education were more likely to have crystallised attitudes on issues of national political policy. The 1960s idea was that if you had a college education, you ought to be firmer in your views. Schuman and Presser found that a higher level of education had an effect on some items but not on others.
Thirty years later, and writing from a British perspective, I find this focus on education quite surprising: I wouldn't assume that longer exposure to education necessarily makes people have clearer political opinions.
"Forbid" is not the same as "not allow"
A: "Do you think the United States should forbid public speeches against democracy?"
B: "Do you think the United States should allow public speeches against democracy?"
Elmo Roper tested these two questions in an A/B test in 1940, and found that 54% of respondents who got statement A ("forbid") agreed with it, but only 25% of respondents who got statement B ("allow") agreed with it. Turning that around, 75% said the US should "not allow" public speeches against democracy – far more than the proportion who would "forbid" them.
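The arithmetic behind that turnaround is worth spelling out, because it is exactly where the asymmetry shows up. A quick worked example using the 1940 percentages (and ignoring 'no opinion' answers, as the simple turnaround does):

    # The forbid/allow arithmetic, using the 1940 percentages quoted above.
    # Ignores "no opinion" answers, as the simple turnaround does.
    pct_forbid = 54.0  # % agreeing with "should forbid" (form A)
    pct_allow = 25.0   # % agreeing with "should allow" (form B)

    # Everyone on form B who did not agree is counted as "not allow"
    pct_not_allow = 100.0 - pct_allow

    # Logically, "forbid" and "not allow" are the same position,
    # so these two figures ought to match. They don't.
    print(f"would forbid speeches:    {pct_forbid:.0f}%")
    print(f"would not allow speeches: {pct_not_allow:.0f}%")
    print(f"wording gap:              {pct_not_allow - pct_forbid:.0f} points")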
Schuman and Presser replicated the experiment in 1974, and twice more in 1976. They got the same effect ("forbid" is not the same as "not allow"), although by then public opinion had changed – nearly 80% of their respondents were against "forbid", whereas about 55% were in favour of "allow".
They got similar effects for some other topics, for example whether to forbid, or to not allow, cigarette advertising on TV. But not always. A question about "abortion" tested against one asking about "ending a pregnancy" did not produce the same effect: it seemed that those two items were seen as exactly equivalent by respondents.
1976 is a long time ago: is this lack of equivalence still a problem? Answer: yes, probably.
For example, in 2000 Bregje Holleman published "The Forbid/Allow Asymmetry: On the Cognitive Mechanisms Underlying Wording Effects in Surveys", reporting on another extensive series of experiments on the same problem. Like Schuman and Presser, she found that "forbid" is not the same as "not allow" – mostly. But sometimes it is. So the effect persists.
"Forbid/allow" is a tenacious topic, and once it has gripped you it seems hard to let it go.
I learned about Bregje Holleman's book from a review written by Howard Schuman – the co-author of the book I've been talking about here. Then it came up again for me at the European Survey Research Association conference in Lausanne in 2011, where Naomi Kamoen described one of her experiments on a similar set of questions: "How easy is a text that is not difficult? Comparing answers to positive, negative, and bipolar questions" – and her supervisor is the same Bregje Holleman who wrote the book published in 2000.
The bottom line: It all comes down to the specific wording of the actual question. For example, "satisfied" is not the same as "not dissatisfied" and definitely not the same as "delighted".
And finally: Context effects are a serious hazard
Schuman and Presser wrap up their book with a chapter where they reflect on their findings, and on the experience of running so many survey experiments. They mostly conclude that replicating results is harder than it looks – a useful point to remember when reading research papers in general, particularly if the results seem counterintuitive.
They also muse on the overall challenge of "context effects", another way of saying that the way respondents answer questions is strongly affected by the way you ask the questions, and by the way the questions are ordered. For example, they say:
"General summary type questions are especially susceptible to context effects and should probably be avoided if the needed information can be built up from more specific questions".
Key points to take away
Here are four key things I learned from this book that you may also find helpful:
- Start with open questions and test a lot
- If you want to collect informed opinion, offer a 'don't know' option
- When collecting attitudes towards statements, try to use balanced questions
- Ask for strength of opinion as well as direction of opinion
------------------------
Transcript of "Yes, Prime Minister" where Sir Humphrey demonstrates acquiescence bias.
Sir Humphrey: "You know what happens: nice young lady comes up to you. Obviously you want to create a good impression, you don't want to look a fool, do you? So she starts asking you some questions: Mr. Woolley, are you worried about the number of young people without jobs?"
Bernard Woolley: "Yes"
Sir Humphrey: "Are you worried about the rise in crime among teenagers?"
Bernard Woolley: "Yes"
Sir Humphrey: "Do you think there is a lack of discipline in our Comprehensive schools?"
Bernard Woolley: "Yes"
Sir Humphrey: "Do you think young people welcome some authority and leadership in their lives?"
Bernard Woolley: "Yes"
Sir Humphrey: "Do you think they respond to a challenge?"
Bernard Woolley: "Yes"
Sir Humphrey: "Would you be in favour of reintroducing National Service?"
Bernard Woolley: "Oh… well, I suppose I might be."
Sir Humphrey: "Yes or no?"
Bernard Woolley: "Yes"
Sir Humphrey: "Of course you would, Bernard. After all you've told them, you can't say no to that. So they don't mention the first five questions and they publish the last one."
Bernard Woolley: "Is that really what they do?"
Sir Humphrey: "Well, not the reputable ones, no, but there aren't many of those. So alternatively the young lady can get the opposite result."
Bernard Woolley: "How?"
Sir Humphrey: "Mr. Woolley, are you worried about the danger of war?"
Bernard Woolley: "Yes"
Sir Humphrey: "Are you worried about the growth of armaments?"
Bernard Woolley: "Yes"
Sir Humphrey: "Do you think there is a danger in giving young people guns and teaching them how to kill?"
Bernard Woolley: "Yes"
Sir Humphrey: "Do you think it is wrong to force people to take up arms against their will?"
Bernard Woolley: "Yes"
Sir Humphrey: "Would you oppose the reintroduction of National Service?"
Bernard Woolley: "Yes"
Sir Humphrey: "There you are, you see, Bernard. The perfect balanced sample."
Featured image: question1 by Virtual EyeSee, Creative Commons licence
#surveys #surveysthatwork