Human error in survey responses

In market research on May 29, 2012 by sdobney

When researchers talk about error in research, they normally mean statistical error – the idea that with a fair EPSEM sample, the research will measure the market in question to a quantifiable level of accuracy. They might also consider the effects of stratified sampling and weighting, and how these reduce effective sample size but improve drill-down accuracy, and they will probably also consider the type of sampling – from full EPSEM (equal probability of selection method) to ad hoc convenience samples for hard-to-reach groups. But there is also human error in survey responses, which adds an extra layer of complication.

Now when we talk about human error there is no malicious intent – it's not deliberate cheating. It's just that all human beings have a tendency to make errors (otherwise we'd all get 100% in exams), to forget things, and to try to help the survey or researcher by giving what they think will be the expected 'right' answer. For instance, if you ask someone how important eco-friendly packaging is, the answer is likely to be very high because we reflect the social norms around us. In practice, when you look at what people actually buy, the impact is likely to be much lower.

So researchers have to control for potential human-ness in the survey. A spontaneous question is asked before a prompted question so the prompting doesn't influence the spontaneous response. We add mask options to hide the real test brand so individuals can't 'help' the research by saying yes to something, and dummy options (fake answers) to test the level of mis-association or incorrect clicks. We recognise that recall and recollection are less than perfect, particularly over longer time periods: shoppers frequently mis-state when they last bought a product, or the number of purchases they have made.
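As a rough sketch of the mask-and-dummy idea, the snippet below mixes an assumed test brand in with mask brands and a couple of made-up dummy brands (all names here are illustrative, not from any real study), and uses claims against the dummies as a baseline noise rate:

```python
import random

# Hypothetical brand-recall question: the real test brand is hidden among
# mask brands, plus dummy (non-existent) brands to gauge noise.
TEST_BRAND = "BrandX"                      # assumed name, for illustration
MASK_BRANDS = ["BrandA", "BrandB", "BrandC"]
DUMMY_BRANDS = ["Zorvex", "Quillon"]       # fake brands nobody could have seen

def recall_options(rng=random):
    """Build the prompted list shown to one respondent, order randomised."""
    options = [TEST_BRAND] + MASK_BRANDS + DUMMY_BRANDS
    rng.shuffle(options)
    return options

def noise_rate(claimed):
    """Share of 'yes' claims landing on dummy brands - a rough baseline
    for mis-association or careless clicking."""
    if not claimed:
        return 0.0
    return sum(1 for brand in claimed if brand in DUMMY_BRANDS) / len(claimed)
```

If, say, 8% of respondents claim to recognise a brand that does not exist, recognition scores for the real brands can be read against that floor.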

In many instances the human brain judges things by the order in which they occur. Items near the top of a list are more likely to be selected than those at the bottom. The first product or advertising concept you show will tend to get different ratings and scores from the second: the first acts as an 'anchor' point for comparison. In pricing studies the prices have to be framed or contextualised first.

So researchers take steps to rotate or randomise the order in which answers appear, or in which products are shown or tested, or which price points appear first. However, this doesn’t remove the error – it simply spreads the error equally across all the items on test, so each answer is equally biased. In assessing trends, we assume the error is constant and consistent and so can look at the trend – the change from one survey to the next – and say this is a fair representation of the measurement.
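The rotation described above can be sketched in a few lines. This assumes respondents arrive with a running serial number (an assumption for illustration); each starting position is then used equally often, so the order effect is spread evenly across items rather than removed:

```python
def rotated(items, respondent_id):
    """Rotate the list so every item takes each position equally often
    across respondents (respondent_id assumed to be a running serial)."""
    k = respondent_id % len(items)
    return items[k:] + items[:k]

# Respondent 0 sees A first, respondent 1 sees B first, and so on:
concepts = ["Concept A", "Concept B", "Concept C"]
first_respondent = rotated(concepts, 0)   # Concept A leads
second_respondent = rotated(concepts, 1)  # Concept B leads
```

Full randomisation (shuffling per respondent) achieves the same spreading on average; rotation guarantees an exactly even split once the sample size is a multiple of the list length.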

Even within the measurements taken we allow human-ness to creep in. Many surveys ask questions in terms of ratings using scales and scale points. Academic research has shown that scales provide a valid measurement, and all long-term trend surveys show the measures to be repeatable at a survey level. However, at an individual level, from one week to the next our use of the scale points might vary. This week we might be strongly agreeing, and the following week we might slip to slightly agreeing, because in many ways the scale points are relatively vague in meaning – even though they have statistical robustness. As conjoint and trade-off researchers we like to avoid scales for this reason (though we still use scales in certain circumstances).

The scales actually also have a hidden bias, and that's a bias towards positivity. Asked to rate things out of ten, most people will use a lot of tens, a few eights and nines, and if they're not happy the rating might drop to seven. The scale encourages an over-expression of delight. Of course, things like the Net Promoter Score (NPS) recognise this problem and build on it. But the problem is endemic in scale-based measurement – people almost always feel they have to give a rating (even to something they know or care little about) and they will always tend towards giving a favourable view. What's more, if you've just purchased something, you'll tend always to have a higher opinion of what you just purchased than of something you didn't buy.
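The NPS approach mentioned above builds the top-heavy scale usage into the metric itself: on the standard 0–10 'likelihood to recommend' scale, only 9s and 10s count as promoters, while anything from 0 to 6 counts as a detractor. A minimal calculation looks like this:

```python
def nps(scores):
    """Net Promoter Score on a 0-10 scale:
    % promoters (9-10) minus % detractors (0-6); 7-8 are passive."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)
```

Note how a 7 – the "not entirely happy" rating described above – is treated as a detractor, which is exactly the score discounting that positivity bias demands.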

We can’t remove these human tendencies, but we can do two things. The first is to ensure that the research is grounded by real data. If you’re measuring sales or market share then compare it to real sales or market share reported through a shop audit. If you start with a customer database, you can gather information that you know about and make a comparison. We often include respondent quality scoring on this basis for B2B studies.

The second thing is to reconsider how the questionnaire is built up. The 'survey instrument', to give it its scientific nomenclature, is treated as a validated and verified tool. For face-to-face or telephone interviews it is considered essential that the interviewer asks the questions in the same way each time – no paraphrasing or hidden prompting. But the survey itself is largely a linear construction – it goes from A to B (possibly with a filter to section C) in the order the researcher wants, asking all the questions the researcher wants answered, whether or not they are relevant to the consumer or customer.

A different way is to recognise that there will always be human response errors, and then to allow individuals to answer the questionnaire in the order they want (perhaps with some sections in a given order). This is what we call a non-linear questionnaire. So for areas like satisfaction research, individuals answer about the sections they want to answer about, in the order they want to answer them. If they didn't care about the branch decor, or didn't recall the letter in the post, they don't answer that question – by their own choice. We simply show the individual areas they can tell us about. Of course we get bonus information – we get to see the order in which each respondent selects topics, and where topic areas are abandoned. The benefit for respondents is that they engage only with the areas they think are worthwhile. It reduces the positivity bias and perhaps gives a clearer reading.
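One way to picture the bookkeeping behind a non-linear questionnaire is a small state object that tracks which sections the respondent has chosen, in what order, and which were never opened (the section names and class design here are purely illustrative, not an actual survey platform):

```python
class NonLinearSurvey:
    """Sketch: respondent picks topic sections in any order; we record
    the selection order and which sections were left untouched."""

    def __init__(self, sections):
        self.remaining = set(sections)   # sections not yet visited
        self.visit_order = []            # bonus data: order of selection

    def choose(self, section):
        """Respondent opens a section; ignore repeat or unknown picks."""
        if section in self.remaining:
            self.remaining.discard(section)
            self.visit_order.append(section)

    def skipped(self):
        """Sections the respondent chose never to answer."""
        return set(self.remaining)
```

The `visit_order` and `skipped()` outputs are exactly the "bonus information" described above: what respondents go to first, and what they silently decline to rate.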

