The Unreliability of Self-Reported Survey Data: Insights from an Experiment on COVID-19 Hygiene Behaviours
Surveys and self-reported data: examining the biases that influence responses and proposing methodological strategies to enhance accuracy and reliability in behavioural research.
Human beings are creatures of habit. Our days have a particular rhythm to them, and the behaviours we exhibit today are generally indicative of how we behaved yesterday and will behave tomorrow. Yet for how routine our lives can be, behavioural research conducted over the last 50 years has found that people tend to struggle when asked to recall specific details about their day-to-day lives.
For example, can you remember how many times yesterday you checked your email, or how many glasses of water you drank? Odds are high that you did in fact check your email and drink a few glasses of water, but to recall and count these habitual events is easier said than done. Why? Because they are usually not registered in ‘episodic memory’.
The resulting inability of people to recall, with any sort of precision, the non-salient parts of their lives complicates the use of tools like self-reporting surveys. Of course, surveys are a fast, cheap, non-invasive approach to reaching large segments of the population and collecting data. Yet if the data gathered do not accurately represent reality, the surveys become meaningless and even misleading. This is why we regularly study the reliability and validity of self-reported survey data at iNudgeyou and publish our methodological insights in leading journals. In this Insights post we look into a fascinating and telling experiment published in the journal Behavioural Public Policy, which is built around a strategy for evaluating the reliability, or lack thereof, of self-reported survey data.
The Research Opportunity of ‘Hope’
The experiment was brought to life as the COVID-19 pandemic spread rapidly in early 2020. Policy makers across the globe sought to determine how compliant the citizens of their respective nations were with the hygiene and safety measures recommended by public health authorities. Large-scale surveys, like Aarhus University's HOPE project, were employed around the world, and the gathered data were ultimately used to inform actions taken by governments. A prime example of this was the implementation of extended lockdowns in Denmark after data suggested a dip in the population's compliance with governmental advice on hand hygiene and social distancing.
Given the importance placed on survey data in the COVID-19 setting, iNudgeyou decided to look closer at the reliability and validity of such results through a behavioural lens by fielding a small-scale version of the HOPE project's survey poll – with a twist: rather than merely asking a nationwide representative sample of participants to recall and quantify COVID-19-relevant behaviours off the cuff, we provided anchoring points against which responses might be assessed.
Experimental set-up
Partnering with the market research firm Gallup, we surveyed 1001 Danish adults from June 9 to June 12, 2020, focusing on two key behaviours: hand hygiene and social distancing. Survey respondents were asked the following questions, designed to closely mimic the original HOPE project queries:
1. How many times did you wash/sanitise your hands yesterday?
2. How many times were you within 2 meters of another person for more than 2 minutes yesterday?
Prior to each of these questions, however, participants were first asked to indicate whether the frequency of their own behaviour was over, under, or equal to a benchmark value, or "anchor" point. For the washing/sanitising question we employed two different anchor points, a plausibly low frequency of 3 and a plausibly high frequency of 30. In a similar manner, low and high anchors were assigned for the social distancing question: 3 for the low anchor and 15 for the high.
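To make the design concrete, here is a minimal sketch in Python of that two-step question flow. The anchor values and question wording come from the article; everything else (the console-based run_item mechanics and all names) is a hypothetical stand-in, not the actual Gallup instrument.

```python
import random

# Anchor values from the article: a plausibly low and a plausibly high
# frequency for each of the two behaviours.
ANCHORS = {
    "hand_hygiene": {"low": 3, "high": 30},
    "social_distancing": {"low": 3, "high": 15},
}

QUESTIONS = {
    "hand_hygiene": "How many times did you wash/sanitise your hands yesterday?",
    "social_distancing": ("How many times were you within 2 meters of another "
                          "person for more than 2 minutes yesterday?"),
}

def run_item(behaviour: str) -> int:
    """Administer one anchored survey item and return the reported count."""
    # Each respondent is randomly assigned to the low or high anchor condition.
    condition = random.choice(["low", "high"])
    anchor = ANCHORS[behaviour][condition]

    # Step 1: compare own frequency to the anchor (over / under / equal).
    input(f"Was the frequency of this behaviour over, under, or equal to {anchor}? ")

    # Step 2: only then report the actual count, mimicking the HOPE-style question.
    return int(input(QUESTIONS[behaviour] + " "))

if __name__ == "__main__":
    for behaviour in QUESTIONS:
        print(behaviour, "->", run_item(behaviour))
```

The anchoring effect is then measured as the difference in the subsequently reported counts between the randomly formed low- and high-anchor groups.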
The Anchoring(-and-Adjustment) Effect
One of the persistent effects that Kahneman and Tversky interpreted as resulting from a cognitive bias or heuristic was the influence of an initial, irrelevant value, called an anchor, on people's subsequent estimates of a true value. In one version of their experiments on anchoring, participants were randomly assigned to one of two groups. In one group, participants spun a roulette wheel that was rigged to stop at 10; in the other, it was rigged to stop at 65. These two numbers provided the anchors. Participants were then asked whether the percentage of African nations in the UN was smaller or larger than the anchor, and then asked for their estimate of the true value. The median estimate in the group with the low anchor was 25%, significantly less than the 45% estimated in the group with the high anchor. This influence of an anchor on an estimate became known as 'the anchoring effect', whereby an initial, irrelevant anchor systematically influences participants' subsequent value estimates (see Tversky & Kahneman, 1974). The reason this effect may be considered more than a random curiosity is the systematic directedness of the anchor's influence, together with the fact that participants knew the roulette wheel produced a random number and so ought to have treated it as carrying no relevant information for their estimates. – Hansen, 2025
References:
Hansen, P.G. (2025). Notes on Behavioural Insights – Theory, Methodology & Praxis (forthcoming).
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
Results
Given that the existing body of research in behavioural science suggests that survey results are prone to error and bias, it perhaps comes as no surprise that our investigation confirmed that anchors do impact how survey participants recall and report their own behaviours.
In the case of social distancing, those provided with the low anchor reported an average of 6.7 close interactions the previous day, while the high anchor group averaged 8.7 interactions, an increase of 2.0. Similarly, the low anchor hand hygiene group averaged 10.9 hand washes, while the high anchor group reported an average of 18.1 hand washes, a staggering increase of 7.2. (See the graph of these averages below.)
Figure 1. Average treatment effect: self-reported close contacts and hand washes, by low vs. high anchor (N = 1001).
In both the social distancing and hand hygiene scenarios, the difference between the low and high anchor averages was statistically significant. Particularly striking, however, is how dramatic an effect anchoring had on reported hand washing: the low anchor produced an apparent drop of nearly 40% in compliance, or, equivalently, the high anchor produced a 66% increase. The reason for this, we speculate, is that anchoring biases responses more, the less the response is supported by memory. In the case of social distancing during COVID-19, it should come as no surprise that people would be better at remembering whether they met and talked to someone. Even so, the high anchor still led to a 30% increase in the self-reported behaviour.
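Those percentages follow directly from the reported group means; a quick back-of-the-envelope check in Python, using only the numbers quoted above:

```python
# Group means as reported in the article.
low_hands, high_hands = 10.9, 18.1   # average hand washes: low vs high anchor
low_close, high_close = 6.7, 8.7     # average close contacts: low vs high anchor

print(f"Hand washes, high vs low anchor:    {high_hands / low_hands - 1:+.0%}")   # +66%
print(f"Hand washes, low vs high anchor:    {low_hands / high_hands - 1:+.0%}")   # -40%
print(f"Close contacts, high vs low anchor: {high_close / low_close - 1:+.0%}")   # +30%
```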
Implications
What then are the main takeaways from our experiment? Certainly, we have once again shown how anchoring can affect the outcome of survey data, sometimes to a rather substantial degree. Beyond this, however, are the larger implications of acknowledging survey limitations:
1. Whether participants have poor memory, feel societal pressure to inflate/deflate certain measures, or are prone to suggestibility, the end result is that surveys are (in most cases) simply not a reliable means of gathering accurate information on a populace’s routine behaviours.
2. Rather than relying on self-reported data, particularly in settings where erroneous results may have an impact on many, routine behavioural patterns should be observed and measured in their natural context.
With surveys of public health and routine behaviours becoming common during crises like the COVID-19 pandemic, it is crucial that policy makers understand the limitations of these data before using them to enact changes that affect their citizens across the board. iNudgeyou is dedicated to pairing our accumulated knowledge of human behaviour with the generation of real-world data to help in situations just like this.
It is therefore also important to note that experiments on the reliability and validity of self-reported survey data, like the one discussed here, do more than expose how unreliable and invalid survey data on routine behaviours may be. They also establish anchoring as a methodological test for evaluating and measuring the reliability, and, when paired with real-world data or theoretical considerations about human memory, the validity, of collected self-reported survey data.
We are always looking for new opportunities to partner with organisations for similar projects that may make us all wiser on how to understand and study human behaviour.
Read the research article
Hansen, P., Larsen, E., & Gundersen, C. (2022). Reporting on one’s behavior: A survey experiment on the nonvalidity of self-reported COVID-19 hygiene-relevant routine behaviors. Behavioural Public Policy, 6(1), 34-51. doi: 10.1017/bpp.2021.13.