Coronavirus (COVID-19) harmonisation guidance
|Publication date:||20 May 2020|
|Who this is for:||Users and producers of statistics|
|Type:||Harmonisation guidance and principles|
Development of these questions
The questions presented here have been developed to collect data about the impact of the coronavirus (COVID-19) pandemic in a harmonised way.
They have been adapted from government surveys. This process has involved changing the reference period and removing overly specific response options. Some wording has also been changed based on established questionnaire design principles and use of a consistent style guide.
The need for rapid development has meant that these questions have not been cognitively tested. Instead, their development has been based on best practice principles and iterated against the available evidence.
Although designed in accordance with best practice, these questions are experimental and may change as priorities evolve.
What is harmonisation?
Harmonisation is the process of making statistics and data more comparable, consistent and coherent. Harmonised principles set out how to collect and report statistics to ensure comparability across different data collections in the Government Statistical Service (GSS). Harmonisation produces more useful statistics that give users a greater level of understanding.
We are proposing a harmonised set of questions for collecting data about the impact of the coronavirus (COVID-19) pandemic. Given the lack of testing, these questions are to be considered experimental and not a full harmonised principle.
What do we mean by the coronavirus?
Coronaviruses are a family of viruses that cause disease in people and animals. They can cause the common cold or more severe diseases, such as COVID-19.
COVID-19 refers to the “coronavirus disease 2019” and is a disease that can affect the lungs and airways. It is caused by a type of coronavirus. This set of harmonised questions relates to COVID-19, and refers to this as “the coronavirus” in line with the Office for National Statistics’ style guide.
Questions and response options (inputs)
The harmonised questions on this topic are designed to collect basic information, for use in the majority of surveys.
The choice of variables is based on priorities identified across government and how appropriate they would be to harmonise. They are not designed to replace questions used in specialist surveys where more detailed analysis is required.
|Diagnosis and symptoms (these two questions are to be used together - see the "Using these questions" section for more information)||Have you been officially diagnosed with the coronavirus (COVID-19)?||Yes
|Since January 2020, have you had coronavirus (COVID-19) symptoms? (Symptoms can include a high temperature or new continuous cough, or both)||Yes
|Worry||How worried, if at all, are you about the coronavirus (COVID-19) pandemic?||Extremely worried
Not very worried
Not at all worried
|Keyworker status||Due to the coronavirus (COVID-19) pandemic, have you been given “key worker” status?||Yes
|Impacts||Which areas of your life are being affected by the coronavirus (COVID-19) pandemic?|
Please select all that apply.
My household finances
My caring responsibilities
My access to groceries, medication or essentials
Other (please specify)
Using these questions
These questions can either be added to a wider block of questions exploring the coronavirus, or asked on their own.
If the diagnosis questions are used, they should be used together, because the output of whether someone “has” or “has not” got the coronavirus is based on combining responses to both questions.
Types of data collection this principle is suitable for
These questions are based on variables used in both interviewer-administered and self-complete survey modes.
Presenting and reporting the data (outputs)
|Diagnosis||Sum unique “yes” values to the two diagnosis questions to output probable cases of the coronavirus.
Sum unique “no” values to the two diagnosis questions to output probable non-cases of the coronavirus.
Responses of “yes” to question one and “no” to question two output probable asymptomatic cases.
|Worry||Sum of each response option outputs levels of self-reported worry.|
|Keyworker status||Sum of each response option outputs levels of self-reported key workers.|
|Impacts||Sum of each response option outputs self-reported levels of each domain impacted.
Only output responses under “other” once they have been aggregated or coded to different domains. Do not publish the free text that respondents provide.
From use on government surveys, we know that respondents include both positive and negative impacts when responding to this question. Because of this, outputs from this question should be reported as areas affected, not areas negatively affected.
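The output rules in the table above can be sketched in code. The following is a minimal illustration, not an official derivation: the function names (`classify_diagnosis`, `summarise_diagnosis`, `tally_impacts`) and the encoding of answers as the strings "yes" and "no" are assumptions made for the example. It also assumes that a "probable non-case" means a respondent who answered "no" to both diagnosis questions, and that a respondent who answered "yes" to either question is counted once as a probable case.

```python
from collections import Counter

VALID_ANSWERS = {"yes", "no"}  # assumed encoding of the response options


def classify_diagnosis(diagnosed, symptoms):
    """Return output labels for one respondent.

    diagnosed: answer to question one (officially diagnosed)
    symptoms:  answer to question two (symptoms since January 2020)
    """
    for answer in (diagnosed, symptoms):
        if answer not in VALID_ANSWERS:
            raise ValueError(f"expected 'yes' or 'no', got {answer!r}")

    labels = []
    if diagnosed == "yes" or symptoms == "yes":
        # Counted once, even when both answers are "yes".
        labels.append("probable case")
    else:
        # Assumption: "no" to both questions.
        labels.append("probable non-case")
    if diagnosed == "yes" and symptoms == "no":
        labels.append("probable asymptomatic case")
    return labels


def summarise_diagnosis(responses):
    """Count output labels over (diagnosed, symptoms) pairs."""
    counts = Counter()
    for diagnosed, symptoms in responses:
        counts.update(classify_diagnosis(diagnosed, symptoms))
    return counts


def tally_impacts(selections):
    """Count how many respondents selected each impact domain.

    selections: one collection of selected options per respondent.
    Each option is counted at most once per respondent.
    """
    counts = Counter()
    for selected in selections:
        counts.update(set(selected))
    return counts


if __name__ == "__main__":
    # Made-up responses, for illustration only.
    example = [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("no", "no")]
    print(summarise_diagnosis(example))
    print(tally_impacts([
        ["My household finances", "My caring responsibilities"],
        ["My household finances"],
    ]))
```

Probable asymptomatic cases are a subset of probable cases in this sketch, so the label counts should not be added together to give a respondent total. Free-text “other” responses would need to be coded to domains before being passed to a tally like this.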
Guidance for Devolved Administrations
Different policies across the UK nations may affect the outputs from these questions.
For example, higher numbers of key workers may be down to a broader definition of what a key worker is, and higher levels of diagnosis may be a result of different policies on testing.
In assessing comparability of statistics on the coronavirus, we have found two domains that benefit from extra guidance: key worker status and diagnosis.
Key worker status
Why collect this data?
The purpose of the key worker status variable is to ascertain which workers’ children are still permitted to attend school while schooling provision is restricted as a result of the coronavirus.
“Key worker” is the most common term used in the UK according to Google data, but the phrase “critical worker” is also sometimes used.
Geographical comparisons of key workers
The central UK government has provided a definition of a key worker that is based on sectors but also includes people “if [their] work is critical to the COVID-19 response”.
However, the definition varies slightly between the UK nations, with Scotland and Northern Ireland noting that it is flexible. Varied definitions across the UK may mean that UK-wide data does not always capture the same thing in each nation, and as such may not be geographically comparable.
Self identification of key workers
Because there is flexibility in definitions, outputs based on occupation or industry may not be comparable to outputs based on self-identification as a key worker. Those who self-identify as a key worker are likely to be acting as though they are a key worker (for example, going to work) whether or not they meet industry and occupation definitions. This means that, to understand service provision needs, capturing the number of people who self-identify as a key worker is the most beneficial approach.
Surveys help us estimate cases
Without testing, we cannot know exactly how many people have the coronavirus. This means survey data on the topic is an estimate, and variance is to be expected.
Comparing survey data and test data
Survey questions are unlikely to be comparable to test data except in studies that use both survey and biological data.
One reason is that testing figures will miss cases because tests are only provided to a subset of people.
Another reason is that survey questions which rely on self-reported symptoms will miss asymptomatic cases.
This means that testing data has higher accuracy, but survey data has more representative coverage. Which of these is more appropriate to use will vary with the situation.
When comparing data on prevalence of the coronavirus, it is important to also understand whether the data is reporting new cases, current cases or cumulative cases.
Cumulative survey data relates to questions that ask whether someone has had the coronavirus at all, which provides data that cannot be compared to questions asking about whether someone currently has the coronavirus.
The Department of Health and Social Care and Public Health England have a live tracker for both cumulative and new cases. As this is based on testing data, which is only available for a specific subset of people, it should not be compared to survey data that aims to achieve a representative sample.
Levels of prevalence will also vary based on levels of testing. As such, when levels of testing are known to vary between samples, this should be noted when comparing outputs.
To aid harmonisation, we recommend adopting other harmonised principles that may be relevant in data collection during this time, such as:
- Demographic information
- Personal wellbeing
- General health
- Unpaid care
- Long lasting health conditions and illnesses; impairments; and activity restriction
- Employment variables
- Access to the internet – coming soon
Before collecting further data on this topic, we also suggest looking at information that has already been published.
We are always interested in hearing from users so we can develop our work. If you use or produce statistics based on this topic, please get in touch: firstname.lastname@example.org.
This guidance will be reviewed regularly.