Coronavirus (COVID-19) harmonisation guidance

This harmonisation guidance is under development. The questions presented here have been developed to collect data about the impact of the coronavirus (COVID-19) pandemic in a harmonised way. They have been adapted from government surveys. This process has involved changing the reference period and removing overly specific response options. Some wording has also been changed based on established questionnaire design principles and use of a consistent style guide.

The need for rapid development has meant that these questions have not been cognitively tested. Their development has been based on best practice principles and iterated against the available evidence. Although designed in accordance with best practice, these questions are experimental and may change as priorities evolve.

Policy details

Metadata item: Details
Publication date: 20 May 2020
Author: Rhys Fletcher
Approver: Sofi Nickson
Who this is for: Users and producers of statistics
Type: Harmonisation standards and guidance
Contact: gsshelp@statistics.gov.uk

What is harmonisation?

Harmonisation is the process of making statistics and data more comparable, consistent and coherent. Harmonised standards set out how to collect and report statistics to ensure comparability across different data collections in the Government Statistical Service (GSS). Harmonisation produces more useful statistics that give users a greater level of understanding.

To collect data about the impact of the coronavirus (COVID-19) pandemic, we are proposing a harmonised set of questions. Because they have not been tested, they should be considered under development and not a full harmonised standard.

What do we mean by the coronavirus?

Coronaviruses are a family of viruses that cause disease in people and animals. They can cause the common cold or more severe diseases, such as COVID-19.

COVID-19 refers to the “coronavirus disease 2019” and is a disease that can affect the lungs and airways. It is caused by a type of coronavirus. This set of harmonised questions relates to COVID-19, and refers to this as “the coronavirus” in line with the Office for National Statistics’ style guide.

Questions and response options (inputs)

The harmonised questions on this topic are designed to collect basic information, for use in the majority of surveys.

The choice of variables is based on priorities identified across government and how appropriate they would be to harmonise. They are not designed to replace questions used in specialist surveys where more detailed analysis is required.

Variable: Diagnosis and symptoms (these two questions are to be used together - see the "Using these questions" section for more information)
Question: Have you been officially diagnosed with the coronavirus (COVID-19)?
Response options:
  • Yes
  • No
  • Don't know

Variable: Diagnosis and symptoms (these two questions are to be used together - see the "Using these questions" section for more information)
Question: Since January 2020, have you had coronavirus (COVID-19) symptoms? (Symptoms can include: a high temperature; a new continuous cough; or a loss or change to your sense of smell or taste).
Response options:
  • Yes
  • No
  • Don't know

Variable: Worry
Question: How worried, if at all, are you about the coronavirus (COVID-19) pandemic?
Response options:
  • Extremely worried
  • Very worried
  • Somewhat worried
  • Not very worried
  • Not at all worried
  • Don't know

Variable: Keyworker status
Question: Due to the coronavirus (COVID-19) pandemic, have you been given “key worker” status?
Response options:
  • Yes
  • No
  • Don't know

Variable: Impacts
Question: Which areas of your life are being affected by the coronavirus (COVID-19) pandemic? Please select all that apply.
Response options:
  • My health
  • My work
  • My education
  • My household finance
  • My well-being
  • My caring responsibilities
  • My relationships
  • My access to groceries, medication or essentials
  • Other (please specify)

Variable: Social distancing
Question: In the last seven days, have you come closer than two metres (around three steps) to anyone that doesn’t live in your extended household?
Response options:
  • Yes
  • No
  • Don't know

Using these questions

Question placement

These questions can either be added to a wider block of questions exploring the coronavirus, or asked on their own.

If the diagnosis questions are used, they should be used together, because the output of whether someone “has” or “has not” got the coronavirus is based on combining responses to both questions.

Types of data collection this standard is suitable for

These questions are based on variables used in both interviewer administered and self-complete survey modes.

Using this question in the Welsh language

This harmonised standard was designed in the English language. At present we do not provide a Welsh language translation, as user demand for this standard is UK-wide and Welsh language testing has not been completed to ensure a translation is comparable and appropriate. Harmonised standards based on Census research have been tested in the Welsh language, which is why we are able to provide Welsh versions of them. If you are interested in using a Welsh language version of a harmonised standard that has not been translated, please contact us: gsshelp@statistics.gov.uk.

Presenting and reporting the data (outputs)

Variable: Diagnosis
Output: Sum unique “yes” values to the two diagnosis questions to output probable cases of the coronavirus.
Sum unique “no” values to the two diagnosis questions to output probable non-cases of the coronavirus.
Responses of “yes” to question one and “no” to question two output probable asymptomatic cases.

Variable: Worry
Output: Sum of each response option outputs levels of self-reported worry.

Variable: Keyworker status
Output: Sum of each response option outputs levels of self-reported key workers.

Variable: Impacts
Output: Sum of each response option outputs self-reported levels of each domain impacted.
Only output responses under “other” once aggregated or coded to different domains. Do not publish free text respondents provide.
From use on government surveys we know respondents are including both positive and negative impacts when responding to this question. Because of this, outputs from this question should be reported as areas affected not areas negatively affected.

Variable: Social distancing
Output: Sum unique "yes" values to output levels of self-reported social distancing violations.
Sum unique "no" values to output levels of self-reported social distancing adherence.
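As an illustration only, the diagnosis output rules could be sketched in Python as below. This is a minimal sketch under stated assumptions, not a prescribed method: it assumes that "unique" means each respondent is counted once, that a probable case is a "yes" to either diagnosis question, and that a probable non-case is a "no" to both. The function name, the category labels and the "unclassified" group for remaining "Don't know" combinations are hypothetical and do not appear in the guidance.

```python
from collections import Counter


def summarise_diagnosis(responses):
    """Count diagnosis outputs from (diagnosed, symptoms) answer pairs.

    Assumptions (not stated in the guidance):
    - each respondent is counted once per output category
    - "Yes" to either question counts as a probable case
    - "No" to both questions counts as a probable non-case
    - any other combination is left unclassified
    """
    counts = Counter()
    for diagnosed, symptoms in responses:
        if "Yes" in (diagnosed, symptoms):
            counts["probable_case"] += 1
            # Diagnosed but reporting no symptoms: probable asymptomatic case.
            if diagnosed == "Yes" and symptoms == "No":
                counts["probable_asymptomatic_case"] += 1
        elif diagnosed == "No" and symptoms == "No":
            counts["probable_non_case"] += 1
        else:
            counts["unclassified"] += 1
    return counts


# Hypothetical responses, for illustration only.
example = [
    ("Yes", "Yes"),
    ("Yes", "No"),
    ("No", "Yes"),
    ("No", "No"),
    ("No", "Don't know"),
]
summary = summarise_diagnosis(example)
```

Under these assumptions, probable asymptomatic cases are a subset of probable cases, so the categories should not be added together to give a total.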

Comparability

Guidance for Devolved Administrations

Wherever possible, we aim to create questions that work for each of the four nations. However, health in the UK is a devolved issue, which means that England, Northern Ireland, Scotland and Wales have taken responsibility for their own response to the pandemic and subsequent policies relating to public health.

This approach applies to the way lockdown restrictions have been eased across the UK, with devolved countries pursuing their own individual strategies on measures such as social distancing and extended households. It is therefore important to consider whether survey respondents are aware of these differences and how this might affect responses.

Different policies across the UK countries may also affect comparability of the outputs from these questions. For example, higher numbers of key workers may be down to a broader definition of what a key worker is, and higher levels of diagnosis may be a result of different policies on testing.

In assessing comparability of statistics on the coronavirus, we have found two domains that benefit from extra guidance: key worker status and diagnosis.

Key worker status

Why collect this data?

The purpose of the variable key worker status is to ascertain which workers’ children are still permitted to attend school during a time of restricted schooling provision as a result of the coronavirus.

Terms used

“Key worker” is the most common term used in the UK according to Google data, but the phrase “critical worker” is also sometimes used.

Geographical comparisons of key workers

The central UK government has provided a definition of a key worker that is based on sectors but also includes people “if [their] work is critical to the COVID-19 response”.

However, the definition varies slightly in the UK nations, with Scotland and Northern Ireland noting that it is flexible. Varied definitions across the UK may mean that UK-wide data is not always capturing the same thing in each nation, and as such may not be geographically comparable.

Self-identification of key workers

Because there is flexibility in definitions, outputs based on occupation or industry may not be comparable to outputs based on self-identification as a key worker. Those who self-identify as a key worker are likely to be acting as though they are a key worker (for example going to work) whether or not they meet industry and occupation definitions. This means that to understand service provision needs, capturing the number of people who self-identify as a key worker is the most beneficial.

Diagnosis

Surveys help us estimate cases

Without testing, we cannot know exactly how many people have the coronavirus. This means survey data on the topic is an estimate, and variance is to be expected.

Comparing survey data and test data

Survey questions are unlikely to be comparable to test data except in studies that use both survey and biological data.

One reason is that testing figures will miss cases because tests are only provided to a subset of people.

Another reason is that survey questions that rely on self-reported symptoms will miss asymptomatic cases.

This means testing data has higher accuracy, but survey data has more representative coverage. The decision of which of these is more appropriate for use will vary based on the situation.

Prevalence

When comparing data on prevalence of the coronavirus, it is important to also understand whether the data is reporting new cases, current cases or cumulative cases.

Cumulative survey data relates to questions that ask whether someone has had the coronavirus at all, which provides data that cannot be compared to questions asking about whether someone currently has the coronavirus.
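The difference between new, current and cumulative cases can be made concrete with a small sketch. The example below is illustrative only: the `Case` record, its `onset_day` and `recovery_day` fields, and the figures are hypothetical and are not drawn from any survey or tracker.

```python
from typing import NamedTuple, Optional


class Case(NamedTuple):
    """Hypothetical case record; days are counted from an arbitrary start."""
    onset_day: int
    recovery_day: Optional[int]  # None means the person has not yet recovered


def prevalence_measures(cases, day):
    """Return new, current and cumulative case counts for a reference day."""
    # New: cases that began on the reference day.
    new = sum(1 for c in cases if c.onset_day == day)
    # Current: cases that have begun and have not ended by the reference day.
    current = sum(
        1
        for c in cases
        if c.onset_day <= day
        and (c.recovery_day is None or day < c.recovery_day)
    )
    # Cumulative: every case that has begun on or before the reference day.
    cumulative = sum(1 for c in cases if c.onset_day <= day)
    return {"new": new, "current": current, "cumulative": cumulative}


# Hypothetical data, for illustration only.
cases = [Case(1, 5), Case(3, None), Case(6, 10), Case(6, None), Case(8, 12)]
measures = prevalence_measures(cases, day=6)
```

The same set of records gives three different figures for the same day, which is why outputs built on different measures should not be compared directly.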

The Department of Health and Social Care and Public Health England have a live tracker for both cumulative and new cases. As this is based on testing data, which is only available on a specific subset of people, it should not be compared to survey data that aims to achieve a representative sample.

Levels of prevalence will also vary based on levels of testing. As such, when levels of testing are known to vary between samples, this should be noted when comparing outputs.

Further information

Wellbeing and the coronavirus

Recent data from the Opinions and Lifestyle Survey found that the proportion of adults likely to be experiencing some form of depression during the coronavirus pandemic had almost doubled from before the pandemic (July 2019 to March 2020). Feeling stressed or anxious was the most common way adults experiencing some form of depression felt their well-being was being affected, with 84.9% stating this.

It is therefore recommended that if data collectors are interested in investigating this aspect of the pandemic then the harmonised standards on personal wellbeing should be used.

Social isolation

An area of interest in relation to the coronavirus pandemic is social isolation. Social isolation is different to physical isolation, as people may be physically isolated but still feel socially connected to others. The GSS Harmonisation loneliness standard includes a question on frequency of social isolation:

Question stem: How often do you feel isolated from others?
Response options:
  • Hardly ever or never
  • Some of the time
  • Often

And for children and young people the question is:

Question stem: How often do you feel alone?
Response options:
  • Hardly ever or never
  • Some of the time
  • Often

This question can be used alone to investigate frequency of feelings of social isolation, or alongside the other questions in the loneliness standard to investigate the wider concept.

Time series

Data can be compared over time to monitor change. This is sometimes called a time series. When there are changes in the way that data is collected we may say that the time series is broken. This means that the data before and after a point in time should not be compared to each other.

Due to the coronavirus pandemic, some surveys have had to change how they collect data. This includes moving data collection online, removing variables, or changing who is surveyed. All of these changes mean that data collected during the coronavirus pandemic may look different to data collected at other times and therefore should not be compared over time. Although the data should not be compared, this does not intrinsically mean that one set of data is higher quality than the other.

To mitigate the impact that breaks in the time series have, data collectors can execute a parallel run. This is where both the original and the new data collection methodologies are run at the same time in parallel. This provides data from both methods covering the same time period. These data can then be analysed for comparability, which can then inform decisions on how data can or cannot be compared over a time period. Without this it is not possible to know whether the data can be compared.
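One way the two sets of estimates from a parallel run might be compared is sketched below. This is an assumption for illustration, not a method set out in this guidance: it applies a pooled two-proportion z-test to a single headline proportion, treats both samples as simple random samples, and ignores survey weights and design effects, which a real analysis would need to account for. The function name and the figures are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes simple random sampling in both data collections.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("Proportions of exactly 0 or 1 cannot be tested this way")
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical parallel run: the same question asked by the original
# method (A) and the new method (B) over the same period.
z, p_value = two_proportion_z_test(successes_a=600, n_a=1000,
                                   successes_b=500, n_b=1000)
```

A small p-value would suggest that the two methods produce different estimates for that output, so the series should be treated as broken at the point of change. The test would need repeating for each output of interest, because headline and detailed estimates may behave differently.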

An example of managing a break in the time series comes from the Office for National Statistics (ONS). The UK Labour Force Survey has been running in some form since the 1970s, but in the process of a wider data collection transformation programme the ONS have commenced research into how data collection can be performed online using a new, prototype Labour Market Survey. To understand the impact of this, they completed a parallel run of core outputs, and published a comparative estimates report. The report notes that headline estimates show no significant differences, but that there were differences at lower levels. This might suggest that headline estimates can be compared over time but more detailed estimates should not be.

Relevant harmonised standards

To aid harmonisation, we recommend adopting other harmonised standards that may be relevant in data collection during this time, such as:

Before collecting further data on this topic, we also suggest looking at information that has already been published, for example:

Coronavirus (COVID-19) question bank

The ONS have recently compiled a bank of questions from surveys that ask questions related to coronavirus (COVID-19).

Coronavirus (COVID-19) question bank (XLSX, 280KB)

Surveys involved in development

This question bank has been developed using the following surveys:

  • COVID High Risk Group Insights Study (CEV)
  • COVID Test & Trace (T&T) Cases Insights Study
  • COVID Test & Trace (T&T) Contacts Insights Study
  • COVID-19 Infection Survey (CIS)
  • COVID-19 Schools Infection Survey (SIS)
  • Household Assets Survey (HAS)
  • International Arrivals Insights Survey (IAIS)
  • International Passenger Survey (IPS)
  • Labour Force Survey (LFS)
  • Labour Market Survey (LMS)
  • Living Costs and Food Survey (LCF)
  • Opinions and Lifestyle Survey (OPN)
  • Over 80s Vaccine Insights Study
  • Student Coronavirus Insights Study (SCIS)
  • Survey of Living Conditions (SLC)
  • Time Use Survey (TUS)

More information about these surveys can be found on the ‘Metadata’ tab of the spreadsheet. The question bank will be updated monthly to include new questions and surveys. If you have any questions about the question bank, please email question.bank@ons.gov.uk.

Intentions for use

The question bank is shared with two main intentions.

Provide a list of questions to be used in other surveys

The question bank and harmonised questions both cover similar topic areas, including impact on life, health and social contact.

When developing a new questionnaire, we recommend that you use the harmonised question first.

You should use the coronavirus (COVID-19) question bank if you need to harmonise a set of questions with a specific data source or assess how other surveys ask questions on topics not covered by the harmonised guidance.

Provide users with an understanding of what data the ONS has in relation to the coronavirus pandemic

This will allow specific analysis from us to be requested as not all data from questions asked on ONS surveys are published.

If you would like to request statistics or access the research data, please see requesting statistics for more information.

For the latest publication from these surveys, please look at the ‘Metadata’ tab of the bank.

Contact

We are always interested in hearing from users so we can develop our work. If you use or produce statistics based on this topic, please get in touch: gsshelp@statistics.gov.uk.

Review frequency:

This guidance will be reviewed regularly.

Updates

Date: 24 June 2021
Changes: Added the Coronavirus (COVID-19) Question Bank, compiled by ONS.

Date: 23 October 2020
Changes: The symptom question on diagnosis has been updated in line with changes to the UK Government official symptoms list for the coronavirus (COVID-19).

Date: 2 October 2020
Changes: A question on social distancing has been developed, as has further guidance on wellbeing, social isolation, time series, and use in devolved administrations.

Documents

File

Coronavirus_COVID19_Question_Bank (XLSX, 0.29MB)
