GSS Quality Strategy case studies

These case studies provide examples of successful improvements to the quality of GSS statistics.

They are grouped by the goals in the GSS Quality Strategy that they most closely align with. They have been sourced from GSS Quality Champions and the OSR review on the State of the UK’s Statistical System. Additional case studies related to quality can also be found on the OSR Code of Practice for Statistics pages.

The case studies help provide evidence of implementing the GSS Quality Strategy and delivering the overall aim to “improve statistical quality across the GSS”.

For further information on any of these case studies, please email gsshelp@statistics.gov.uk.

We will add more case studies as they become available.

We will all understand the importance of our role in producing high quality statistics

Welsh Government have clear processes and procedures in place when it comes to their quality management. This approach is set out in their Statistical Quality Management Strategy.

One aspect of their quality management is their Statistical Quality Committee, which meets quarterly. The committee comprises a “quality” representative from each official statistics producing team and is always chaired by the Head of Profession or one of her deputies. The terms of reference for this committee set out its purpose and role. This includes: reviewing Quality Incident reports; providing a challenge function for quality assurance processes; sharing best practice; providing training; and reviewing emerging Office for Statistics Regulation assessment findings. Having this framework in place not only provides a structure for quality management but is also a method for getting quality on the agenda and bringing it to the forefront of colleagues’ minds.

In 2019, Her Majesty’s Revenue and Customs (HMRC) invited the Office for Statistics Regulation (OSR) to review the quality of its official statistics, as part of its response to finding a significant error in its Corporation Tax statistics. OSR published the findings of its review in April 2020. The review contained nine recommendations, including producing process maps of end-to-end processes, developing reproducible analytical pipelines and reducing HMRC’s suite of publications.

HMRC designed a programme of work to implement the recommendations in 2021 to 2022 and beyond. Much has already been achieved. They have:

  • developed action plans and policy frameworks;
  • reviewed official statistics publications, consulted external users and published plans for reducing statistics publications and for quality improvements;
  • developed improved quality assurance guidance;
  • actively promoted a culture of data quality management, including development of a user engagement strategy to keep publications relevant and informative;
  • recruited new people to the eight new posts created under Spending Review 20;
  • developed knowledge-sharing networks and held a ‘RAP’ (reproducible analytical pipelines) half-day conference;
  • piloted ‘process mapping’ to understand production processes and identify risks, and agreed future plans.

The actions of HMRC statistics producers, leaders, the OSR review and ongoing support from OSR have all contributed to promoting the importance within HMRC of access to data for analysts and of understanding the nature and quality of the data available.

You can read more in this OSR case study.

The Office for National Statistics (ONS) have clear governance on quality to ensure everyone across the organisation understands their role in producing high quality statistics. A key component of this is the new ONS Quality Committee, which meets monthly and is chaired by the Deputy National Statistician. Deputy Directors (DDs) within ONS have primary responsibility for the quality of statistical outputs produced by their division. The role of the Quality Committee is to challenge the way in which quality is managed by DDs within their divisions and to provide an overall approach that allows the quality of statistics to be periodically tested.

The Quality Committee have launched several initiatives to drive improvements in the quality of ONS statistics. One is a self-assessment tool called the Statistical Quality Maturity Model (SQMM), which involves an assessment of the quality of all regular ONS experimental, official and National Statistics outputs, as well as divisional-level assessments of the culture of quality within ONS divisions. The data collected provide a rich source of information about the quality of ONS statistics. They help identify cross-cutting quality issues across the organisation and are used by statistical output areas to outline the actions they need to take to improve the quality of their statistics. In addition, the Committee have launched a programme of quarterly quality deep dives into sets of statistics. These reviews look at the full data journey, from collection of the data to publication of the statistics, and aim to pick up any risks to quality. Practical recommendations are identified to improve quality and minimise the chance of errors occurring in future releases. So far, deep dives on trade statistics and GDP statistics have been completed, with work plans in place to implement the quality improvement actions.

In addition, an ONS quality champions network has been established. Quality champions act as a central point of contact for advice on quality within divisions, and the network shares best practice across the organisation on quality assurance, learning from errors and reporting on quality. A new “quality statistics in government” e-learning course has also been launched and is being widely used by ONS staff.

This work has created real momentum in ONS and pushed quality high up the agenda. It is now clear to everyone across the organisation that they all have a role to play in improving the quality of ONS statistics. An improved culture is also starting to spread across the organisation, with reporting of quality concerns viewed as a strength and not a weakness. The view is that it would be far worse to ignore quality concerns that then become errors which could damage the reputation of ONS. In addition, a growth mindset approach has been adopted, with rare errors in ONS statistics viewed as an opportunity to learn and improve the statistics rather than a reason to blame individuals for what went wrong.

Further information on the ONS approach to improving statistical quality can be found in the ONS Statistical Quality Improvement Strategy.

We will ensure our data are of sufficient quality and communicate the quality implications to users

In 2019, the Quality Centre (now the Data Quality Hub) was commissioned by the Department for Levelling Up, Housing and Communities (DLUHC, formerly known as the Ministry of Housing, Communities and Local Government or MHCLG) to review the quality assurance (QA) processes in place for the rough sleeping statistics. This review aimed to identify the main strengths of current processes and to make recommendations for change and improvement.

Having met with DLUHC colleagues involved in producing rough sleeping statistics, the Quality Centre identified four key recommendations relating to the rough sleeping statistics publication itself as well as six additional recommendations to improve the QA processes. These included improving documentation, introducing Reproducible Analytical Pipeline (RAP) processes and using SQL.

The DLUHC team implemented the recommendations to great effect. The rough sleeping statistics published in February 2020 and 2021 included:

  • An HTML bulletin, which is clear and easy to navigate for users
  • An HTML technical report, which provides comprehensive information on how the statistics are produced and quality assured – this helps users to understand the quality of the statistics
  • An interactive dashboard, which enables users to explore the rough sleeping data and filter by year, region, and local authority

You can read more in this blog.

The Office of Rail and Road (ORR) produces statistics on the number of rail passengers using each mainline station in Great Britain.

In 2020, these statistics were designated as National Statistics by the United Kingdom Statistics Authority. ORR worked collaboratively with the Office for Statistics Regulation, which assessed the statistics, to make a number of improvements to their quality and to how quality implications are communicated to users. Improvements included an extended statistical release with more information on the quality and limitations of the statistics. More information on any limitations was also published alongside each station’s usage estimate in their data tables and interactive dashboard. Further detail was published in a new quality and methodology report. ORR also introduced a new infographic for users on how the statistics can and can’t be used. Engagement with both suppliers and users was stepped up during 2020, including sharing draft estimates to improve quality assurance and to gain further insight into large or unexpected changes in usage at some stations.

The Office of Rail and Road (ORR) produces statistics on delay compensation claims and passenger rail service complaints.

ORR undertakes continuous proactive engagement and holds an annual workshop with the data suppliers (train operators) of complaints and delay compensation data. This communication has helped to strengthen both ORR’s and train operators’ understanding of how and why the data are collected. It has also ensured that data are provided to ORR on a consistent basis by all 23 operators, improving the quality of the data received. In addition, ORR requires each train operator to sign a letter at the outset of each reporting year to provide assurance that the data they provide have been submitted following the consistent approach set out in the ORR ‘Core Data’ guidance and are an accurate reflection of their performance. This guidance lists the checks ORR conducts on the data supplied, so operators can review their data against these checks prior to submission. This reduces the number of data queries raised by ORR, and therefore the need for resubmission.

We will anticipate emerging trends and changes and prepare for them using innovative methods

Each month, the UK House Price Index (HPI) presents a first estimate of average house prices in the UK based on the available sales transactions data for the latest reference period. The first estimate is then updated in subsequent months as more sales transaction data become available. In March 2017, there was a large increase in the number of revisions between first and subsequent estimates. This negatively affected some users’ confidence in UK HPI.

After investigating, the Office for National Statistics (ONS) established that the revisions were being driven by volatility in new build property prices, compounded by an operational backlog in Her Majesty’s Land Registry registering new build sales transactions. Steps were taken to improve the methods by changing the calculation for the first estimate to reduce its sensitivity to the impact of new build transactions. The approach was developed by GSS methodologists, and several options were tested before a final one was chosen.

As a result, revisions to the first estimate of the UK HPI annual change in average house prices have become smaller and more stable over time. This is an example of where an external change called for innovative methods to be developed to improve the quality of the statistics. Further information on this case study can be found in the Code of Practice for Statistics: Q2 case study.
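One simple way to summarise the scale of revisions discussed above is the mean absolute revision between the first and a later estimate. The sketch below is illustrative only: it is not the ONS method, the function name `mean_absolute_revision` is hypothetical, and the figures are invented for the example.

```python
from statistics import mean
from typing import List


def mean_absolute_revision(first: List[float], latest: List[float]) -> float:
    """Average absolute gap, in percentage points, between first and latest estimates."""
    if len(first) != len(latest):
        raise ValueError("estimate series must be the same length")
    return mean(abs(later - earlier) for earlier, later in zip(first, latest))


# Hypothetical annual-change estimates (%) for four reference months.
# These are made-up numbers, not UK HPI data.
first_estimates = [4.0, 5.0, 3.0, 4.0]
latest_estimates = [5.0, 4.5, 4.0, 4.5]

revision = mean_absolute_revision(first_estimates, latest_estimates)
print(f"Mean absolute revision: {revision:.2f} percentage points")
```

Tracking a summary like this over time would show whether a change to the first-estimate calculation has reduced the typical size of revisions.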

The Consumer Price Index (CPI) measures aggregate price change of consumer goods and services. The Prices Alternative Data Sources project is exploring using modern data sources for compiling CPI. Some of these datasets are much larger than those used traditionally. For example, clothing data are being web scraped from several retailers, creating a dataset composed of approximately 500,000 unique products per month. These products cannot be individually scrutinised in the way that items in the traditional sample are, and so automated methods of classification are required. Researchers are investigating the use of innovative methods such as supervised machine learning for classifying products into types such as women’s t-shirts and boys’ trousers. Further information can be found in this Office for National Statistics report.
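To illustrate what supervised classification of product descriptions involves, here is a minimal sketch using a multinomial naive Bayes classifier written in plain Python. The source does not say which algorithm or features the researchers use, so the choice of naive Bayes, the function names `train` and `classify`, and the handful of labelled examples are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def train(examples: List[Tuple[str, str]]):
    """Fit a multinomial naive Bayes model from (description, label) pairs."""
    label_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(text: str, model) -> str:
    """Return the most probable label for a product description."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = "", -math.inf
    for label, count in label_counts.items():
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(count / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled examples; real training data would be far larger.
training_data = [
    ("womens cotton t-shirt white", "womens t-shirts"),
    ("ladies v-neck t-shirt", "womens t-shirts"),
    ("boys school trousers grey", "boys trousers"),
    ("boys slim fit trousers", "boys trousers"),
]
model = train(training_data)
predicted = classify("boys grey cargo trousers", model)
print(predicted)
```

The model learns from descriptions that people have already labelled and then assigns a type to unseen products, which is what makes the approach workable for hundreds of thousands of items a month.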

Until March 2020, the Crime Survey for England and Wales (CSEW) ran as a face-to-face survey. The advent of the COVID-19 pandemic meant this had to be suspended, so the Centre for Crime and Justice (CCJ) in the Office for National Statistics (ONS) worked to establish a telephone survey to replace the face-to-face survey. This Telephone-operated Crime Survey for England and Wales (TCSEW) launched in May 2020, with the crime statistics based on data from the TCSEW published as experimental statistics to reflect the change in methodology.

Changes to sample design and the number of questions asked were just some of the adaptations that had to be made at short notice, and the ONS has been proactive in communicating these changes to the survey. They published a report on the comparability of TCSEW data with face-to-face CSEW data, helping users to understand what comparisons can be made.

Despite the many challenges, over 36,000 interviews took place over the survey period. This new approach, developed swiftly in response to the pandemic, has been endorsed by the Office for Statistics Regulation following a rapid review. Read Ed Humpherson’s letter on the survey.

The Office for National Statistics (ONS) launched the Coronavirus (COVID-19) Infection Survey in swift response to the pandemic, just weeks after the first UK lockdown in March 2020. Over the following months, ONS increased the study from a survey of around 28,000 people in England to over 150,000 people from across the UK by October 2020. It is the largest and only representative survey of COVID-19 infections in the community and follows up participants for up to 16 months. The survey provides high-quality estimates of the percentage of people testing positive for coronavirus and for antibodies against coronavirus. As such, these statistics provide vital insights into the pandemic for a wide range of users, including government decision-makers, scientists, the media and the public, and are essential for understanding the spread of the virus, including the new variants.

Of particular note was the speed at which resources were reprioritised within ONS to allow staff to work on the survey, and the strong working relationships established between ONS analysts, the analytical teams across the devolved nations, the survey contractor IQVIA, and the academic partners at the Universities of Oxford and Manchester. ONS both responded to user needs (for example, by acting on user requests) and proactively anticipated what would be of interest in the future and should be included in the survey (for example, cases of long COVID or statistics on antibodies and vaccinations). This work won the collaboration award as part of the Analysis in Government Awards 2020.

We will implement automated processes to make our analysis reproducible

The Department for Transport (DfT) strongly encourage the use of reproducible analytical pipelines (RAP), and statistics teams across the department now use RAP in a number of different publication processes. Teams have developed automated validation checks and validation dashboards, as well as quality metrics, to quality assure publication data. Some teams have also produced automated summary QA notes so that managers can see the latest trends and sign off the data. In most cases where it is feasible, teams have automated the production of spreadsheet tables for publication, further reducing the risk of human error. The department are now turning their attention to producing automated statistical releases in HTML, as well as automating charts for the releases.
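As an illustration of the kind of automated validation check described above, the following is a minimal sketch in Python. It is not DfT’s code: the function name `validate_rows`, the field names (`region`, `year`, `journeys`), the 50% tolerance and the sample rows are all hypothetical.

```python
from typing import Dict, List

# Hypothetical publication data: one row per region and year.
Row = Dict[str, object]


def validate_rows(rows: List[Row], max_change: float = 0.5) -> List[str]:
    """Return a list of human-readable validation failures (empty if all pass)."""
    failures: List[str] = []
    seen = set()
    previous: Dict[str, float] = {}

    for row in sorted(rows, key=lambda r: (r["region"], r["year"])):
        key = (row["region"], row["year"])
        value = row["journeys"]

        # Check 1: no duplicate region/year combinations.
        if key in seen:
            failures.append(f"duplicate row for {key}")
        seen.add(key)

        # Check 2: values must be present and non-negative.
        if value is None or value < 0:
            failures.append(f"missing or negative value for {key}")
            continue

        # Check 3: flag large year-on-year changes for manual review.
        last = previous.get(row["region"])
        if last and abs(value - last) / last > max_change:
            failures.append(f"change above {max_change:.0%} for {key}")
        previous[row["region"]] = value

    return failures


# Invented sample data for the example.
rows = [
    {"region": "North", "year": 2019, "journeys": 100},
    {"region": "North", "year": 2020, "journeys": 40},
    {"region": "South", "year": 2019, "journeys": 200},
    {"region": "South", "year": 2020, "journeys": None},
]
failures = validate_rows(rows)
for failure in failures:
    print(failure)
```

Returning a list of failures, rather than stopping at the first one, means the same output could feed a validation dashboard or a summary QA note for sign-off.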

To support this work there is a RAP committee, which is made up of representatives from across the different statistics divisions in the department. The committee have compiled a number of resources, such as an online codebook to support coding, and have produced a checklist for teams to use to ensure that RAP projects are developed to a suitable standard (for example, appropriate peer review, correct folder set-up and good documentation). The committee also organise regular RAP user group meetings where statisticians are invited to present their latest RAP projects and bring any queries or troubleshooting requests to the group. The committee also invite presenters from other departments to encourage innovation and new thinking within the department.

In 2019, the Centre for Crime and Justice (CCJ) began transformation plans to automate the most repetitive, resource-intensive elements of the Crime Survey for England and Wales (CSEW) statistical pipeline, with the aim of transforming how outputs and tables were produced.

In March 2020, the CCJ began work with the Best Practice and Impact Division (BPI) on automating their Nature of Crime tables. The team built their core knowledge of coding tools and made use of bespoke “just-in-time” learning facilitated by BPI, undergoing training sessions on new concepts and unfamiliar tasks as they arose.

The project has developed the team’s coding skills and understanding of best practice, and has improved their production process: what was once a three-week sprint for 13 analysts is now just a few hours’ work for two members of the RAP team, saving 1,500 person hours. It has also helped to reduce errors, thereby reducing the number of revisions made to National Statistics.

The 100+ fully formatted tables were published in September 2020. They were created through reproducible analytical pipelines (RAP) in R and Python, and the code is available on the CCJ GitHub. The team plan to use their learning and apply this approach to more of their outputs going forward.

You can read more in this blog.
