High quality insights from complex data sources

Natasha Bance


I started my career as a social epidemiologist nearly two decades ago. I soon spotted the opportunity to get involved with a brand-new birth cohort study. Over the years I’ve sat waiting for my Millennium Cohort Study babies to grow up. I’ve been eagerly pressing ‘refresh’ on the UK Data Service website every few years, ready to test out my latest ideas and hoping to provide new insights for social policy for the next generation.

While we have steadily discovered lots of things, the world of longitudinal research has faced an equally steady increase in data collection challenges in the new millennium.

Longitudinal data are an important resource for identifying cause and effect measured over the long term. But the coronavirus (COVID-19) pandemic showed how important administrative data can be, and how it can help direct policy responses in almost real-time.

The benefits and challenges of administrative data

Administrative data has some unique challenges, especially when it comes to quality. It is usually collected for a specific reason, like running an organisation or providing a service. If administrative data is used for any other purpose it needs to be quality assessed for this new purpose. Luckily there are tools to help with this, such as the Administrative Data Quality Assurance Toolkit.

But there are benefits too. Administrative data may not be built for research purposes like observational studies, but it typically offers generous sample sizes. You can find more information about administrative data sample sizes on the NHS Digital website.

And administrative data has spread – it’s everywhere in government! The challenge is finding it, bringing it together in a safe way within a trusted research environment, and understanding its quality. This is why the ONS has started an ambitious journey to create the Integrated Data Service (IDS). The IDS will be a secure platform to host administrative data from a wide range of government departments. It will make it quicker and easier for analysts to work together, and it will help improve the speed of decision-making across government.

The Integrated Data Analysis Team (IDAT)

In preparation for the launch of the IDS, the IDAT has brought together a diverse analytical team of:

  • social researchers
  • statisticians
  • economists
  • operational researchers
  • data scientists

The team works with colleagues across the ONS, other government departments, the Government Statistical Service (GSS) and the private sector. IDAT aims to develop analysis using administrative data and to provide high quality insights that inform cross-cutting policy areas. The team also provides feedback to the developers of the IDS to help them create a platform that meets the needs of analysts.

The team uses newly received administrative data to investigate a range of topics relevant to economic, social and environmental policy.

Ongoing projects within IDAT include:

  • analysis of the effect of childhood social care on educational attainment – this looks at the Growing up in England data from Census 2011 linked to the All Education Dataset for England
  • understanding links between educational attainment and contact with the criminal justice system in later life – this is based on Ministry of Justice information linked to education records
  • analysis of geographic mobility and earnings progression – this uses the Department for Work and Pensions’ Registration and Population Interaction Database (RAPID)
  • analysis of social effects on health and later routes through healthcare, using Census 2011 linked to Hospital Episode Statistics
  • understanding the causes of house price inflation in England, Wales and Scotland, linking land registry data to a range of open data on the social, economic and demographic characteristics of neighbourhoods

New opportunities with administrative data

Administrative data offers statistical researchers new opportunities for insight, along with new challenges. But it can be enhanced further: linking administrative sources to one another offers even more potential.
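To give a feel for what linkage involves, here is a minimal sketch in Python. It is not how the IDS or IDAT link data. It assumes the simplest possible case, where two sources share an exact, already pseudonymised identifier. Every record, field name and identifier below is invented for illustration, and `link_records` is a hypothetical helper rather than part of any real tool.

```python
# Illustrative only: all records, field names and identifiers are invented.
survey_records = [
    {"person_id": "a01", "age_at_survey": 17},
    {"person_id": "a02", "age_at_survey": 17},
    {"person_id": "a03", "age_at_survey": 18},
]
education_records = [
    {"person_id": "a01", "qualification_level": 3},
    {"person_id": "a03", "qualification_level": 2},
    {"person_id": "a99", "qualification_level": 3},
]


def link_records(survey, admin, key="person_id"):
    """Link survey rows to admin rows on an exact match of a shared key.

    Assumes each key appears at most once in the admin source; if a key
    were repeated, only the last admin row for that key would be used.
    """
    admin_by_key = {row[key]: row for row in admin}
    linked, unlinked = [], []
    for row in survey:
        match = admin_by_key.get(row[key])
        if match is None:
            unlinked.append(row)  # keep unmatched rows so they can be reviewed
        else:
            linked.append({**row, **match})  # combine fields from both sources
    return linked, unlinked


linked, unlinked = link_records(survey_records, education_records)

# Share of survey records that found a partner in the admin source
match_rate = len(linked) / len(survey_records)
print(f"Linked {len(linked)} of {len(survey_records)} survey records ({match_rate:.0%})")
```

Keeping the unlinked records, and reporting the match rate, matters as much as the linked output. If the people who fail to link differ systematically from those who do, any analysis of the linked data inherits that bias.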

Of course, I can’t forget about my Millennium babies. They’re all grown up now and by linking survey responses to administrative education, health or income records we can understand more about their lives. We can study the things that make their lives easier or more difficult. And we can see how their experiences, and their administrative data, can be used for the public good.

Dr Neil Smith
Dr Neil Smith is a social epidemiologist leading the Integrated Data Analysis Team (IDAT) in the acquisition, linkage and analysis of administrative data in the Analytical Hub.