Learning about Reproducible Analytical Pipelines (RAP): two weeks with the GSS Good Practice Team
My name is Kin-Chung OW and I have just completed a two-week secondment from the Department for Education to the GSS Good Practice Team (GPT), working on a case study on Reproducible Analytical Pipelines (RAP) in R.
If you are not familiar with RAP, do not worry! You are not alone.
This was an area I had very little knowledge of before I started my secondment.
The good news is that I was able to learn a tremendous amount in a very short period of time.
And so can you if you are interested in this area of work.
Some background on RAP
RAP is an innovative approach to creating analytical products that can be easily reproduced, tested and audited in real time. It can be a very effective tool to automate the development, presentation and quality assurance of statistical bulletins.
This idea was first piloted in the GSS in 2017 by colleagues in the Department for Digital, Culture, Media & Sport (DCMS) and the Department for Education (DfE), collaborating with data scientists from the Government Digital Service (GDS) to automate the production of a statistical bulletin.
Where should I start?
A good starting point for learning about RAP is to review existing literature and blogs.
I would recommend, in particular, having a look at these resources:
- An amusing and suitably titled blog from the Director General of the Office for Statistics Regulation, Ed Humpherson: A robot by any name?, explaining clearly what RAP is for a non-technical audience
- An article from Matt Upson from the Government Digital Service, which I found very insightful and helpful for gaining an initial overview of what these pipelines are and how they work
- Discussion threads available online emphasising how beneficial RAP could be for official statistics production, e.g. Transforming the process of producing official statistics
I would also encourage attending presentations and seminars. During my secondment I was lucky to hear about the experience of using RAP directly from government analysts. This enabled me to ask questions and understand how this tool has been applied in practice.
Completing this online RAP training course, whilst seeing some of the work of GDS shared on GitHub, allowed me to fully appreciate the power of this tool and start practising with some of the packages.
Playing with the data has made me realise that application of RAP is not just for the production of statistical bulletins.
This technique is very flexible and can certainly be adapted to solve different analytical problems quickly, as it enables automated processing, quality assuring and reporting of information.
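To make the three steps above (processing, quality assuring and reporting) a little more concrete, here is a minimal sketch in base R. It is only an illustration, not the pipeline used by any department: the data, the column names and the function names `check_data`, `summarise_by_region` and `write_report` are all made up for this example.

```r
# Hypothetical example data. In a real pipeline this would be read
# from a file or a database rather than typed in by hand.
raw_data <- data.frame(
  region = c("North", "North", "South", "South"),
  pupils = c(120, 80, 150, 50),
  passed = c(90, 60, 105, 35)
)

# Quality assurance: stop the pipeline if the data break basic rules.
check_data <- function(df) {
  stopifnot(
    !any(is.na(df)),
    all(df$pupils > 0),
    all(df$passed >= 0),
    all(df$passed <= df$pupils)
  )
  df
}

# Processing: aggregate to one row per region and derive a pass rate.
summarise_by_region <- function(df) {
  totals <- aggregate(cbind(pupils, passed) ~ region, data = df, FUN = sum)
  totals$pass_rate <- round(100 * totals$passed / totals$pupils, 1)
  totals
}

# Reporting: turn each row of the summary into a sentence.
write_report <- function(summary_table) {
  sprintf(
    "In %s, %.1f%% of %d pupils passed.",
    summary_table$region, summary_table$pass_rate, summary_table$pupils
  )
}

# Run the whole pipeline from raw data to report text in one go.
report <- write_report(summarise_by_region(check_data(raw_data)))
cat(report, sep = "\n")
```

Because every step is a function, re-running the script on next year's data reproduces the same checks and the same report text without any manual copying and pasting. If a check fails, the script stops before anything is reported.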
The training made me think about application to my job at DfE, where I analyse and report on performance data on the national curriculum.
I think this approach could be beneficial to many other organisations, but there are some challenges…
Like any new approach, RAP requires investing time and resources to learn the new techniques and changing the way in which organisations operate.
Logistics and technology can also be tricky. Embedding RAP in the work of organisations involves lots of testing, experimentation and identification of the right projects in which to use RAP.
However, current initiatives provide evidence that these challenges can be overcome!
Organisations such as the Cabinet Office, DCMS, DfE and the Ministry of Justice have been promoting RAP for quite some time, and there are already some excellent success stories.
Discussions are currently taking place at a senior level (the Heads of Profession for Statistics had RAP as an agenda item at one of their regular meetings in December).
The Office for National Statistics (ONS) are currently working with others on the GSS Data Project (see the blog GangStaS Data RAP), creating some useful infrastructure that will benefit the development of RAP.
In the next few months, the GSS Good Practice Team will be coordinating a cross-government approach to scale up support for RAP, supported by GDS, the ONS Data Science Campus, and RAP developers across the GSS (see the news article RAP Champions meet for the first time).
So watch this space! The use of RAP is going to grow!
Is it worth investing some time learning about RAP?
In my opinion, YES!
RAP can be a new and highly efficient way of working, which would ultimately compress and automate lots of tedious, yet important, processes in our daily jobs.
The R software is continuously evolving and the community is expanding every day. The relevant skills are transferable to any analyst role and I am confident that RAP will open up new opportunities across the GSS.
Like any new technology, there may be a few teething issues at the start of implementation. However, in the words of a fellow DfE analyst, “Every problem has a solution!” I am optimistic that these issues will be overcome.
No pain, no gain!
If you are interested in a secondment with the GSS Good Practice Team, please send an email to firstname.lastname@example.org.