Data linkage in Python
- Open to: Government analysts
- Training category: Analytical
- Type of training: Online
- Length: 6 hours
- Organiser: Analysis Function Capability Team
- Provider: Analysis Function Capability Team
- Location: Online
Description
Data linkage is the process of joining multiple datasets and linking the records in them that refer to the same entity. It makes better use of the resources spent collecting data by increasing the ways each dataset can be used for research. This course covers the practical application of linking data in Python. A similar course will be available for those who prefer working in R.
Learning outcomes
After taking this course you should be able to conduct:
- pre-linkage (preparing data)
- exact matching
- rule-based matching
- score-based matching
- Fellegi-Sunter probabilistic matching
- post-linkage and quality evaluation
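As a rough illustration of the first three outcomes above, the sketch below cleans two small tables, links them by exact matching, and then applies one looser rule to the records left over. It is not taken from the course materials: it assumes pandas, and the toy data, the column names and the `prepare` helper are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
left = pd.DataFrame({
    "id_l": [1, 2, 3],
    "forename": ["Ann ", "BOB", "Cara"],
    "surname": ["Lee", "Ray", "Diaz"],
    "dob": ["1990-01-01", "1985-05-05", "1970-07-07"],
})
right = pd.DataFrame({
    "id_r": [10, 20, 30],
    "forename": ["ann", "Rob", "Kara"],
    "surname": ["lee", "Ray", "Dias"],
    "dob": ["1990-01-01", "1985-05-05", "1971-07-07"],
})


def prepare(df):
    """Pre-linkage: trim whitespace and lower-case the name columns."""
    out = df.copy()
    for col in ["forename", "surname"]:
        out[col] = out[col].str.strip().str.lower()
    return out


left_clean, right_clean = prepare(left), prepare(right)

# Pass 1, exact matching: all three variables must agree.
exact = left_clean.merge(
    right_clean, on=["forename", "surname", "dob"], how="inner"
)

# Pass 2, rule-based matching on the residuals only:
# a looser rule that requires agreement on surname and date of birth.
left_rest = left_clean[~left_clean["id_l"].isin(exact["id_l"])]
right_rest = right_clean[~right_clean["id_r"].isin(exact["id_r"])]
rule = left_rest.merge(
    right_rest, on=["surname", "dob"], how="inner", suffixes=("_l", "_r")
)

# Combine the links and record which pass produced each one.
links = pd.concat(
    [
        exact[["id_l", "id_r"]].assign(method="exact"),
        rule[["id_l", "id_r"]].assign(method="rule"),
    ],
    ignore_index=True,
)
print(links)
```

Running the passes from strictest to loosest, and removing linked records before each later pass, is one common way to keep looser rules from overriding exact matches. Real linkage work would also need to handle duplicate candidates and to evaluate link quality.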
How to book
Please use your Learning Hub account to access the course online. Alternatively, please email GSS.Capability@statistics.gov.uk.