What is data linking? Why do you need to link?
Data linking means bringing together more than one data collection in a secure environment, for example linking information about a person’s health to the same person’s educational record. Doing this can significantly improve research, because it produces a richer set of information that can lead to a deeper understanding of society.
How do you link administrative data?
A range of techniques can be applied to linking data, and part of the Network’s remit was to explore and improve techniques for linking data accurately, safely and lawfully. Importantly, the security principles of the process were the same across the Network.
The data custodian split its data into two parts:
- The information that can directly identify individuals, such as names, full addresses and dates of birth (identifier data)
- The rest of the information (attribute data), which is of interest to the researcher (for example, information on tax paid, benefits received or educational qualifications achieved) and contains no direct identifiers.
Both parts were given a reference number.
A trusted third party matched the information using the unique reference numbers and the identifying information. The identifying information was then destroyed, leaving only the matched unique reference numbers, known as a linkage key. The Administrative Data Research Centre then linked the de-identified attribute data using the linkage key.
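The separation of roles described above can be illustrated with a small Python sketch. It is only an illustration under several assumptions that are not from the Network's own documentation: two custodians (one health, one education), made-up records, randomly generated reference numbers, and exact matching on name and date of birth. Real linkage often has to cope with misspellings and missing values, so actual matching methods may differ. All function and field names here are hypothetical.

```python
import uuid

ID_FIELDS = ("name", "date_of_birth")  # assumed direct identifiers


def split_records(records, id_fields=ID_FIELDS):
    """Custodian step: split each record into identifier data and
    attribute data, giving both parts the same reference number."""
    identifiers, attributes = [], []
    for record in records:
        ref = uuid.uuid4().hex  # random, so it reveals nothing about the person
        identifiers.append({"ref": ref, **{k: record[k] for k in id_fields}})
        attributes.append(
            {"ref": ref, **{k: v for k, v in record.items() if k not in id_fields}}
        )
    return identifiers, attributes


def build_linkage_key(ids_a, ids_b, id_fields=ID_FIELDS):
    """Trusted third party step: match on identifying information and
    return only the paired reference numbers (the linkage key)."""
    index = {tuple(r[k] for k in id_fields): r["ref"] for r in ids_a}
    key = []
    for r in ids_b:
        ref_a = index.get(tuple(r[k] for k in id_fields))
        if ref_a is not None:
            key.append((ref_a, r["ref"]))
    return key


def link_attributes(attrs_a, attrs_b, linkage_key):
    """Research centre step: join de-identified attribute data
    using the linkage key alone."""
    by_ref_a = {r["ref"]: r for r in attrs_a}
    by_ref_b = {r["ref"]: r for r in attrs_b}
    linked = []
    for ref_a, ref_b in linkage_key:
        row = {k: v for k, v in by_ref_a[ref_a].items() if k != "ref"}
        row.update({k: v for k, v in by_ref_b[ref_b].items() if k != "ref"})
        linked.append(row)
    return linked


# Made-up example records, for illustration only.
health = [
    {"name": "A. Example", "date_of_birth": "1980-01-01", "hospital_visits": 3},
    {"name": "B. Sample", "date_of_birth": "1975-06-15", "hospital_visits": 1},
]
education = [
    {"name": "A. Example", "date_of_birth": "1980-01-01", "highest_qualification": "degree"},
    {"name": "C. Other", "date_of_birth": "1990-03-20", "highest_qualification": "diploma"},
]

health_ids, health_attrs = split_records(health)
edu_ids, edu_attrs = split_records(education)

linkage_key = build_linkage_key(health_ids, edu_ids)
del health_ids, edu_ids  # identifying information is destroyed after matching

linked = link_attributes(health_attrs, edu_attrs, linkage_key)
print(linked)
```

In this sketch the function that performs the final join never sees a name or date of birth: it receives only attribute data and pairs of reference numbers, which mirrors the division of responsibilities between the custodian, the trusted third party and the research centre.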
Researchers were then able to access the linked attribute data in a secure room within one of our secure environments. This data file had all the information needed for research but no identifying information.
What information will be in the linked data collection?
The information in the linked data collection is different for each project, depending on what the researcher is investigating. Researchers work with ‘de-identified’ administrative data, which means that all directly identifying information is removed. Researchers need to be able to build up consistent pictures so they can identify trends and patterns in the general population. This in turn shapes the strategies and policies that promote social well-being.