OpenSAFELY-TPP data is broadly representative of the English population

In a recent study posted to the medRxiv* preprint server, researchers evaluated how well OpenSAFELY-TPP data represent the general English population.

Study: OpenSAFELY: Representativeness of Electronic Health Record platform OpenSAFELY-TPP data compared to the population of England. Image Credit: ivector/Shutterstock


The present study's authors developed OpenSAFELY, a health analytics platform that analyzes primary care patient records. This software is deployed within the data centers of EMIS and TPP, the two largest providers of electronic health records (EHRs) in the National Health Service (NHS). So far, over 20 publications have reported using OpenSAFELY with a focus on delivering critical findings related to the coronavirus disease 2019 (COVID-19) pandemic.

EHR data are valuable for health research, although they are collected principally for clinical use rather than for research. Because these data are not random population samples, it is crucial to study and understand how representative they are.

About the study

In the present study, researchers compared the demographic characteristics of TPP-registered patients to the Office for National Statistics (ONS) estimates. Primary care records maintained by TPP were linked to data on deaths from ONS through the OpenSAFELY-TPP platform. This dataset was based on 24 million people and included pseudonymized data like diagnoses, physiologic parameters, and medications.

Ethnicity data were derived from the 2011 United Kingdom (UK) Census. Further, data on age, sex, and index of multiple deprivation (IMD) were collected from the 2020 mid-year ONS estimates. Additionally, estimates of the five most common causes of mortality in 2020 were retrieved from ONS mortality statistics. Ethnicity data were stratified into five high-level classes: White, Black, South Asian, Mixed, and Others.

Deaths were classified into five groups based on the most common causes: 1) COVID-19, 2) ischemic heart diseases, 3) dementia and Alzheimer’s disease, 4) malignant neoplasm of trachea, bronchus, and lung, and 5) cerebrovascular diseases. The representativeness of TPP data was determined by comparing OpenSAFELY-TPP-derived estimates for 2020 with the ONS estimates of IMD, age, ethnicity, sex, and causes of death.


As of June 30, 2020, more than 24 million people were actively registered with TPP, representing 42.6% of England’s population. As a proportion of the ONS population, TPP coverage was highest in the East of England (90.5%) and lowest in the West Midlands (16.8%). There were minor differences in IMD between the ONS and TPP populations.

Where recorded, a marginally higher proportion of TPP patients were observed in the most deprived IMD groups 1 and 3 relative to ONS estimates. IMD was unavailable in 2.3% of TPP records. The proportion of people by sex was similar in TPP and ONS populations, with 50% and 50.1% females, respectively.
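The comparison described here comes down to subtracting the ONS percentage from the TPP percentage for each category. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `percentage_point_diff` is a hypothetical helper, and the only figures used are the female shares reported above (50% for TPP and 50.1% for ONS).

```python
def percentage_point_diff(tpp_pct, ons_pct):
    """Return TPP minus ONS, in percentage points, for each shared category."""
    return {
        group: round(tpp_pct[group] - ons_pct[group], 1)
        for group in tpp_pct
        if group in ons_pct
    }


# Female share as reported in the article: 50% (TPP) and 50.1% (ONS).
tpp = {"female": 50.0}
ons = {"female": 50.1}

diffs = percentage_point_diff(tpp, ons)
print(diffs)  # {'female': -0.1}
```

A negative value means the group makes up a smaller share of the TPP population than of the ONS estimate.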

When stratified by age, the proportion of people aged 25 to 59 was substantially higher in the TPP population than in the ONS estimates. Higher proportions of males aged 35 – 59 and females aged 20 – 29 were evident in the TPP population relative to ONS estimates. The proportion of deaths from each of the five most common causes was lower in the TPP population than in ONS data, with COVID-19 showing the largest difference and malignant neoplasm of trachea, bronchus, and lung showing only a minor difference.

For ethnicity, the differences between the two populations were within one percentage point. Asian people were over-represented in the TPP population in all regions except the North West and South East. White people were under-represented in all English regions except the North West. More than 9% of the TPP population had no recorded ethnicity.

When the high-level ethnic groups were further categorized, the TPP population had a lower percentage of British White people but a higher proportion of Other White patients than the ONS population. African and Caribbean Black patients were under-represented in the TPP population.


The study noted that the TPP population was largely representative of the English population. Although the authors found high regional variability in the coverage of OpenSAFELY-TPP data among English GPs, broad within-region similarities were also observed. Notably, people aged 25 – 50 years in London were over-represented in TPP data. One limitation is that the ethnicity comparison relied on the 2011 UK Census, and the ethnic composition might have changed over the last decade.

In summary, the present analysis revealed that TPP patients were broadly representative of the general English population in terms of sex, age, ethnicity, and IMD. These findings inform the interpretation of the numerous completed and published analyses that used OpenSAFELY-TPP, and offer reassurance that such investigations are not marred by problems of generalizability or interpretation.

*Important notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

Journal reference:
  • Andrews, C. et al. (2022) "OpenSAFELY: Representativeness of Electronic Health Record platform OpenSAFELY-TPP data compared to the population of England". medRxiv. doi: 10.1101/2022.06.23.22276802.

Written by

Tarun Sai Lomte

Tarun is a writer based in Hyderabad, India. He has a Master’s degree in Biotechnology from the University of Hyderabad and is enthusiastic about scientific research. He enjoys reading research papers and literature reviews and is passionate about writing.
