Skip to content

raw.tpp schema🔗

Available on backends: TPP

This schema defines the data (both primary care and externally linked) available in the OpenSAFELY-TPP backend. For more information about this backend, see "SystmOne Primary Care".

The data provided by this schema are minimally transformed. They are very close to the data provided by the underlying database tables. They are provided for data development and data curation purposes.

To use this schema in an ehrQL file:
from ehrql.tables.raw.tpp import (
    apcs_cost_historical,
    apcs_historical,
    covid_therapeutics_raw,
    isaric,
    medications,
    ons_deaths,
    wl_clockstops,
    wl_openpathways,
)

many rows per patient

apcs_cost_historical🔗

This table contains some historical APCS cost data.

It has been exposed to users for data exploration, and may be removed in future.

Columns
apcs_ident 🔗 integer
  • Never NULL
grand_total_payment_mff 🔗 float
tariff_initial_amount 🔗 float
tariff_total_payment 🔗 float
admission_date 🔗 date
discharge_date 🔗 date

many rows per patient

apcs_historical🔗

This table contains some historical APCS data.

It has been exposed to users for data exploration, and may be removed in future.

Columns
apcs_ident 🔗 integer
  • Never NULL
admission_date 🔗 date
discharge_date 🔗 date
spell_core_hrg_sus 🔗 string

many rows per patient

covid_therapeutics_raw🔗

The COVID Therapeutics dataset contains information on COVID treatments used in inpatient and outpatient settings.

Metadata

  • Data provider NHS England
  • Participation / Coverage Inpatients and outpatients treated with antivirals/nMABs for COVID-19 in England
  • Provenance Data sourced largely from BlueTeq system (forms completed by clinicians)
  • Update frequency in OpenSAFELY Approximately weekly
  • Delay between event occurring and event appearing in OpenSAFELY Approximately 2-9 days
  • Collected information Treatment start date; therapeutic intervention; COVID indication, current status, risk group, region

Overview

Antivirals and neutralising monoclonal antibodies (nMABs) for COVID-19 can be administered in inpatient setting or, for outpatients, in COVID Medicine Delivery Units (CMDUs) specifically set up for this purpose. For patients considered for these treatments, clinicians submit completed forms to NHS England. Each row represents one completed form for one course of treatment. Data received by OpenSAFELY currently covers patients who were approved for treatment. The patient may or may not have actually received the treatment or completed the course (but we assume that they usually do). They may have another form completed for another treatment, either because it was decided to give them a different treatment, or for some other reason. They may in theory also have another form completed some months later for another instance of infection.

Treatment dates may be in the past or future at the point when the form is submitted.

Note that this dataset contains some duplicate rows – some full duplicates and some partial duplicates.

More Information

Columns
current_status 🔗 string

Status of form/application.

  • Possible values: Approved, Treatment Complete, Treatment Not Started, Treatment Stopped
covid_indication 🔗 string

Treatment setting/indication.

  • Possible values: non_hospitalised, hospitalised_with, hospital_onset
diagnosis 🔗 string

Always has the value 'Covid-19'.

intervention 🔗 string

Intervention or therapeutic name. Expected to be one of:

  • Paxlovid
  • Sotrovimab
  • Molnupiravir
  • Remdesivir
  • Casirivimab and imdevimab
treatment_start_date 🔗 date

Entered by the clinician and can represent either a future planned start date or a past date at the time of form submission.

CASIM05_date_of_symptom_onset 🔗 string

Date of symptom onset (Casirivimab/imdevimab).

Not in a consistent format and so cannot be converted to a date column.

CASIM05_risk_cohort 🔗 string

High-risk group to which the patient was considered to belong. Derived from tick-boxes. Multiple groups can be selected and will be joined with the word and e.g. liver disease and rare neurological conditions.

The available groups as at the time of writing are listed below. However note that the precise wording used has changed over time and so filtering by a specific disease name may not be reliable.

  • Downs syndrome
  • HIV or AIDS
  • IMID
  • haematologic malignancy
  • Patients with a haematological diseases (sic)
  • immune deficiencies
  • liver disease
  • primary immune deficiencies
  • rare neurological conditions
  • rare neurological diseases
  • renal disease
  • sickle cell disease
  • solid cancer
  • solid organ recipients
  • stem cell transplant recipients
MOL1_onset_of_symptoms 🔗 string

Date of symptom onset (Molnupiravir).

Not in a consistent format and so cannot be converted to a date column.

MOL1_high_risk_cohort 🔗 string

High-risk group to which the patient was considered to belong. Derived from tick-boxes. Multiple groups can be selected and will be joined with the word and e.g. liver disease and rare neurological conditions.

The available groups as at the time of writing are listed below. However note that the precise wording used has changed over time and so filtering by a specific disease name may not be reliable.

  • Downs syndrome
  • HIV or AIDS
  • IMID
  • haematologic malignancy
  • Patients with a haematological diseases (sic)
  • immune deficiencies
  • liver disease
  • primary immune deficiencies
  • rare neurological conditions
  • rare neurological diseases
  • renal disease
  • sickle cell disease
  • solid cancer
  • solid organ recipients
  • stem cell transplant recipients
SOT02_onset_of_symptoms 🔗 string

Date of symptom onset (Sotrovimab).

Not in a consistent format and so cannot be converted to a date column.

SOT02_risk_cohorts 🔗 string

High-risk group to which the patient was considered to belong. Derived from tick-boxes. Multiple groups can be selected and will be joined with the word and e.g. liver disease and rare neurological conditions.

The available groups as at the time of writing are listed below. However note that the precise wording used has changed over time and so filtering by a specific disease name may not be reliable.

  • Downs syndrome
  • HIV or AIDS
  • IMID
  • haematologic malignancy
  • Patients with a haematological diseases (sic)
  • immune deficiencies
  • liver disease
  • primary immune deficiencies
  • rare neurological conditions
  • rare neurological diseases
  • renal disease
  • sickle cell disease
  • solid cancer
  • solid organ recipients
  • stem cell transplant recipients
form_name 🔗 string

Name and version of the patient registration form used to register the treatment.

received 🔗 date

Date form submitted.

count 🔗 integer

Number of forms.

age_at_received_date 🔗 integer

Can occasionally be zero, presumably indicating an unknown or missing value as minimum eligibility age is 12.

region 🔗 string

NHS England region in which the CMDU submitting the form is located.

load_date 🔗 date

Date on which the current dataset was imported.

many rows per patient

isaric🔗

ISARIC is a dataset of COVID-19-related hospital admissions, with coverage across the majority of hospitals across the UK, including much richer clinical information than collected in national Hospital Episode Statistics datasets.

The data in this table covers a subset of the ISARIC data columns available in TPP, sourced from the ISARIC COVID-19 Clinical Database.

All columns included have deliberately been taken as strings while in a preliminary phase.

Descriptions taken from CCP_REDCap_ISARIC_data_dictionary_codebook.pdf which also has information on the data expected for each column.

Warning

ISARIC data can only be used in collaboration with ISARIC researchers who must be involved in working on the study and writing it up.

Refer to the OpenSAFELY database build report to see when this data was last updated.

Columns
age 🔗 string

Age

age_factor 🔗 string

TODO

calc_age 🔗 string

Calculated age (comparing date of birth with date of enrolment). May be inaccurate if a date of February 29 is used.

sex 🔗 string

Sex at birth.

ethnic___1 🔗 string

Ethnic group: Arab.

ethnic___2 🔗 string

Ethnic group: Black.

ethnic___3 🔗 string

Ethnic group: East Asian.

ethnic___4 🔗 string

Ethnic group: South Asian.

ethnic___5 🔗 string

Ethnic group: West Asian.

ethnic___6 🔗 string

Ethnic group: Latin American.

ethnic___7 🔗 string

Ethnic group: White.

ethnic___8 🔗 string

Ethnic group: Aboriginal/First Nations.

ethnic___9 🔗 string

Ethnic group: Other.

ethnic___10 🔗 string

Ethnic group: N/A.

covid19_vaccine 🔗 string

Has the patient received a Covid-19 vaccine (open label licenced product)?

covid19_vaccined 🔗 date

Date first vaccine given (Covid-19) if known.

covid19_vaccine2d 🔗 date

Date second vaccine given (Covid-19) if known.

covid19_vaccined_nk 🔗 string

First vaccine given (Covid-19) but date not known.

corona_ieorres 🔗 string

Suspected or proven infection with pathogen of public health interest.

coriona_ieorres2 🔗 string

Proven or high likelihood of infection with pathogen of public health interest.

coriona_ieorres3 🔗 string

Proven infection with pathogen of public health interest.

inflammatory_mss 🔗 string

Adult or child who meets case definition for inflammatory multi-system syndrome (MIS-C/MIS-A).

cestdat 🔗 date

Onset date of first/earliest symptom.

chrincard 🔗 string

Chronic cardiac disease, including congenital heart disease (not hypertension).

  • Possible values: YES, NO, Unknown
hypertension_mhyn 🔗 string

Hypertension (physician diagnosed).

  • Possible values: YES, NO, Unknown
chronicpul_mhyn 🔗 string

Chronic pulmonary disease (not asthma).

  • Possible values: YES, NO, Unknown
asthma_mhyn 🔗 string

Asthma (physician diagnosed).

  • Possible values: YES, NO, Unknown
renal_mhyn 🔗 string

Chronic kidney disease.

  • Possible values: YES, NO, Unknown
mildliver 🔗 string

Mild liver disease.

  • Possible values: YES, NO, Unknown
modliv 🔗 string

Moderate or severe liver disease

  • Possible values: YES, NO, Unknown
chronicneu_mhyn 🔗 string

Chronic neurological disorder.

  • Possible values: YES, NO, Unknown
malignantneo_mhyn 🔗 string

Malignant neoplasm.

  • Possible values: YES, NO, Unknown
chronichaemo_mhyn 🔗 string

Chronic haematologic disease.

  • Possible values: YES, NO, Unknown
aidshiv_mhyn 🔗 string

AIDS/HIV.

  • Possible values: YES, NO, Unknown
obesity_mhyn 🔗 string

Obesity (as defined by clinical staff).

  • Possible values: YES, NO, Unknown
diabetes_type_mhyn 🔗 string

Diabetes and type.

  • Possible values: NO, 1, 2, N/K
diabetescom_mhyn 🔗 string

Diabetes with complications.

  • Possible values: YES, NO, Unknown
diabetes_mhyn 🔗 string

Diabetes without complications.

  • Possible values: YES, NO, Unknown
rheumatologic_mhyn 🔗 string

Rheumatologic disorder.

  • Possible values: YES, NO, Unknown
dementia_mhyn 🔗 string

Dementia.

  • Possible values: YES, NO, Unknown
malnutrition_mhyn 🔗 string

Malnutrition.

  • Possible values: YES, NO, Unknown
smoking_mhyn 🔗 string

Smoking.

  • Possible values: Yes, Never Smoked, Former Smoker, N/K
hostdat 🔗 date

Admission date at this facility.

hooccur 🔗 string

Transfer from other facility?

hostdat_transfer 🔗 date

Admission date at previous facility.

hostdat_transfernk 🔗 string

Admission date at previous facility not known.

readm_cov19 🔗 string

Is the patient being readmitted with Covid-19?

dsstdat 🔗 date

Date of enrolment.

dsstdtc 🔗 date

Outcome date.

many rows per patient

medications🔗

This table is an extension of the tpp.medications table.

It contains additional fields whose contents are not yet well understood, with the aim of facilitating exploratory analysis for data development and data curation purposes.

Columns
date 🔗 date
dmd_code 🔗 dm+d code
consultation_id 🔗 integer

ID of the consultation associated with this event

medication_status 🔗 integer

Medication status. The values might map to the descriptions below from the data dictionary. Note that this still needs to be confirmed.

  • 0 - Normal
  • 4 - Historical
  • 5 - Blue script
  • 6 - Private
  • 7 - Not in possession
  • 8 - Repeat dispensed
  • 9 - In possession
  • 10 - Dental
  • 11 - Hospital
  • 12 - Problem substance
  • 13 - From patient group direction
  • 14 - To take out
  • 15 - On admission
  • 16 - Regular medication
  • 17 - As required medication
  • 18 - Variable dose medication
  • 19 - Rate-controlled single regular
  • 20 - Only once
  • 21 - Outpatient
  • 22 - Rate-controlled multiple regular
  • 23 - Rate-controlled multiple only once
  • 24 - Rate-controlled single only once
  • 25 - Placeholder
  • 26 - Unconfirmed
  • 27 - Infusion
  • 28 - Reducing dose blue script

  • Always >= 0 and <= 28

many rows per patient

ons_deaths🔗

Registered deaths

Date and cause of death based on information recorded when deaths are certified and registered in England and Wales from February 2019 onwards. The data provider is the Office for National Statistics (ONS). This table is updated approximately weekly in OpenSAFELY.

This table includes the underlying cause of death, place of death, and up to 15 medical conditions mentioned on the death certificate. These codes (cause_of_death_01 to cause_of_death_15) are not ordered meaningfully.

More information about this table can be found in following documents provided by the ONS:

In the associated database table ONS_Deaths, a small number of patients have multiple registered deaths. This table contains all registered deaths. The ehrql.tables.ons_deaths table contains the earliest registered death.

Tip

To return one row per patient from ehrql.tables.raw.tpp.ons_deaths, for example the latest registered death, you can use:

ons_deaths.sort_by(ons_deaths.date).last_for_patient()
Columns
date 🔗 date

Patient's date of death.

place 🔗 string

Patient's place of death.

  • Possible values: Care Home, Elsewhere, Home, Hospice, Hospital, Other communal establishment
underlying_cause_of_death 🔗 ICD-10 code
cause_of_death_01 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_02 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_03 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_04 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_05 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_06 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_07 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_08 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_09 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_10 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_11 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_12 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_13 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_14 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

cause_of_death_15 🔗 ICD-10 code

Medical condition mentioned on the death certificate.

many rows per patient

wl_clockstops🔗

National Waiting List Clock Stops

The columns in this table have the same data types as the columns in the associated database table. The three "pseudo" columns are small exceptions, as they are converted from binary columns to string columns.

Columns
activity_treatment_function_code 🔗 string
priority_type_code 🔗 string
pseudo_organisation_code_patient_pathway_identifier_issuer 🔗 string
pseudo_patient_pathway_identifier 🔗 string
pseudo_referral_identifier 🔗 string
referral_request_received_date 🔗 string
referral_to_treatment_period_end_date 🔗 string
referral_to_treatment_period_start_date 🔗 string
source_of_referral_for_outpatients 🔗 string
waiting_list_type 🔗 string
week_ending_date 🔗 string

many rows per patient

wl_openpathways🔗

National Waiting List Open Pathways

The columns in this table have the same data types as the columns in the associated database table. The three "pseudo" columns are small exceptions, as they are converted from binary columns to string columns.

Columns
activity_treatment_function_code 🔗 string
current_pathway_period_start_date 🔗 string
priority_type_code 🔗 string
pseudo_organisation_code_patient_pathway_identifier_issuer 🔗 string
pseudo_patient_pathway_identifier 🔗 string
pseudo_referral_identifier 🔗 string
referral_request_received_date 🔗 string
referral_to_treatment_period_end_date 🔗 string
referral_to_treatment_period_start_date 🔗 string
source_of_referral 🔗 string
waiting_list_type 🔗 string
week_ending_date 🔗 string