# Language reference

## Framesπ

Frames are the starting point for building any query in ehrQL. You can think of a Frame as being like a table in a database, in that it contains multiple rows and multiple columns. But a Frame can have operations applied to it like filtering or sorting to produce a new Frame.

You don't need to define any Frames yourself. Instead you import them
from the various schemas available in `ehrql.tables`

e.g.

```
from ehrql.tables.beta.core import patients
```

Frames have columns which you can access as attributes on the Frame e.g.

```
dob = patients.date_of_birth
```

The schema documentation contains the full list of
available columns for each Frame. For example, see
`ehrql.tables.beta.core.patients`

.

Accessing a column attribute on a Frame produces a Series, which are documented elsewhere below.

Some Frames contain at most one row per patient, we call these PatientFrames; others can contain multiple rows per patient, we call these EventFrames.

####
*class* **PatientFrame**()
π

*class*

**PatientFrame**()

Frame containing at most one row per patient.

`π`

**exists_for_patient**()Return a boolean patient series which is True for each patient that has a row in this frame and False otherwise.

`π`

**count_for_patient**()Return an integer patient series giving the number of rows each patient has in this frame.

Note this will be 0 rather than NULL if the patient has no rows at all in the frame.

####
*class* **EventFrame**()
π

*class*

**EventFrame**()

Frame which may contain multiple rows per patient.

`π`

**where**(*condition*)Return a new frame containing only the rows in this frame for which `condition`

evaluates True.

Note that this excludes any rows for which `condition`

is NULL.

`π`

**except_where**(*condition*)Return a new frame containing only the rows in this frame for which `condition`

evaluates False or NULL i.e. the exact inverse of the rows included by
`where()`

.

`π`

**sort_by**(**sort_values*)Sort the rows for each patient by each of the supplied `sort_values`

.

Where more than one sort value is supplied then the first (i.e. left-most) value has highest priority and each subsequent sort value will only be used as a tie-breaker in case of an exact match among previous values.

Note that NULL is considered smaller than any other value, so you may wish to filter out NULL values before sorting.

`π`

**exists_for_patient**()Return a boolean patient series which is True for each patient that has a row in this frame and False otherwise.

`π`

**count_for_patient**()Return an integer patient series giving the number of rows each patient has in this frame.

Note this will be 0 rather than NULL if the patient has no rows at all in the frame.

####
*class* **SortedEventFrame**()
π

*class*

**SortedEventFrame**()

Frame which contains multiple rows per patient and has had one or more sort operations applied.

`π`

**where**(*condition*)Return a new frame containing only the rows in this frame for which `condition`

evaluates True.

Note that this excludes any rows for which `condition`

is NULL.

`π`

**except_where**(*condition*)Return a new frame containing only the rows in this frame for which `condition`

evaluates False or NULL i.e. the exact inverse of the rows included by
`where()`

.

`π`

**sort_by**(**sort_values*)Sort the rows for each patient by each of the supplied `sort_values`

.

Where more than one sort value is supplied then the first (i.e. left-most) value has highest priority and each subsequent sort value will only be used as a tie-breaker in case of an exact match among previous values.

Note that NULL is considered smaller than any other value, so you may wish to filter out NULL values before sorting.

`π`

**exists_for_patient**()Return a boolean patient series which is True for each patient that has a row in this frame and False otherwise.

`π`

**count_for_patient**()Return an integer patient series giving the number of rows each patient has in this frame.

Note this will be 0 rather than NULL if the patient has no rows at all in the frame.

`π`

**first_for_patient**()Return a PatientFrame containing, for each patient, the first matching row according to whatever sort order has been applied.

Note that where there are multiple rows tied for first place then the specific row returned is picked arbitrarily but consistently i.e. you shouldn't depend on getting any particular result, but the result you do get shouldn't change unless the data changes.

`π`

**last_for_patient**()Return a PatientFrame containing, for each patient, the last matching row according to whatever sort order has been applied.

Note that where there are multiple rows tied for last place then the specific row returned is picked arbitrarily but consistently i.e. you shouldn't depend on getting any particular result, but the result you do get shouldn't change unless the data changes.

## Seriesπ

A Series represents a column of values of a certain type. Some Series contain at most one value per patient, we call these PatientSeries; others can contain multiple values per patient, we call these EventSeries. Values can be NULL (i.e. missing) but a Series can never mix values of different types.

####
*class* **BoolPatientSeries**()
π

*class*

**BoolPatientSeries**()

One row per patient series of type `boolean`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***&***other*Logical AND

Return a boolean series which is True where both this series and `other`

are
True, False where either are False, and NULL otherwise.

`π`

*self***|***other*Logical OR

Return a boolean series which is True where either this series or `other`

is
True, False where both are False, and NULL otherwise.

`π`

**~***self*Logical NOT

Return a boolean series which is the inverse of this series i.e. where True becomes False, False becomes True, and NULL stays as NULL.

`π`

**is_null**()Return a boolean series which is True for each value in this series which is NULL, and False otherwise.

`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)Return a boolean series which is True for each value in this series which is
contained in `other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)Accepts a dictionary mapping one set of values to another and applies that mapping to the series e.g.

```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

####
*class* **BoolEventSeries**()
π

*class*

**BoolEventSeries**()

Multiple rows per patient series of type `boolean`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***&***other*Logical AND

Return a boolean series which is True where both this series and `other`

are
True, False where either are False, and NULL otherwise.

`π`

*self***|***other*Logical OR

Return a boolean series which is True where either this series or `other`

is
True, False where both are False, and NULL otherwise.

`π`

**~***self*Logical NOT

Return a boolean series which is the inverse of this series i.e. where True becomes False, False becomes True, and NULL stays as NULL.

`π`

**is_null**()Return a boolean series which is True for each value in this series which is NULL, and False otherwise.

`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)Return a boolean series which is True for each value in this series which is
contained in `other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)Accepts a dictionary mapping one set of values to another and applies that mapping to the series e.g.

```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**count_distinct_for_patient**()Return a integer patient series counting the number of distinct values for each patient in the series (ignoring any NULL values). Not that if a patient has no values at all in the series the result will be zero rather than NULL.

####
*class* **StrPatientSeries**()
π

*class*

**StrPatientSeries**()

One row per patient series of type `string`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*Return a boolean series which is True for each value in this series that is
strictly less than its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*Return a boolean series which is True for each value in this series that is less
than or equal to its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*Return a boolean series which is True for each value in this series that is
greater than or equal to its corresponding value in `other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*Return a boolean series which is True for each value in this series that is
strictly greater than its corresponding value in `other`

and False otherwise (or
NULL if either value is NULL).

`π`

**is_null**()Return a boolean series which is True for each value in this series which is NULL, and False otherwise.

`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)Return a boolean series which is True for each value in this series which is
contained in `other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)Accepts a dictionary mapping one set of values to another and applies that mapping to the series e.g.

```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**contains**(*other*)Return a boolean series which is True for each string in this series which
contains the corresponding value in `other`

as a sub-string and False otherwise (or
NULL if either value is NULL).

####
*class* **StrEventSeries**()
π

*class*

**StrEventSeries**()

Multiple rows per patient series of type `string`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*Return a boolean series which is True for each value in this series that is
strictly less than its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*Return a boolean series which is True for each value in this series that is less
than or equal to its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*Return a boolean series which is True for each value in this series that is
greater than or equal to its corresponding value in `other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*Return a boolean series which is True for each value in this series that is
strictly greater than its corresponding value in `other`

and False otherwise (or
NULL if either value is NULL).

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**contains**(*other*)Return a boolean series which is True for each string in this series which
contains the corresponding value in `other`

as a sub-string and False otherwise (or
NULL if either value is NULL).

`π`

**count_distinct_for_patient**()Return a integer patient series counting the number of distinct values for each patient in the series (ignoring any NULL values). Not that if a patient has no values at all in the series the result will be zero rather than NULL.

`π`

**minimum_for_patient**()Return the minimum value in the series for each patient (or NULL if the patient has no values).

`π`

**maximum_for_patient**()Return the maximum value in the series for each patient (or NULL if the patient has no values).

####
*class* **IntPatientSeries**()
π

*class*

**IntPatientSeries**()

One row per patient series of type `integer`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*Return a boolean series which is True for each value in this series that is
strictly less than its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*Return a boolean series which is True for each value in this series that is less
than or equal to its corresponding value in `other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*Return a boolean series which is True for each value in this series that is
greater than or equal to its corresponding value in `other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*Return a boolean series which is True for each value in this series that is
strictly greater than its corresponding value in `other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***+***other*Return the sum of each corresponding value in this series and `other`

(or NULL
if either is NULL).

`π`

*self***-***other*Return each value in this series with its corresponding value in `other`

subtracted (or NULL if either is NULL).

`π`

*self*******other*Return the product of each corresponding value in this series and `other`

(or
NULL if either is NULL).

`π`

*self***/***other*Return a series with each value in this series divided by its correponding value
in `other`

(or NULL if either is NULL).

Note that the result is always if a float even if the inputs are integers.

`π`

*self***//***other*Return a series with each value in this series divided by its correponding value
in `other`

and then rounded **down** to the nearest integer value (or NULL if either
is NULL).

Note that the result is always if an integer even if the inputs are floats.

`π`

**-***self*Return the negation of each value in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**as_int**()Return each value in this series rounded down to the nearest integer.

`π`

**as_float**()Return each value in this series as a float e.g 10 becomes 10.0

####
*class* **IntEventSeries**()
π

*class*

**IntEventSeries**()

Multiple rows per patient series of type `integer`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*`other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*`other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***+***other*Return the sum of each corresponding value in this series and `other`

(or NULL
if either is NULL).

`π`

*self***-***other*Return each value in this series with its corresponding value in `other`

subtracted (or NULL if either is NULL).

`π`

*self*******other*Return the product of each corresponding value in this series and `other`

(or
NULL if either is NULL).

`π`

*self***/***other*Return a series with each value in this series divided by its correponding value
in `other`

(or NULL if either is NULL).

Note that the result is always if a float even if the inputs are integers.

`π`

*self***//***other*Return a series with each value in this series divided by its correponding value
in `other`

and then rounded **down** to the nearest integer value (or NULL if either
is NULL).

Note that the result is always if an integer even if the inputs are floats.

`π`

**-***self*Return the negation of each value in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**as_int**()Return each value in this series rounded down to the nearest integer.

`π`

**as_float**()Return each value in this series as a float e.g 10 becomes 10.0

`π`

**count_distinct_for_patient**()Return a integer patient series counting the number of distinct values for each patient in the series (ignoring any NULL values). Not that if a patient has no values at all in the series the result will be zero rather than NULL.

`π`

**minimum_for_patient**()Return the minimum value in the series for each patient (or NULL if the patient has no values).

`π`

**maximum_for_patient**()Return the maximum value in the series for each patient (or NULL if the patient has no values).

`π`

**sum_for_patient**()Return the sum of all values in the series for each patient.

`π`

**mean_for_patient**()Return the arithmetic mean of any non-NULL values in the series for each patient.

####
*class* **FloatPatientSeries**()
π

*class*

**FloatPatientSeries**()

One row per patient series of type `float`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*`other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*`other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***+***other*Return the sum of each corresponding value in this series and `other`

(or NULL
if either is NULL).

`π`

*self***-***other*Return each value in this series with its corresponding value in `other`

subtracted (or NULL if either is NULL).

`π`

*self*******other*Return the product of each corresponding value in this series and `other`

(or
NULL if either is NULL).

`π`

*self***/***other*Return a series with each value in this series divided by its correponding value
in `other`

(or NULL if either is NULL).

Note that the result is always if a float even if the inputs are integers.

`π`

*self***//***other*Return a series with each value in this series divided by its correponding value
in `other`

and then rounded **down** to the nearest integer value (or NULL if either
is NULL).

Note that the result is always if an integer even if the inputs are floats.

`π`

**-***self*Return the negation of each value in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**as_int**()Return each value in this series rounded down to the nearest integer.

`π`

**as_float**()Return each value in this series as a float e.g 10 becomes 10.0

####
*class* **FloatEventSeries**()
π

*class*

**FloatEventSeries**()

Multiple rows per patient series of type `float`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*`other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*`other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***+***other*Return the sum of each corresponding value in this series and `other`

(or NULL
if either is NULL).

`π`

*self***-***other*`other`

subtracted (or NULL if either is NULL).

`π`

*self*******other*Return the product of each corresponding value in this series and `other`

(or
NULL if either is NULL).

`π`

*self***/***other*`other`

(or NULL if either is NULL).

Note that the result is always if a float even if the inputs are integers.

`π`

*self***//***other*`other`

and then rounded **down** to the nearest integer value (or NULL if either
is NULL).

Note that the result is always if an integer even if the inputs are floats.

`π`

**-***self*Return the negation of each value in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**as_int**()Return each value in this series rounded down to the nearest integer.

`π`

**as_float**()Return each value in this series as a float e.g 10 becomes 10.0

`π`

**count_distinct_for_patient**()`π`

**minimum_for_patient**()Return the minimum value in the series for each patient (or NULL if the patient has no values).

`π`

**maximum_for_patient**()Return the maximum value in the series for each patient (or NULL if the patient has no values).

`π`

**sum_for_patient**()Return the sum of all values in the series for each patient.

`π`

**mean_for_patient**()Return the arithmetic mean of any non-NULL values in the series for each patient.

####
*class* **DatePatientSeries**()
π

*class*

**DatePatientSeries**()

One row per patient series of type `date`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*`other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*`other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***-***other*Return a series giving the difference between each date in this series and
`other`

(see `DateDifference`

).

`π`

**year**Return an integer series giving the year of each date in this series.

`π`

**month**Return an integer series giving the month (1-12) of each date in this series.

`π`

**day**Return an integer series giving the day of the month (1-31) of each date in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**to_first_of_year**()Return a date series with each date in this series replaced by the date of the first day in its corresponding calendar year.

`π`

**to_first_of_month**()Return a date series with each date in this series replaced by the date of the first day in its corresponding calendar month.

`π`

**is_before**(*other*)Return a boolean series which is True for each date in this series that is
earlier than its corresponding date in `other`

and False otherwise (or NULL if
either value is NULL).

`π`

**is_on_or_before**(*other*)Return a boolean series which is True for each date in this series that is
earlier than or the same as its corresponding value in `other`

and False
otherwise (or NULL if either value is NULL).

`π`

**is_after**(*other*)Return a boolean series which is True for each date in this series that is later
than its corresponding date in `other`

and False otherwise (or NULL if either value
is NULL).

`π`

**is_on_or_after**(*other*)Return a boolean series which is True for each date in this series that is later
than or the same as its corresponding value in `other`

and False otherwise (or
NULL if either value is NULL).

`π`

**is_between_but_not_on**(*start*,*end*)Return a boolean series which is True for each date in this series which is
strictly between (i.e. not equal to) the corresponding dates in `start`

and `end`

.

`π`

**is_on_or_between**(*start*,*end*)Return a boolean series which is True for each date in this series which is
between or the same as the corresponding dates in `start`

and `end`

.

`π`

**is_during**(*interval*)The same as `is_on_or_between()`

above, but allows supplying a start/end date
pair as single argument.

####
*class* **DateEventSeries**()
π

*class*

**DateEventSeries**()

Multiple rows per patient series of type `date`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

*self***>***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***>=***other*`other`

and False otherwise (or NULL
if either value is NULL).

`π`

*self***<=***other*`other`

and False otherwise
(or NULL if either value is NULL).

`π`

*self***<***other*`other`

and False otherwise (or
NULL if either value is NULL).

`π`

*self***-***other*Return a series giving the difference between each date in this series and
`other`

(see `DateDifference`

).

`π`

**year**Return an integer series giving the year of each date in this series.

`π`

**month**Return an integer series giving the month (1-12) of each date in this series.

`π`

**day**Return an integer series giving the day of the month (1-31) of each date in this series.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**to_first_of_year**()Return a date series with each date in this series replaced by the date of the first day in its corresponding calendar year.

`π`

**to_first_of_month**()Return a date series with each date in this series replaced by the date of the first day in its corresponding calendar month.

`π`

**is_before**(*other*)Return a boolean series which is True for each date in this series that is
earlier than its corresponding date in `other`

and False otherwise (or NULL if
either value is NULL).

`π`

**is_on_or_before**(*other*)Return a boolean series which is True for each date in this series that is
earlier than or the same as its corresponding value in `other`

and False
otherwise (or NULL if either value is NULL).

`π`

**is_after**(*other*)Return a boolean series which is True for each date in this series that is later
than its corresponding date in `other`

and False otherwise (or NULL if either value
is NULL).

`π`

**is_on_or_after**(*other*)Return a boolean series which is True for each date in this series that is later
than or the same as its corresponding value in `other`

and False otherwise (or
NULL if either value is NULL).

`π`

**is_between_but_not_on**(*start*,*end*)Return a boolean series which is True for each date in this series which is
strictly between (i.e. not equal to) the corresponding dates in `start`

and `end`

.

`π`

**is_on_or_between**(*start*,*end*)Return a boolean series which is True for each date in this series which is
between or the same as the corresponding dates in `start`

and `end`

.

`π`

**is_during**(*interval*)The same as `is_on_or_between()`

above, but allows supplying a start/end date
pair as single argument.

`π`

**count_distinct_for_patient**()`π`

**minimum_for_patient**()Return the minimum value in the series for each patient (or NULL if the patient has no values).

`π`

**maximum_for_patient**()Return the maximum value in the series for each patient (or NULL if the patient has no values).

####
*class* **CodePatientSeries**()
π

*class*

**CodePatientSeries**()

One row per patient series of type `code`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**to_category**(*categorisation*,*default=None*)An alias for `map_values`

which makes the intention clearer when working with
codelists. See `codelist_from_csv()`

.

####
*class* **CodeEventSeries**()
π

*class*

**CodeEventSeries**()

Multiple rows per patient series of type `code`

`π`

*self***==***other*Return a boolean series comparing each value in this series with its
corresponding value in `other`

.

Note that the result of comparing anything with NULL (including NULL itself) is NULL.

`π`

*self***!=***other*Return the inverse of `==`

above.

Note that the same point regarding NULL applies here.

`π`

**is_null**()`π`

**is_not_null**()Return the inverse of `is_null()`

above.

`π`

**if_null_then**(*other*)Replace any NULL value in this series with the corresponding value in `other`

.

Note that `other`

must be of the same type as this series.

`π`

**is_in**(*other*)`other`

, where `other`

can be any of the standard "container"
types: tuple, list, set, frozenset, or dict.

`π`

**is_not_in**(*other*)Return the inverse of `is_in()`

above.

`π`

**map_values**(*mapping*,*default=None*)```
status = status_code.map_values(
{1: "pending", 2: "accepted", 3: "completed"},
default="unknown"
)
```

`π`

**to_category**(*categorisation*,*default=None*)An alias for `map_values`

which makes the intention clearer when working with
codelists. See `codelist_from_csv()`

.

`π`

**count_distinct_for_patient**()## Date Arithmeticπ

ehrQL supports adding and subtracting durations from dates e.g.
`date_of_admission + days(10)`

or `date_of_discharge - weeks(6)`

. It
also supports finding the difference between dates e.g.

```
days_in_hospital = (date_of_discharge - date_of_admission).days
```

When working with dates you should generally prefer using days or weeks as units, rather than months or years, unless there's a specific reason you care about the calendar. Adding or subtracting days or weeks to a date is a simple, unambiguous process; adding years or months introduces ambiguities and complexities as neither unit is a consistent length (see section below for details). So, for example, unless there is specific epidemiological significance to whether an event happened exactly three calendar months ago, it is better to say "90 days" rather than "3 months".

Additionally, some dates in the patient data (e.g. dates of birth) have been rounded to the first of the month for privacy purposes. It's pointless applying precise calendar arithmetic to such dates.

**Tip**: it can be clearer to readers to write time periods as a product
rather than just a number e.g. it's easier to see that `days(5 * 365)`

is approximately five years than it is with `days(1825)`

.

#### Ambiguous Datesπ

Adding years or months to a date sometimes has a clear answer e.g. 1 January 2001 is exactly one calendar year later than 1 January 2000, and 1 February is exactly one calendar month later. But not all cases are clear, for instance: which day is exactly one year after 29 February 2000? There is no 29 February 2001, so should it be 28 February or 1 March? And which day is exactly one calendar month after 31 August? There is no 31 September, so should it be 30 September or 1 October? Different databases give different answers here.

ehrQL takes the approach that we consistently round *up* to the next
date. That is, whenever the naΓ―ve calculation takes us to a date which
doesn't exist (like 29 February 2001 or 31 September) we always take the
next *largest* date (1 March 2001, or 1 October).

As different databases take different approaches here, ehrQL has to go to some lengths to ensure that we get the same results on every database we support. This introduces complexity into the generated SQL which may have performance costs in some circumstances. This is another reason to prefer unambiguous units of days and weeks where possible.

####
*class* **DateDifference**(*lhs*, *rhs*)
π

*class*

**DateDifference**(

*lhs*,

*rhs*)

Represents the difference between two date series (i.e. it is what you get when you subtract one date series from another)

`π`

**days**The value of the date difference in days (can be positive or negative)

`π`

**weeks**The value of the date difference in whole weeks (can be positive or negative)

`π`

**months**The value of the date difference in whole calendar months (can be positive or negative)

`π`

**years**The value of the date difference in whole calendar years (can be positive or negative)

####
*class* **days**(*value*)
π

*class*

**days**(

*value*)

Represents a duration of time specified in days

`π`

*self***==***other*Return a boolean indicating whether the two durations have the same value and units.

`π`

*self***!=***other*Return a boolean indicating whether the two durations do not have the same value and units.

`π`

*self***+***other*Add this duration to a date to produce a new date.

Alternatively two durations with the same units may be added to produce a new duration.

`π`

*self***-***other*Subtract another duration of the same units from this duration.

`π`

**-***self*Invert this duration so that rather that representing a movement, say, four weeks forwards in time it now represents a movement four weeks backwards.

`π`

**starting_on**(*date*)Return a list of time intervals covering the duration starting on the supplied date. For example:

```
weeks(3).starting_on("2000-01-01")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

`π`

**ending_on**(*date*)Return a list of time intervals covering the duration ending on the supplied date. For example:

```
weeks(3).ending_on("2000-01-21")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

####
*class* **months**(*value*)
π

*class*

**months**(

*value*)

Represents a duration of time specified in calendar months

`π`

*self***==***other*Return a boolean indicating whether the two durations have the same value and units.

`π`

*self***!=***other*Return a boolean indicating whether the two durations do not have the same value and units.

`π`

*self***+***other*Add this duration to a date to produce a new date.

Alternatively two durations with the same units may be added to produce a new duration.

`π`

*self***-***other*Subtract another duration of the same units from this duration.

`π`

**-***self*Invert this duration so that rather that representing a movement, say, four weeks forwards in time it now represents a movement four weeks backwards.

`π`

**starting_on**(*date*)Return a list of time intervals covering the duration starting on the supplied date. For example:

```
weeks(3).starting_on("2000-01-01")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

`π`

**ending_on**(*date*)Return a list of time intervals covering the duration ending on the supplied date. For example:

```
weeks(3).ending_on("2000-01-21")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

####
*class* **weeks**(*value*)
π

*class*

**weeks**(

*value*)

Represents a duration of time specified in weeks

`π`

*self***==***other*Return a boolean indicating whether the two durations have the same value and units.

`π`

*self***!=***other*Return a boolean indicating whether the two durations do not have the same value and units.

`π`

*self***+***other*Add this duration to a date to produce a new date.

Alternatively two durations with the same units may be added to produce a new duration.

`π`

*self***-***other*Subtract another duration of the same units from this duration.

`π`

**-***self*Invert this duration so that rather that representing a movement, say, four weeks forwards in time it now represents a movement four weeks backwards.

`π`

**starting_on**(*date*)Return a list of time intervals covering the duration starting on the supplied date. For example:

```
weeks(3).starting_on("2000-01-01")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

`π`

**ending_on**(*date*)Return a list of time intervals covering the duration ending on the supplied date. For example:

```
weeks(3).ending_on("2000-01-21")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

####
*class* **years**(*value*)
π

*class*

**years**(

*value*)

Represents a duration of time specified in calendar years

`π`

*self***==***other*Return a boolean indicating whether the two durations have the same value and units.

`π`

*self***!=***other*Return a boolean indicating whether the two durations do not have the same value and units.

`π`

*self***+***other*Add this duration to a date to produce a new date.

Alternatively two durations with the same units may be added to produce a new duration.

`π`

*self***-***other*Subtract another duration of the same units from this duration.

`π`

**-***self*`π`

**starting_on**(*date*)Return a list of time intervals covering the duration starting on the supplied date. For example:

```
weeks(3).starting_on("2000-01-01")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

`π`

**ending_on**(*date*)Return a list of time intervals covering the duration ending on the supplied date. For example:

```
weeks(3).ending_on("2000-01-21")
```

```
[
(date(2000, 1, 1), date(2000, 1, 7)),
(date(2000, 1, 8), date(2000, 1, 14)),
(date(2000, 1, 15), date(2000, 1, 21)),
]
```

Useful for generating the `intervals`

arguments to `Measures`

.

## Generalπ

####
**create_dataset**()
π

**create_dataset**()

A dataset defines the patients you want to include in your population and the variables you want to extract for them.

A dataset definition file must define a dataset called `dataset`

:

```
dataset = create_dataset()
```

Add variables to the dataset as attributes:

```
dataset.age = patients.age_on("2020-01-01")
```

####
*class* **Dataset**()
π

*class*

**Dataset**()

Create a dataset with `create_dataset`

.

`π`

**define_population**(*population_condition*)Define the condition that patients must meet to be included in the Dataset, in the form of a boolean patient series e.g.

```
dataset.define_population(patients.date_of_birth < "1990-01-01")
```

`π`

**configure_dummy_data**(*population_size*)Configure the dummy data to be generated.

```
dataset.configure_dummy_data(population_size=10000)
```

####
**case**(**when_thens*, *default=None*)
π

**case**(

**when_thens*,

*default=None*)

Take a sequence of condition-values of the form:

```
when(condition).then(value)
```

And evaluate them in order, returning the value of the first condition which
evaluates True. If no condition matches and a `default`

is specified then return
that, otherwise return NULL.

For example:

```
category = case(
when(size < 10).then("small"),
when(size < 20).then("medium"),
when(size >= 20).then("large"),
default="unknown",
)
```

Note that because the conditions are evaluated in order we don't need the condition
for "medium" to specify `(size >= 10) & (size < 20)`

because by the time the
condition for "medium" is being evaluated we already know the condition for "small"
is False.

A simpler form is available when there is a single condition. This example:

```
category = case(
when(size < 15).then("small"),
default="large",
)
```

can be rewritten as:

```
category = when(size < 15).then("small").otherwise("large")
```

####
**codelist_from_csv**(*filename*, *column*, *category_column=None*)
π

**codelist_from_csv**(

*filename*,

*column*,

*category_column=None*)

Read a codelist from a CSV file as either a list or a dictionary (for categorised codelists).

*filename*

Path to the file on disk, relative to the root of your repository. (Remember to use
UNIX/style/forward-slashes not Windows\style\backslashes.)

*column*

Name of the column in the CSV file which contains the codes.

*category_column*

Optional name of a column in the CSV file which contains categories to which each
code should be mapped. If this argument is passed then the resulting codelist will
be a dictionary mapping each code to its corresponding category. This can be passed
to the `to_category()`

method to map a series of
codes to a series of categories.

####
**maximum_of**(**args*)
π

**maximum_of**(

**args*)

Return the maximum value of a collection of Series or Values, disregarding NULLs

For example:

```
latest_event_date = maximum_of(event_series_1.date, event_series_2.date, "2001-01-01")
```

####
**minimum_of**(**args*)
π

**minimum_of**(

**args*)

Return the minimum value of a collection of Series or Values, disregarding NULLs

For example:

```
ealiest_event_date = minimum_of(event_series_1.date, event_series_2.date, "2001-01-01")
```

## Measuresπ

Measures are used for calculating the ratio of one quantity to another as it varies over time, and broken down by different demographic (or other) groupings.

####
**create_measures**()
π

**create_measures**()

A measure definition file must define a collection of measures called `measures`

.

```
measures = create_measures()
```

####
*class* **Measures**()
π

*class*

**Measures**()

Create a collection of measures with `create_measures`

.

`π`

**define_measure**(*name*,*numerator=None*,*denominator=None*,*group_by=None*,*intervals=None*)Add a measure to the list of measures to be generated.

*name*

The name of the measure, as a string.

*numerator*

The numerator definition, which must be a patient series but can be either
boolean or integer.

*denominator*

The denominator definition, which must be a patient series but can be either
boolean or integer.

*group_by*

Optional groupings to break down the results by. Must be supplied as a
dictionary of the form:

```
{
"group_name": group_definition,
...
}
```

*intervals*

A list of start/end date pairs over which to evaluate the measures. These can be
most conveniently generated using the `starting_on()`

/`ending_on()`

methods on
`years`

, `months`

, and `weeks`

e.g.

```
intervals = months(12).starting_on("2020-01-01")
```

The `numerator`

, `denominator`

and `intervals`

arguments can be omitted if
default values for them have been set using
`define_defaults()`

.

`π`

**define_defaults**(*numerator=None*,*denominator=None*,*group_by=None*,*intervals=None*)When defining several measures which share common arguments you can reduce repetition by defining default values for the measures.

Note that you can only define a single set of defaults and attempting to call this method more than once is an error.

####
**INTERVAL**
π

**INTERVAL**

This is a placeholder value to be used when defining numerator, denominator and group_by columns in a measure. This allows these definitions to be written once and then be automatically evaluated over multiple different intervals. It can be used just like any pair of dates in ehrQL e.g.

```
clincial_events.date.is_during(INTERVAL)
```

`π`

**start_date**Placeholder for the start date (inclusive) of the interval. Can be used like any other date e.g.

```
clinical_events.date.is_on_or_after(INTERVAL.start_date)
```

`π`

**end_date**Placeholder for the end date (inclusive) of the interval. Can be used like any other date e.g.

```
clinical_events.date.is_on_or_before(INTERVAL.end_date)
```