Command line interface

ehrql [--help] [--version] COMMAND_NAME ...

The command line interface for ehrQL, a query language for electronic health record (EHR) data.

COMMAND_NAME

Name of the sub-command to execute.

generate-dataset

Take a dataset definition file and output a dataset.

generate-measures

Take a measures definition file and output measures.

dump-example-data

Dump example data for the ehrQL tutorial to the current directory.

dump-dataset-sql

Output the SQL that would be executed to fetch the results of the dataset definition.

create-dummy-tables

Generate dummy tables and write them out as files – one per table, CSV by default.

assure

Command for running assurance tests.

test-connection

Internal command for testing the database connection configuration.

serialize-definition

Internal command for serializing a definition file to a JSON representation.

isolation-report

Internal command for testing code isolation support.

graph-query

Output the dataset definition's query graph.

debug

Internal command for getting debugging information from a dataset definition; used by the OpenSAFELY VSCode extension.

-h, --help

Show this help message and exit.

--version

Show the exact version of ehrQL in use and then exit.

generate-dataset

ehrql generate-dataset DEFINITION_FILE [--help] [--output OUTPUT_FILE]
      [--test-data-file TEST_DATA_FILE] [--dummy-data-file DUMMY_DATA_FILE]
      [--dummy-tables DUMMY_TABLES_PATH] [--dsn DSN]
      [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]

Take a dataset definition file and output a dataset.

ehrQL is designed so that exactly the same command can be used to output a dummy dataset when run on your own computer and then output a real dataset when run inside the secure environment as part of an OpenSAFELY pipeline.

DEFINITION_FILE

Path of the Python file where the dataset is defined.

-h, --help

Show this help message and exit.

--output OUTPUT_FILE

Path of the file where the dataset will be written (console by default).

The file extension determines the file format used. Supported formats are: .arrow, .csv, .csv.gz
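The extension rule can be pictured with a small sketch. The helper name `output_format` is hypothetical and is not part of ehrQL; only the three extensions come from the list above. It illustrates that `.csv.gz` has to be matched as a whole suffix, since the final extension on its own would be `.gz`.

```python
from pathlib import PurePath

# Extensions listed in this section; everything else here is illustrative.
SUPPORTED_FORMATS = (".csv.gz", ".arrow", ".csv")


def output_format(filename: str) -> str:
    """Return the supported extension that `filename` ends with.

    Hypothetical helper, not ehrQL's own implementation.
    """
    name = PurePath(filename).name
    for extension in SUPPORTED_FORMATS:
        if name.endswith(extension):
            return extension
    raise ValueError(f"Unsupported file format: {filename}")
```

Under these assumptions, `output_format("output/dataset.csv.gz")` would select the compressed CSV format, while a name such as `dataset.txt` would be rejected.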

--test-data-file TEST_DATA_FILE

Path of a test dataset definition file.

--dummy-data-file DUMMY_DATA_FILE

Path to a dummy dataset.

This allows you to take complete control of the dummy dataset. ehrQL will ensure that the column names, types and categorical values match what they will be in the real dataset, but does no further validation.

Note that the dummy dataset doesn't need to be of the same type as the real dataset (e.g. you can use a .csv file here to produce a .arrow file).

This argument is ignored when running against real tables.
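To give a feel for the column-name part of the check described above, here is a minimal sketch using only the standard library. The function `compare_columns` and the column names in the usage note are invented for illustration; ehrQL's real validation also covers types and categorical values and is not reproduced here.

```python
import csv
import io


def compare_columns(csv_text, expected_columns):
    """Compare a CSV header against the column names a dataset should have.

    Returns (missing, unexpected) as two lists. Hypothetical helper that
    only looks at names, not at types or categorical values.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    missing = [name for name in expected_columns if name not in header]
    unexpected = [name for name in header if name not in expected_columns]
    return missing, unexpected
```

For a dummy file whose header is `patient_id,age` and an expected column list of `patient_id`, `age` and `sex`, this sketch would report `sex` as missing and nothing as unexpected.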

--dummy-tables DUMMY_TABLES_PATH

Path to directory of files (one per table) to use as dummy tables (see create-dummy-tables).

Files may be in any supported format: .arrow, .csv, .csv.gz

This argument is ignored when running against real tables.

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.
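As a rough illustration, a definition file could read such parameters with Python's standard `argparse` module. This is a sketch under assumptions: the `--year` and `--region` parameter names are invented for the example, and it assumes that the values after `--` reach the definition file as its command-line arguments.

```python
from argparse import ArgumentParser


def parse_parameters(argv=None):
    """Parse user-defined parameters passed after the `--` separator.

    `--year` and `--region` are hypothetical names chosen for this example.
    With argv=None, argparse reads from sys.argv[1:].
    """
    parser = ArgumentParser()
    parser.add_argument("--year", type=int, required=True)
    parser.add_argument("--region", default="all")
    return parser.parse_args(argv)


# Stands in for a call such as:
#   ehrql generate-dataset dataset_definition.py -- --year 2021
args = parse_parameters(["--year", "2021"])
print(args.year, args.region)
```

Passing the list explicitly keeps the sketch runnable on its own; inside a real definition file you would more likely call `parse_parameters()` with no arguments.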

Internal Arguments

You should not normally need to use these arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.

--dsn DSN

Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).
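The fallback order described here (explicit argument first, then the `DATABASE_URL` environment variable) might look like the sketch below. The function name `resolve_dsn` is hypothetical, and the optional `environ` argument exists only so the behaviour can be tried without touching the real environment.

```python
import os


def resolve_dsn(cli_value=None, environ=None):
    """Return the explicit --dsn value if given, else DATABASE_URL, else None.

    Hypothetical helper; it only mirrors the default described in the text.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get("DATABASE_URL")
```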

--query-engine QUERY_ENGINE_CLASS

Dotted import path to Query Engine class, or one of: mssql, sqlite, localfile, trino, csv
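A "dotted import path" names a module followed by a class inside it. The sketch below shows one common way such a path can be resolved in Python; it is not ehrQL's own loader, it does not model the short aliases listed above, and it uses a standard-library class as a stand-in because no real query engine path is given in this reference.

```python
from importlib import import_module


def load_class(dotted_path: str):
    """Import `package.module.ClassName` and return the class object.

    Hypothetical helper illustrating the general technique only.
    """
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Not a dotted import path: {dotted_path!r}")
    module = import_module(module_path)
    return getattr(module, class_name)


# Stand-in example using the standard library rather than a query engine.
ordered_dict_class = load_class("collections.OrderedDict")
```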

--backend BACKEND_CLASS

Dotted import path to Backend class, or one of: emis, tpp

generate-measures

ehrql generate-measures DEFINITION_FILE [--help] [--output OUTPUT_FILE]
      [--dummy-data-file DUMMY_DATA_FILE] [--dummy-tables DUMMY_TABLES_PATH]
      [--dsn DSN] [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]

Take a measures definition file and output measures.

DEFINITION_FILE

Path of the Python file where measures are defined.

-h, --help

Show this help message and exit.

--output OUTPUT_FILE

Path where measure output will be written (console by default), supported formats: .arrow, .csv, .csv.gz

Specify a single file to get data for all measures combined together e.g. --output results/measures.arrow

Specify a directory to get each measure in a separate file e.g. --output results/measures/:arrow
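The two forms of the `--output` value can be told apart by their shape: a single file ends in a supported extension, while a directory is followed by `:` and a format name. The sketch below classifies a value on that basis. `parse_output_spec` is a hypothetical helper and ehrQL's actual parsing may differ.

```python
# Format names taken from the supported formats listed above.
SUPPORTED_FORMATS = ("arrow", "csv", "csv.gz")


def parse_output_spec(spec: str):
    """Classify an output value as a single file or a directory plus format.

    Returns ("directory", path, format) or ("file", path, format).
    Hypothetical helper, not ehrQL's own implementation.
    """
    path, separator, fmt = spec.rpartition(":")
    if separator and fmt in SUPPORTED_FORMATS:
        return ("directory", path, fmt)
    # Check longer extensions first so `.csv.gz` is matched as a whole.
    for fmt in sorted(SUPPORTED_FORMATS, key=len, reverse=True):
        if spec.endswith("." + fmt):
            return ("file", spec, fmt)
    raise ValueError(f"Unrecognised output value: {spec!r}")
```

With the two examples above, `results/measures.arrow` would be treated as one combined file and `results/measures/:arrow` as a directory of per-measure Arrow files.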

--dummy-data-file DUMMY_DATA_FILE

Path to dummy measures output.

This allows you to take complete control of the dummy measures output. ehrQL will ensure that the column names, types and categorical values match what they will be in the real measures output, but does no further validation.

Note that the dummy measures output doesn't need to be of the same type as the real measures output (e.g. you can use a .csv file here to produce a .arrow file).

You can either supply a single file containing data for all the measures combined, or a directory of individual files – one for each measure.

This argument is ignored when running against real tables.

--dummy-tables DUMMY_TABLES_PATH

Path to directory of files (one per table) to use as dummy tables (see create-dummy-tables).

Files may be in any supported format: .arrow, .csv, .csv.gz

This argument is ignored when running against real tables.

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

Internal Arguments

You should not normally need to use these arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.

--dsn DSN

Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).

--query-engine QUERY_ENGINE_CLASS

Dotted import path to Query Engine class, or one of: mssql, sqlite, localfile, trino, csv

--backend BACKEND_CLASS

Dotted import path to Backend class, or one of: emis, tpp

dump-example-data

ehrql dump-example-data [--help] [--dst-dir DST_DIR]

Dump example data for the ehrQL tutorial to the current directory.

-h, --help

Show this help message and exit.

-d, --dst-dir DST_DIR

Destination folder ('example-data' by default).

dump-dataset-sql

ehrql dump-dataset-sql DEFINITION_FILE [--help] [--output OUTPUT_FILE]
      [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
      [ -- ... PARAMETERS ...]

Output the SQL that would be executed to fetch the results of the dataset definition.

By default, this command will output SQL suitable for the SQLite database. To get the SQL as it would be run against the real tables you will need to supply the appropriate --backend argument, for example --backend tpp.

Note that due to configuration differences this may not always exactly match what gets run against the real tables.

DEFINITION_FILE

Path of the Python file where the dataset is defined.

-h, --help

Show this help message and exit.

--output OUTPUT_FILE

SQL output file (outputs to console by default).

--query-engine QUERY_ENGINE_CLASS

Dotted import path to Query Engine class, or one of: mssql, sqlite, localfile, trino, csv

--backend BACKEND_CLASS

Dotted import path to Backend class, or one of: emis, tpp

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

create-dummy-tables

ehrql create-dummy-tables DEFINITION_FILE [DUMMY_TABLES_PATH] [--help]
      [ -- ... PARAMETERS ...]

Generate dummy tables and write them out as files – one per table, CSV by default.

This command generates the same dummy tables that the generate-dataset command would generate, but instead of using them to produce a dummy dataset, it writes them out as individual files.

The directory containing these files can then be used as the --dummy-tables argument to generate-dataset to produce the dummy dataset.

The files can be edited in any way you wish, giving you full control over the dummy tables.
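The two-step workflow described above can be sketched as a pair of command lines, built here as Python argument lists so they can be inspected or passed to `subprocess.run`. The file and directory names (`dataset_definition.py`, `dummy_tables`, `output/dataset.csv`) are placeholders, and `dummy_table_workflow` is a hypothetical helper; only the sub-commands and options come from this reference.

```python
import shlex


def dummy_table_workflow(definition_file, tables_dir, output_file):
    """Build the two commands: write dummy tables, then reuse them."""
    create_tables = ["ehrql", "create-dummy-tables", definition_file, tables_dir]
    generate_dataset = [
        "ehrql", "generate-dataset", definition_file,
        "--dummy-tables", tables_dir,
        "--output", output_file,
    ]
    return create_tables, generate_dataset


# Print the commands as they would be typed in a shell.
for command in dummy_table_workflow(
    "dataset_definition.py", "dummy_tables", "output/dataset.csv"
):
    print(shlex.join(command))
```

Between the two steps, the files in the dummy tables directory can be edited by hand before the dataset is generated from them.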

DEFINITION_FILE

Path of the Python file where the dataset is defined.

DUMMY_TABLES_PATH

Path to directory where files (one per table) will be written.

By default these will be CSV files. To generate files in other formats add :<format> to the directory name e.g. my_outputs:arrow, my_outputs:csv, my_outputs:csv.gz

-h, --help

Show this help message and exit.

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

assure

ehrql assure TEST_DATA_FILE [--help] [ -- ... PARAMETERS ...]

Command for running assurance tests.

TEST_DATA_FILE

Path of the file where the test data is defined.

-h, --help

Show this help message and exit.

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

test-connection

ehrql test-connection [--help] [-b BACKEND_CLASS] [-u URL]

Internal command for testing the database connection configuration.

Note that this is an internal command and not intended for end users.

-h, --help

Show this help message and exit.

-b, --backend BACKEND_CLASS

Dotted import path to Backend class, or one of: emis, tpp

-u, --url URL

Database connection string.

serialize-definition

ehrql serialize-definition DEFINITION_FILE [--help]
      [--definition-type DEFINITION_TYPE] [--output OUTPUT_FILE]
      [--dummy-tables DUMMY_TABLES_PATH] [--display-format RENDER_FORMAT]
      [ -- ... PARAMETERS ...]

Internal command for serializing a definition file to a JSON representation.

Note that this is an internal command and not intended for end users.

DEFINITION_FILE

Definition file path

-h, --help

Show this help message and exit.

-t, --definition-type DEFINITION_TYPE

Options: dataset, measures, test, debug

-o, --output OUTPUT_FILE

Output file path (stdout by default)

--dummy-tables DUMMY_TABLES_PATH

Path to directory of files (one per table) to use as dummy tables (see create-dummy-tables).

Files may be in any supported format: .arrow, .csv, .csv.gz

This argument is ignored when running against real tables.

--display-format RENDER_FORMAT

Render format for the debug command (ascii by default).

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

isolation-report

ehrql isolation-report [--help]

Internal command for testing code isolation support.

Note that this is an internal command and not intended for end users.

-h, --help

Show this help message and exit.

graph-query

ehrql graph-query DEFINITION_FILE [--help] OUTPUT_FILE [ -- ... PARAMETERS ...]

Output the dataset definition's query graph.

DEFINITION_FILE

Path of the Python file where the dataset is defined.

-h, --help

Show this help message and exit.

OUTPUT_FILE

SVG output file.

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.

debug

ehrql debug DEFINITION_FILE [--help] [--dummy-tables DUMMY_TABLES_PATH]
      [--display-format RENDER_FORMAT] [ -- ... PARAMETERS ...]

Internal command for getting debugging information from a dataset definition; used by the OpenSAFELY VSCode extension.

Note that this is an internal command and not intended for end users.

DEFINITION_FILE

Path of the Python file where the dataset is defined.

-h, --help

Show this help message and exit.

--dummy-tables DUMMY_TABLES_PATH

Path to directory of files (one per table) to use as dummy tables (see create-dummy-tables).

Files may be in any supported format: .arrow, .csv, .csv.gz

--display-format RENDER_FORMAT

Render format for the debug command (ascii by default).

PARAMETERS

Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash --.