- Detailed protocols are in the `protocol` folder.
- If you are interested in how we defined our code lists, look in the `codelists` folder.
- Analysis scripts are in the `analysis` directory:
  - Dataset definition scripts are in the `dataset_definition` directory:
    - If you are interested in how we defined our variables, we use the variable script `variable_helper_fuctions` to define functions that generate variables. We then apply these functions in `variables_cohorts` to create a dictionary of variables for cohort definitions, and in `variables_dates` to create a dictionary of variables for calculating study start dates and end dates.
    - If you are interested in how we defined study dates (e.g., index and end dates), these vary by cohort and are described in the protocol. We use the script `dataset_definition_dates` to generate a dataset with all required dates for each cohort. This script imports all variables generated from `variables_dates`.
    - If you are interested in how we defined our cohorts, we use the dataset definition script `dataset_definition_cohorts` to define a function that generates cohorts. This script imports all variables generated from `variables_cohorts` using the patient's index date, the cohort start date and the cohort end date. This approach is used to generate three cohorts: pre-vaccination, vaccinated, and unvaccinated, found in `dataset_definition_prevax`, `dataset_definition_vax`, and `dataset_definition_unvax`, respectively. For each cohort, the extracted data is initially processed in the preprocess data script, which generates a flag variable for pre-existing respiratory conditions and restricts the data to relevant variables.
  - Dataset cleaning scripts are in the `dataset_clean` directory:
    - This directory also contains all the R scripts that process, describe, and analyse the extracted data.
    - `dataset_clean` is the core script which executes all the other scripts in this folder
    - `fn-preprocess` is the function carrying out initial preprocessing, formatting columns correctly
    - `fn-modify_dummy` is called from within fn-preprocess.R, and alters the proportions of dummy variables to better suit analyses
    - `fn-inex` is the inclusion/exclusion function
    - `fn-qa` is the quality assurance function
    - `fn-ref` is the function that sets the reference levels for factors
  - Modelling scripts are in the `model` directory:
    - `make_model_input.R` works with the output of `dataset_clean` to prepare suitable data subsets for Cox analysis. Combines each outcome and subgroup in one formatted .rds file.
    - `fn-prepare_model_input.R` is a companion function to `make_model_input.R` which handles the interaction with `active_analyses.rds`.
    - `cox-ipw` is a reusable action which uses the output of `make_model_input.R` to fit a Cox model to the data.
    - `make_model_output.R` combines all the Cox results in one formatted .csv file.
- The `active_analyses` contains a list of active analyses.
- The `project.yaml` defines run-order and dependencies for all the analysis scripts. This file should not be edited directly. To make changes to the yaml, edit and run the `create_project_actions.R` script which generates all the actions.
- Descriptive and Model outputs, including figures and tables are in the `released_outputs` directory.
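
The dataset definition bullets above describe a three-step pattern: helper functions generate variables, a variables script collects them into a dictionary, and a cohort function applies that dictionary between an index date and an end date. The sketch below illustrates only that pattern in plain Python. It does not use the repository's real code or any OpenSAFELY API; every function and variable name in it is hypothetical, and it uses a single shared index date where the real study uses per-patient dates.

```python
from datetime import date

# Hypothetical helper, standing in for a function defined in a
# variable helper script: earliest event inside a follow-up window.
def first_event_between(events, start, end):
    in_window = [d for d in events if start <= d <= end]
    return min(in_window) if in_window else None

# Hypothetical stand-in for a variables script: returns a dictionary
# mapping variable names to functions of a patient's event dates.
def generate_variables(index_date, end_date):
    return {
        "out_date_example": lambda events: first_event_between(
            events, index_date, end_date
        ),
        "cov_bin_example_history": lambda events: any(
            d < index_date for d in events
        ),
    }

# Hypothetical stand-in for a cohort-generating function: applies every
# variable in the dictionary to every patient for the given dates.
def build_cohort(patients, index_date, end_date):
    variables = generate_variables(index_date, end_date)
    return {
        patient_id: {name: fn(events) for name, fn in variables.items()}
        for patient_id, events in patients.items()
    }

# Toy data: patient id -> list of event dates (invented for illustration).
patients = {
    1: [date(2020, 1, 1), date(2021, 3, 1)],
    2: [],
}
cohort = build_cohort(patients, date(2021, 1, 1), date(2021, 12, 31))
```

Calling `build_cohort` again with different dates reuses the same variable dictionary for another cohort, which is the reuse the pre-vaccination, vaccinated, and unvaccinated definitions rely on.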
Outputs follow OpenSAFELY naming conventions related to suppression rules by adding the suffix "_midpoint6". The suffix "_midpoint6_derived" means that the value(s) are derived from the midpoint6 values. Detailed information regarding naming conventions can be found here.
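
As an illustration of what the suffixes mean, here is a minimal sketch that assumes the midpoint-6 rounding rule commonly used for OpenSAFELY outputs: non-zero counts are replaced by the midpoint of their band of six (1 to 6 becomes 3, 7 to 12 becomes 9, and so on). This README does not spell the rule out, so treat the function as an assumption. The function name and the dictionary keys are hypothetical.

```python
def round_midpoint6(count, to=6):
    """Round a non-negative count to the midpoint of its band of `to`.

    Assumed rule: 0 stays 0; otherwise round up to the next multiple
    of `to` and subtract half a band.
    """
    if count == 0:
        return 0
    band_top = -(-count // to) * to  # integer ceiling to a multiple of `to`
    return band_top - to // 2

# Hypothetical raw counts that would not be released as-is.
events_raw, total_raw = 8, 14

row = {
    "events_midpoint6": round_midpoint6(events_raw),
    "total_midpoint6": round_midpoint6(total_raw),
}
# A "_midpoint6_derived" value is computed from rounded counts,
# never from the raw counts.
row["pct_midpoint6_derived"] = (
    100 * row["events_midpoint6"] / row["total_midpoint6"]
)
```

Because the percentage uses only the rounded counts, it can be recomputed from the released table without access to the underlying data.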
The OpenSAFELY framework is a Trusted Research Environment (TRE) for electronic health records research in the NHS, with a focus on public accountability and research quality.
Read more at OpenSAFELY.org.
As standard, research projects have an MIT license.