
# mit-tmle-glucose

Use of causal inference to find the optimal glucose range in septic ICU patients.

## How to run this project?

### 1. Clone this repository

Run the following command in your terminal:

```bash
git clone https://github.com/mirkompcr/mit-tmle-glucose
```

### 2. Install required packages

**R scripts.** Run the following command in R:

```r
source('setup/install_packages.R')
```

**Python scripts.** Run the following command in your terminal:

```bash
pip3 install -r setup/requirements_py.txt
```

### 3. Get the data!

MIMIC data can be found on PhysioNet, a repository of freely available medical research data managed by the MIT Laboratory for Computational Physiology. Due to its sensitive nature, credentialing is required.

Documentation for MIMIC-IV can be found here.

#### Integration with Google Cloud Platform (GCP)

This section explains how to set up GCP and your environment so that you can run SQL queries through GCP directly from your local Python setup. Follow these steps:

1. Create a Google account if you don't have one and go to Google Cloud Platform
2. Enable the BigQuery API
3. Create a Service Account, from which you can download your JSON keys
4. Place your JSON keys in the parent folder (for example) of your project
5. Create a `.env` file with `nano .env` or `touch .env` (Mac and Linux) or `echo. > .env` (Windows)
6. Update your `.env` file with the path to your JSON keys and the ID of your project in BigQuery

Follow this format:

```
KEYS_FILE = "../GoogleCloud_keys.json"
PROJECT_ID = "project-id"
```
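At runtime these settings are typically loaded from the `.env` file (the python-dotenv package is the usual choice). Purely as an illustration of the format above, a minimal hand-rolled parser could look like this; `parse_env` is a hypothetical helper, not part of this repository:

```python
import re

def parse_env(text):
    """Parse simple KEY = "value" lines from a .env file into a dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.match(r'(\w+)\s*=\s*"?([^"]*)"?\s*$', line)
        if match:
            env[match.group(1)] = match.group(2)
    return env

sample = 'KEYS_FILE = "../GoogleCloud_keys.json"\nPROJECT_ID = "project-id"'
config = parse_env(sample)
print(config["KEYS_FILE"])   # → ../GoogleCloud_keys.json
print(config["PROJECT_ID"])  # → project-id
```

In practice, prefer `dotenv.load_dotenv()` so the values land in environment variables that the rest of the pipeline can read.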

#### MIMIC-IV

After getting credentialed at PhysioNet, you must sign the data use agreement and connect the database to GCP, either by requesting access or by uploading the data to your project.

Make sure you copy all required files and folders from `physionet-data` into your project.

Run all SQL scripts sequentially, from 1 to 7. Then run the following command to fetch the aggregated data as a dataframe, which will be saved as a CSV in your local project:

```bash
python3 src/2_cohorts/1_get_data.py --sql "src/1_sql/8_MIMIC_gluc_agg.sql" --destination "data/MIMIC_agg.csv"
```
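Conceptually, a fetch step like this reads the SQL file, runs the query against BigQuery, and writes the result to CSV. The sketch below is illustrative only, not the repository's actual script: the name `fetch_to_csv` and the injectable `run_query` callable are inventions for this example, standing in for a real BigQuery client call.

```python
import csv
from pathlib import Path

def fetch_to_csv(sql_path, destination, run_query):
    """Read a SQL file, execute it via run_query, and save the rows as CSV.

    run_query is any callable that takes a SQL string and returns a list
    of dicts (one per row). Returns the number of rows written.
    """
    sql = Path(sql_path).read_text()
    rows = run_query(sql)
    if rows:
        with open(destination, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return len(rows)
```

In the real pipeline, `run_query` would wrap a BigQuery client authenticated with the JSON keys referenced in your `.env` file (for example, one built via `google.cloud.bigquery.Client.from_service_account_json`).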

### 4. Get the cohorts

#### 4.1 Get the cohorts ready for analysis

With the following command, you can get the same cohorts we used for the study. Run the command in your terminal:

**MIMIC-IV**

```bash
python3 src/2_cohorts/2_MIMIC_cohort.py
```

This will create the file `data/cohorts/MIMIC_agg.csv`.

#### 4.2 Get a merged dataframe ready

Run the following command in your R console:

```r
source("src/2_cohorts/3_load_data.R")
```

This will create the file `data/cohorts/MIMIC_agg_cleaned.csv` and will also upload it to your personal GCP project.

Now you can run this command to fetch the hourly data as a dataframe/CSV:

```bash
python3 src/2_cohorts/1_get_data.py --sql "src/1_sql/8_MIMIC_gluc_hourly.sql" --destination "data/cohorts/MIMIC_hr.csv"
```

### 5. Analyses

#### 5.1 Run the TMLE analysis

We made this part really easy for you. All you have to do is run, in R:

```r
source("src/4_tmle/tmle3_shift_parall.R")
```

You'll find the resulting ATEs in `results/tmle/`.

#### 5.2 Run the joint model analysis

This part is just as easy. Run, in R:

```r
source("src/5_joint_modelling/joint_model_vs.R")
```

You'll find the resulting coefficients in `results/JM/`.