Install pixi, which manages the dependencies needed to run the project.
Once pixi is installed, clone the repository and run the following inside the project directory:

    # Install all pixi-friendly dependencies
    pixi install
    # Install missing R dependencies
    pixi run post_install

To run the Python notebooks, it is recommended to install CUDA 12.6. After installation, verify GPU availability in PyTorch:

    import torch
    print(torch.cuda.is_available())

All main scripts can be found in the `pipeline` directory:

| Script | Description | Command |
|---|---|---|
| `preprocessing.R` | Initial pre-processing of Electronic Health Records consisting of early warning score measurements and vital signs for individuals residing in Denmark with a general admission to hospitals in the Region of Zealand, Denmark, between 2018 and 2023. | `pixi run preprocessing` |
| `extract_metadata.R` | Addition of other clinical data, consisting of procedures, diagnoses, blood tests, and ITA information. | `pixi run extract_metadata` |
| `extract_embeddings.py` | Addition of text embeddings from the metadata using static embeddings. | `pixi run extract_embeddings` |
| `analysis_main.R` | Comparison of various models and algorithms for early warning systems:<br>• Implementation of the weighting model (CBPS) for the individuals<br>• NEWS (National Early Warning Score)<br>• Simplified NEWS: NEWS2 - Blood Pressure - Temperature<br>• DEWS (Demographic Early Warning Score): Simplified NEWS + Age + Sex<br>• XGB-EWS: Age + Sex + Vital Signs + Number of Previous Hospitalizations + Embeddings of Previous Medical Procedures and Diagnoses + historical averages of blood test values + time-related recording information<br>• Grouped Cross-Validation based on hospitals<br>• AUC, Brier Score, Calibration, Net Benefit (Differences) | `pixi run analysis_main` |
| `analysis_composite_outcome.R` | Analysis of composite outcomes (ICU + Death). | `pixi run analysis_composite` |

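
The table lists Brier score and net benefit among the evaluation metrics. As a rough illustration of what these two quantities measure, here is a minimal Python sketch. It is not taken from the repository (where the analysis is done in R); the function names `brier_score` and `net_benefit` and the toy data are assumptions made for this example. Net benefit is computed with the standard decision-curve formula, TP/n - FP/n * t/(1 - t), at a risk threshold t.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit at a risk threshold: TP/n - FP/n * t / (1 - t)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # A patient is flagged when the predicted risk reaches the threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy outcomes and predicted risks for two hypothetical models.
    outcomes = [1, 0, 0, 1]
    risks_a = [0.9, 0.2, 0.6, 0.4]
    risks_b = [0.7, 0.1, 0.3, 0.8]
    print("Brier (A):", brier_score(outcomes, risks_a))
    # A net benefit difference compares two models at the same threshold.
    diff = net_benefit(outcomes, risks_a, 0.25) - net_benefit(outcomes, risks_b, 0.25)
    print("Net benefit difference (A - B):", diff)
```

Lower Brier scores indicate better-calibrated, sharper predictions, while a positive net benefit difference favours the first model at the chosen threshold.
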
- Static embeddings of medical procedures/diagnoses trajectories using model2vec's potion-multilingual-128M model
- Logistic regression for Covariate Balancing Propensity Score (CBPS) using the weightit R package
- Assessment of the current NEWS system based on predictive performance metrics using data-splitting techniques ✅
- De-biasing the dataset with IPW (Inverse Probability Weighting) based on intervention scenarios ✅
- Development of alternative early warning score systems and model comparison ✅
- Outcome: 24-hour mortality prediction after initial NEWS score ✅
- Used scores: Initial score at admission ✅
- Assess calibration and net benefit on various strata of target population ✅
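
To make the IPW step above more concrete, the following Python sketch shows how inverse probability weights are typically derived once propensity scores are available. It is an illustration only: in this project the propensity model is fitted with CBPS in R, whereas here the scores are simply passed in as numbers, and the function name `ipw_weights`, its `stabilized` option, and the toy inputs are assumptions made for the example.

```python
from typing import List, Sequence


def ipw_weights(
    treated: Sequence[int],
    propensity: Sequence[float],
    stabilized: bool = False,
) -> List[float]:
    """Inverse probability weights from propensity scores.

    treated[i] is 1 if individual i received the intervention, else 0.
    propensity[i] is the estimated probability of receiving it.
    """
    if len(treated) != len(propensity) or len(treated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(not 0.0 < e < 1.0 for e in propensity):
        raise ValueError("propensity scores must lie strictly between 0 and 1")

    # Marginal probability of intervention, used only for stabilized weights.
    p_treated = sum(treated) / len(treated)

    weights = []
    for t, e in zip(treated, propensity):
        if t == 1:
            w = 1.0 / e
            if stabilized:
                w *= p_treated
        else:
            w = 1.0 / (1.0 - e)
            if stabilized:
                w *= 1.0 - p_treated
        weights.append(w)
    return weights


if __name__ == "__main__":
    # Toy data: two individuals with the intervention, two without.
    print(ipw_weights([1, 0, 1, 0], [0.8, 0.2, 0.5, 0.5]))
    print(ipw_weights([1, 0, 1, 0], [0.8, 0.2, 0.5, 0.5], stabilized=True))
```

Individuals whose intervention status was unlikely given their covariates receive larger weights, so the weighted sample resembles one in which the intervention was assigned independently of those covariates. Stabilized weights rescale by the marginal intervention probability to reduce variance.
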