Modeling of trajectories of routine blood values as dynamic biomarkers in spinal cord injury

Repository that reproduces the study on modeling routine blood trajectories in spinal cord injury (SCI) from electronic health record (EHR) data, performed by the Health.data DRIVEN lab at the School of Public Health Sciences, University of Waterloo. Code created by Drs. Marzieh Mussavi Rizi and Abel Torres Espin.

For details, please see our publications.

Peer-reviewed publication

Mussavi Rizi, M., Fernández, D., Kramer, J.L.K. et al. Modeling trajectories of routine blood tests as dynamic biomarkers for outcome in spinal cord injury. npj Digit. Med. 8, 470 (2025). https://doi.org/10.1038/s41746-025-01782-0

Pre-Print

Modeling trajectories of routine blood tests as dynamic biomarkers for outcome in spinal cord injury. Marzieh Mussavi Rizi, Daniel Fernandez, John LK Kramer, Rajiv Saigal, Anthony M. DiGiorgio, Michael S. Beattie, Adam R Ferguson, Nikos Kyritsis, Abel Torres-Espin, TRACK-SCI investigators. medRxiv 2025.01.20.25320728; doi: https://doi.org/10.1101/2025.01.20.2532072

Repo structure

blood_trajectory_analysis.Rmd: contains all the main code to reproduce our modeling and analysis
functions.R: script with the main custom functions. It is loaded into the environment by blood_trajectory_analysis.Rmd
models_GMM: set of GMM models fitted using the lcmm package. These are .Rds files with the model objects
models_GMM_selected: set of final selected GMM models fitted using the lcmm package. These are .Rds files with the model objects
prediction_experiments: set of .Rds files with R objects (lists and data frames) containing all information on the prediction experiments (see publications)
figures: output folder for figures
tables: output folder for tables
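
As a minimal sketch of how the stored objects can be inspected without re-fitting anything, the snippet below reads every .Rds file in a folder into a named list. The helper name `read_rds_folder` is hypothetical (it is not part of functions.R), and the usage lines assume the working directory is the root of the cloned repo. Printing or summarising the model objects will likely also require the lcmm package to be loaded.

```r
# Hypothetical helper (not part of functions.R): read all .Rds files in a folder
# into a named list, one element per file.
read_rds_folder <- function(dir) {
  if (!dir.exists(dir)) {
    stop("Folder not found: ", dir, " (is the working directory the repo root?)")
  }
  files <- list.files(dir, pattern = "\\.[Rr]ds$", full.names = TRUE)
  # Name each element after its file, without the extension
  names(files) <- tools::file_path_sans_ext(basename(files))
  lapply(files, readRDS)
}

# Usage, assuming the repo has been cloned and is the working directory:
# selected_models <- read_rds_folder("models_GMM_selected")
# names(selected_models)
```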

Notes on reproducing this work

Datasets

We do not provide the datasets directly, and users of this code will need to download the data. Three datasets are used in this work: MIMIC-III version 1.4, MIMIC-IV version 1.0, and a subset of the TRACK-SCI cohort study. The .Rmd script contains the necessary code to prepare the data for analysis.

MIMIC

Data were downloaded from PhysioNet. Both MIMIC databases (DB) are relational DBs structured in tables. Documentation about the DB schema can be found here.

Note that data access requires a Data Use Agreement with PhysioNet. No data is provided in this document or repository. The code will not run without the data!

TRACK-SCI dataset

The necessary TRACK-SCI data can be downloaded from the Open Data Commons for Spinal Cord Injury (SCI) here. If you use the data, please cite:

Mussavi Rizi, M., Saigal, R., DiGiorgio, A. M., Ferguson, A. R., Beattie, M. S., Kyritsis, N., Torres Espin, A. 2025. Blood laboratory values from 137 de-identified TRACK-SCI participants from routine collected real-world data. Open Data Commons for Spinal Cord Injury. ODC-SCI:1345. doi: 10.34945/F5PK6X

SAPSII

Part of this work uses SAPS II values for both MIMIC datasets. If you want to reproduce our work using this code, you will need to calculate them first and save them in a mimic_SC_saps.csv file that contains four columns: subject_id = subject identifier; hadm_id = hospital admission identifier; icustay_id = ICU stay identifier; sapsii = calculated SAPS II.

For MIMIC-III, we computed SAPS II scores for the selected cohort using SQL code publicly available on GitHub (https://github.com/MIT-LCP/mimic-code/blob/main/mimic-iii/concepts/severityscores/sapsii.sql). For MIMIC-IV, we used the equivalent script (https://github.com/MIT-LCP/mimic-code/blob/main/mimic-iv/concepts/score/sapsii.sql).
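
The mimic_SC_saps.csv layout described in this section can be checked before running the analysis. The sketch below is only an illustration: the helper name `check_saps_file` is hypothetical, and the default path assumes the file sits in the working directory, which this README does not specify.

```r
# Hypothetical helper: validate the layout of mimic_SC_saps.csv.
# Assumption: the file is a plain comma-separated file with a header row.
check_saps_file <- function(path = "mimic_SC_saps.csv") {
  required <- c("subject_id", "hadm_id", "icustay_id", "sapsii")
  saps <- utils::read.csv(path, stringsAsFactors = FALSE)

  # All four columns named in the README must be present
  missing_cols <- setdiff(required, names(saps))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # The score itself should have been read as a number
  if (!is.numeric(saps$sapsii)) {
    stop("Column 'sapsii' must be numeric")
  }
  # Return only the required columns, in the documented order
  saps[, required]
}

# Usage, once the file has been created:
# saps <- check_saps_file("mimic_SC_saps.csv")
```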

Dependencies

The code should run with the following environment. Further information can be found in the .Rmd file.

"R version 4.4.1 (2024-06-14 ucrt)", "tidyverse 2.0.0", "data.table 1.17.0", "stringr 1.5.1", "DT 0.33", "gtsummary 2.2.0", "lcmm 1.9.4", "caret 7.0-1", "yardstick 1.3.2", "patchwork 1.3.0", "parallel (base with R 4.4.1)"
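
To compare a local installation against the versions listed above, something like the following sketch may help. The helper name `compare_versions` is hypothetical. Versions are compared with `numeric_version()` so that "7.0-1" and "7.0.1" are treated as the same version, and packages that are not installed are reported as `NA` rather than raising an error.

```r
# Package versions as listed in this README (parallel ships with base R)
expected <- c(
  tidyverse = "2.0.0", data.table = "1.17.0", stringr = "1.5.1", DT = "0.33",
  gtsummary = "2.2.0", lcmm = "1.9.4", caret = "7.0-1", yardstick = "1.3.2",
  patchwork = "1.3.0"
)

# Hypothetical helper: compare installed package versions against expected ones.
compare_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    # system.file() returns "" when the package is not installed
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))

  matches <- vapply(seq_along(expected), function(i) {
    !is.na(installed[[i]]) &&
      numeric_version(installed[[i]]) == numeric_version(expected[[i]])
  }, logical(1))

  data.frame(
    package = names(expected),
    expected = unname(expected),
    installed = unname(installed),
    matches = matches,
    row.names = NULL
  )
}

print(compare_versions(expected))
```

A mismatch does not necessarily mean the code will fail; it only flags where the local environment differs from the one reported here.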

Running the code

Some sections of the .Rmd script are not evaluated during knitting (rendering) due to their computational overhead. We have provided intermediate files containing the necessary objects and byproducts of the code, including the final trajectory models, to facilitate reproducibility. By cloning this repo, you should be able to reproduce our results without having to re-fit all the models, although you can re-fit them if you wish. To reproduce this work, you will need to run the code chunk by chunk in an IDE.

Running the full script from scratch will overwrite some of the provided files, and it can take hours to days to complete.
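
One way to avoid accidentally overwriting the provided files when experimenting is a load-or-fit pattern, sketched below. This is not how the .Rmd script is organised (it relies on non-evaluated chunks); the helper `load_or_fit` and its `refit` argument are hypothetical, and the toy usage fits a small linear model on a built-in dataset as a stand-in for an expensive model.

```r
# Hypothetical helper: reuse a saved object if it exists, otherwise compute and save it.
load_or_fit <- function(path, fit_fun, refit = FALSE) {
  if (file.exists(path) && !refit) {
    return(readRDS(path))  # reuse the stored object, nothing is overwritten
  }
  fit <- fit_fun()         # expensive step, only run when needed or requested
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(fit, path)
  fit
}

# Toy usage with a cheap stand-in model (not one of the study's models),
# written to a temporary folder so no repo file is touched.
toy_path <- file.path(tempdir(), "toy_model.Rds")
toy_fit <- load_or_fit(toy_path, function() lm(dist ~ speed, data = cars))
```

Setting `refit = TRUE` forces the computation and replaces the stored file, so it is best used on a copy of the provided folders.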
