
A scikit-learn pipeline that predicts survival after pediatric bone-marrow transplant. It cleans 37 clinical variables, separates numeric features from low-cardinality categorical ones, scales the former and one-hot encodes the latter, reduces dimensionality with PCA, and tunes a logistic-regression classifier, reaching 76.3 % test accuracy with 30 PCA components.


AnastasChoudra/Bone-Marow-Transplants


Bone-Marrow Transplant (Children) – Survival Classification

Dataset
UCI “Bone Marrow Transplant Children”
The dataset describes pediatric patients with several hematologic diseases who underwent unmanipulated allogeneic unrelated-donor hematopoietic stem cell transplantation. It contains 37 clinical variables for 187 patients; the target is survival_status (0 = alive, 1 = dead).

Project
End-to-end scikit-learn Pipeline that

  • imputes & one-hot-encodes categorical vars (≤ 7 unique values)
  • imputes & standard-scales numeric vars (> 7 unique values)
  • reduces dimensionality with PCA
  • classifies with Logistic Regression + hyper-parameter tuning (C, PCA components)
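The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names `split_columns` and `build_pipeline`, the imputation strategies (most frequent value for categoricals, mean for numerics), the toy column names, and `n_components=3` are all assumptions chosen for the example. Only the 7-unique-value threshold and the overall step order come from the description.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def split_columns(X, max_categories=7):
    """Split column names by unique-value count (NaN is not counted)."""
    n_unique = X.nunique()
    cat_cols = n_unique[n_unique <= max_categories].index.tolist()
    num_cols = n_unique[n_unique > max_categories].index.tolist()
    return cat_cols, num_cols


def build_pipeline(cat_cols, num_cols, n_components):
    """Impute + encode/scale per column group, then PCA, then logistic regression."""
    cat_steps = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed strategy
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    num_steps = Pipeline([
        ("impute", SimpleImputer(strategy="mean")),  # assumed strategy
        ("scale", StandardScaler()),
    ])
    preprocess = ColumnTransformer(
        [("cat", cat_steps, cat_cols), ("num", num_steps, num_cols)],
        sparse_threshold=0,  # always return a dense array for PCA
    )
    return Pipeline([
        ("preprocess", preprocess),
        ("pca", PCA(n_components=n_components)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Toy stand-in for the clinical table (hypothetical columns, random values).
rng = np.random.default_rng(0)
X_toy = pd.DataFrame({
    "age": rng.uniform(1, 18, size=40),
    "dose": rng.normal(10, 2, size=40),
    "sex": rng.integers(0, 2, size=40).astype(float),
    "risk": rng.integers(0, 3, size=40).astype(float),
})
X_toy.loc[[3, 7], "age"] = np.nan   # simulate missing values
X_toy.loc[[5, 11], "risk"] = np.nan
y_toy = rng.integers(0, 2, size=40)

cat_cols, num_cols = split_columns(X_toy)
model = build_pipeline(cat_cols, num_cols, n_components=3)
model.fit(X_toy, y_toy)
preds = model.predict(X_toy)
```

Keeping imputation, encoding, and scaling inside the pipeline means they are re-fitted on each training fold during cross-validation, which avoids leaking statistics from held-out data.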

Results
Best test accuracy: 76.3 %
Chosen PCA components: 30
Logistic Regression: C = 100
(5-fold GridSearchCV, random_state = 1, 80/20 train-test split)
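A tuning setup of this kind might look like the sketch below. It is an assumption-laden illustration rather than the repository's code: the data are synthetic (`make_classification` with 187 rows to mirror the dataset size), the candidate grids for `C` and the PCA component count are invented, and `random_state=1` is assumed to apply to the train-test split. The pipeline is reduced to scaling, PCA, and logistic regression to keep the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed clinical features (assumption).
X, y = make_classification(
    n_samples=187, n_features=40, n_informative=10, random_state=1
)

# 80/20 split; the seed placement is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical search grid; step names prefix the parameter names.
param_grid = {
    "pca__n_components": [10, 20, 30],
    "clf__C": [0.01, 1, 100],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # accuracy on the held-out 20 %
print(best_params, round(test_accuracy, 3))
```

Scoring on the held-out split only after the search finishes keeps the reported test accuracy independent of the hyper-parameter selection.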
