Clinical ML study with EDA, logistic regression for mortality prediction, K-Means risk cohorts, t-tests, and evaluation (accuracy/recall/F1/Kappa).

a7madv4d2/Heart-Failure-Analysis

❤️ Heart Failure Risk Prediction and Cluster Analysis

📊 Project Overview:

This project uses machine learning and statistical analysis on clinical data to predict heart failure mortality and to segment patients into risk groups. The goal is to identify the features most strongly associated with mortality and to provide insights that support early intervention and patient management.


🔍 Objectives:

  • Predict heart failure mortality using logistic regression.
  • Segment patients into risk clusters using KMeans clustering.
  • Identify key clinical features contributing to heart failure outcomes.
  • Evaluate model performance using accuracy, recall, F1-score, and Cohen’s Kappa.
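The first and last objectives can be sketched together as follows. This is a minimal illustration, not the notebooks' actual code: the data is synthetic (generated with `make_classification`), and the feature count, class balance, split ratio, and scaling step are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical table (assumption: NOT the project's data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5,
    weights=[0.68, 0.32], random_state=42,
)

# Stratify so both splits keep the same share of class 1 (mortality).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The four metrics named in the objectives.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "recall_class_1": recall_score(y_test, y_pred, pos_label=1),
    "f1_class_1": f1_score(y_test, y_pred, pos_label=1),
    "cohen_kappa": cohen_kappa_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall on class 1 is reported separately because, in a mortality setting, missed positive cases are usually the costlier error.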

🚀 Key Features:

  • Exploratory Data Analysis (EDA):
    • Visualizations (pair plots, heatmaps) to explore relationships between variables.
  • Machine Learning Models:
    • Logistic Regression for mortality prediction.
    • KMeans clustering to group patients by risk level.
  • Statistical Testing:
    • Independent t-tests to identify significant differences between survivors and non-survivors.
    • Cohen’s Kappa to measure agreement between cluster predictions and true labels.
  • Visualization:
    • Pair plots to visualize clustering across critical features.
    • ROC Curve and confusion matrices for model evaluation.
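As a rough sketch of the statistical testing step, an independent t-test on one feature could look like the following. SciPy is an assumption here (it is not listed among the project's technologies), the two groups are synthetic values, and `equal_var=False` selects Welch's variant, which may differ from what the notebooks use.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical values of one clinical feature per outcome group (synthetic data).
survivors = rng.normal(loc=40.0, scale=10.0, size=200)
non_survivors = rng.normal(loc=33.0, scale=12.0, size=96)

# Welch's t-test: does not assume equal variances in the two groups.
t_stat, p_value = stats.ttest_ind(survivors, non_survivors, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Running the same test across many features would call for a multiple-comparison correction before reading individual p-values as significant.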

📊 Results:

  • Logistic Regression Accuracy: 82%
  • Recall for Mortality Cases (Class 1): 85%
  • Cohen’s Kappa (Cluster Agreement): 0.44 (Moderate Agreement)
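The cluster-agreement figure can be illustrated with a small sketch. KMeans cluster ids are arbitrary (cluster 0 is not necessarily the "survived" group), so the example scores both possible mappings and keeps the better one. The data is synthetic and the alignment step is an assumption about how such a comparison might be done, not a description of the notebooks.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import cohen_kappa_score
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data (assumption: stands in for the clinical features).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)

# KMeans is distance-based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

# Cluster ids carry no meaning: compare both label mappings and keep the best.
kappa_as_is = cohen_kappa_score(y, clusters)
kappa_flipped = cohen_kappa_score(y, 1 - clusters)
kappa = max(kappa_as_is, kappa_flipped)
print(f"Cohen's kappa (aligned): {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is more informative than accuracy when comparing unsupervised clusters against labels.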

📈 Key Findings:

  • Serum Creatinine and Ejection Fraction are strong predictors of heart failure mortality.
  • Time (follow-up period) correlates with patient survival: shorter follow-up is linked to higher mortality.
  • Cluster analysis reveals two distinct patient groups, representing high-risk and low-risk categories.
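One common way to see which features drive a logistic regression is to compare coefficients fitted on standardized inputs. The sketch below does this on invented data: the column names follow the features named above, but the values and the outcome-generating formula are hypothetical and only mimic the direction of the findings. It does not reproduce the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 500

# Hypothetical feature values (synthetic, not the project's dataset).
df = pd.DataFrame({
    "serum_creatinine": rng.normal(1.4, 0.5, n),
    "ejection_fraction": rng.normal(38.0, 10.0, n),
    "time": rng.uniform(4.0, 285.0, n),
})

# Invented outcome model: risk rises with creatinine, falls with EF and follow-up time.
logit = (
    1.5 * (df["serum_creatinine"] - 1.4) / 0.5
    - 1.2 * (df["ejection_fraction"] - 38.0) / 10.0
    - 1.5 * (df["time"] - 145.0) / 80.0
    - 0.8
)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(df, y)

# Coefficients on standardized features are comparable in magnitude.
coefs = pd.Series(model.named_steps["logisticregression"].coef_[0], index=df.columns)
ranked = coefs.reindex(coefs.abs().sort_values(ascending=False).index)
print(ranked)
```

A positive coefficient means higher values of the feature raise the predicted probability of class 1; a negative one lowers it.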

🛠️ How to Run the Project:

  1. Clone the Repository:
    git clone https://github.com/a7madv4d2/Heart-Failure-Analysis
    cd Heart-Failure-Analysis
  2. Run Jupyter Notebooks:
    jupyter notebook
  3. Explore the Notebooks in the notebooks directory for full analysis and visualizations.

📌 Technologies Used:

  • Python
  • Pandas, NumPy
  • Scikit-Learn
  • Seaborn, Matplotlib
  • Jupyter Notebook

🔗 Project Link:

👉 GitHub Repository: Heart Failure Analysis Project (https://github.com/a7madv4d2/Heart-Failure-Analysis)


📧 Contact and Collaboration:

Feel free to reach out for collaborations or discussions on how data science can drive better healthcare solutions.
🌐 LinkedIn: www.linkedin.com/in/ahmed-elsayed-2a8208239
📩 Email: [email protected]


If you find this project helpful, please consider giving it a star!
