Seldarzu/python-ml-practice

❤️ Heart Disease Classification

Machine Learning pipeline for predicting heart disease risk using clinical features and statistical modeling techniques.

This project focuses on data distribution analysis, preprocessing, and classification performance evaluation, combining exploratory analysis with supervised learning.


🧠 Project Overview

The objective of this project is to analyze cardiovascular health indicators and build a predictive model capable of identifying heart disease presence.

The workflow includes:

✔️ Numerical feature distribution analysis
✔️ Data preprocessing & feature preparation
✔️ Logistic Regression modeling
✔️ Performance evaluation using confusion matrix


📂 Repository Structure

heart_disease_classification/
│
├── heart_disase_classification.ipynb
│
│
└── README.md

📊 Key Visual Insights

📈 Feature Distributions

Understanding the distribution of medical features is critical before training predictive models.

Observed Patterns:

  • Age and Max Heart Rate follow near-normal distributions.
  • Cholesterol shows wider variance and potential outliers.
  • Oldpeak is heavily right-skewed, so it may benefit from a transformation or scaling before modeling.
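
The patterns above can be checked numerically as well as visually. The following is a minimal sketch, not the notebook's actual code: it uses synthetic data, and the column names (`age`, `max_heart_rate`, `cholesterol`, `oldpeak`) and distribution parameters are assumptions chosen only to mimic the shapes described. It reports skewness and the number of IQR-rule outliers per feature.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the clinical features (assumed names and parameters,
# NOT the real dataset used in the notebook).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.normal(54, 9, n),              # roughly symmetric
    "max_heart_rate": rng.normal(150, 23, n), # roughly symmetric
    "cholesterol": rng.normal(245, 50, n),    # wider spread
    "oldpeak": rng.exponential(1.0, n),       # right-skewed
})

# Sample skewness: near 0 for symmetric features, clearly positive for right-skewed ones.
skewness = df.skew()

# Count points outside the usual 1.5 * IQR fences for each column.
q1, q3 = df.quantile(0.25), df.quantile(0.75)
iqr = q3 - q1
outlier_counts = ((df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)).sum()

summary = pd.DataFrame({"skewness": skewness, "iqr_outliers": outlier_counts})
print(summary.round(2))
```

With the real data, replacing the synthetic frame by the loaded DataFrame would give the same kind of summary; a large positive skewness for Oldpeak would support applying a transformation before fitting a linear model.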

🤖 Model Evaluation: Logistic Regression

The confusion matrix below shows the performance of the baseline classification model.

Interpretation:

  • The model correctly identifies a large share of the positive heart disease cases.
  • Some false positives and false negatives remain, suggesting room for improvement with more advanced models.
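
To make the interpretation above concrete, here is a hedged sketch of how a baseline Logistic Regression and its confusion matrix might be produced with scikit-learn. It runs on synthetic data from `make_classification`, not the heart disease dataset, and the split ratio and random seeds are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the clinical features.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=42
)

# Hold out 25% of the rows for evaluation, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Baseline model; max_iter is raised to avoid convergence warnings.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# For binary labels {0, 1}, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()

recall = tp / (tp + fn)      # share of actual positives that were caught
precision = tp / (tp + fp)   # share of predicted positives that were correct

print(cm)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In a medical screening setting, false negatives (missed disease cases) are often the costlier error, so recall is usually the first number to look at when reading the matrix.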

🛠️ Tech Stack

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • Jupyter Notebook

🚀 How to Run

git clone https://github.com/your-username/heart_disease_classification.git
cd heart_disease_classification

Install dependencies:

pip install pandas numpy matplotlib seaborn scikit-learn

Open the notebook in Jupyter and run the cells:

heart_disase_classification.ipynb

📈 Future Improvements

  • Feature scaling experiments
  • Hyperparameter tuning
  • Tree-based models (Random Forest / XGBoost)
  • ROC-AUC & Precision-Recall analysis
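
As a starting point for several of these items, the sketch below combines feature scaling, a small hyperparameter search, and ROC-AUC / Precision-Recall scoring in one scikit-learn pipeline. It is an illustration under stated assumptions: the data is synthetic, and the grid of `C` values and the 5-fold cross-validation are arbitrary choices rather than tuned settings from this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the heart disease features.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is fit only on each training fold.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# make_pipeline names each step after its lowercased class name.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

# Score the refit best model on the held-out set using predicted probabilities.
proba = search.predict_proba(X_test)[:, 1]
roc_auc = roc_auc_score(y_test, proba)
pr_auc = average_precision_score(y_test, proba)

print("best C:", search.best_params_["logisticregression__C"])
print(f"ROC-AUC={roc_auc:.3f} average precision={pr_auc:.3f}")
```

The same `GridSearchCV` wrapper could be reused with a tree-based estimator such as `RandomForestClassifier` by swapping the final pipeline step and the parameter grid.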

👩‍💻 Author

Arzu Selda Avcı
Computer Engineering, Final Year
Data Science & AI Enthusiast


About

I am taking a Udemy course on Python and Machine Learning. I will practice some of the lessons and publish them here.
