HealthPredict is a proof-of-concept project demonstrating the application of Supervised Machine Learning to address the United Nations Sustainable Development Goal (SDG) 3: Good Health and Well-being, specifically Target 3.4 – Non-Communicable Diseases (NCDs).

🌍 HealthPredict: AI for Diabetes Risk Prediction (SDG 3)

Repository: ai-se-w02-core-concepts

The project implements a binary classification model (Logistic Regression) to predict the likelihood of Type 2 Diabetes using patient biometric data. This work illustrates key Week 2 learning objectives of the “AI for Software Engineering” specialization (Supervised Learning, Model Evaluation, Ethical Reflection).


🎯 Problem & SDG Alignment

Early detection of diabetes is vital for both prevention and treatment.
This proof-of-concept model illustrates how a low-cost, scalable screening tool could flag high-risk patients in resource-limited settings, supporting better health outcomes and easing the burden on healthcare systems.

🤖 Machine Learning Approach

| Component | Detail |
| --- | --- |
| Dataset | Pima Indians Diabetes Dataset (diabetes.csv) |
| Approach | Binary supervised classification (Diabetic / Non-Diabetic) |
| Model | Logistic Regression (LogisticRegression) |
| Tech Stack | Python, Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn |
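
To make the model choice concrete, here is a minimal standard-library sketch of how a logistic regression classifier turns patient features into a risk probability and a Diabetic / Non-Diabetic label. The weights, bias, feature values, and function names are hypothetical and chosen by hand for illustration. They are not the coefficients learned by the project script, which relies on scikit-learn's LogisticRegression.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real-valued score to a probability in (0, 1), without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Probability of the positive (diabetic) class for one patient."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 (diabetic) if the probability reaches the threshold, else 0."""
    return int(predict_proba(features, weights, bias) >= threshold)


# Hypothetical values for two standardized features; NOT the trained model.
WEIGHTS = [1.2, 0.6]
BIAS = -0.4

low_risk_patient = [-1.0, -0.5]
high_risk_patient = [1.5, 1.0]

for patient in (low_risk_patient, high_risk_patient):
    probability = predict_proba(patient, WEIGHTS, BIAS)
    label = predict_label(patient, WEIGHTS, BIAS)
    print(f"features={patient} probability={probability:.3f} label={label}")
```

The 0.5 threshold is only a common default. A screening tool might lower it to catch more true cases at the cost of more false alarms.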

📂 Repository Structure

ai-se-w02-core-concepts/
├── src/
│   └── sdg3_health_predictor.py                  # Main Python ML script
├── data/
│   └── diabetes.csv                              # Project dataset (Input)
├── docs/
│   ├── SDG3_Report_Final.md                      # Summary Report
│   └── SDG3_HealthPredict_PitchDeck_2025.pdf     # Final Pitch Deck
├── assets/
│   ├── banner.png                                # Cover
│   └── confusion_matrix.png                      # Model evaluation plot
└── README.md                                     # Repository entry point (This file)

⚙️ Setup and Execution

1. Clone the Repository

git clone https://github.com/software-development-course-2025/ai-se-w02-core-concepts
cd ai-se-w02-core-concepts

2. Move Data File

Place the diabetes.csv file inside the data/ folder.
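
Before running the script, it can help to confirm that the file is where the script expects it. The helper below is a hypothetical convenience, not part of the repository. It assumes only that the CSV has a header row and makes no assumption about the column names.

```python
import csv
from pathlib import Path


def describe_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected dataset at {csv_path}")
    with csv_path.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            raise ValueError(f"{csv_path} is empty")
        # Count only non-empty data rows after the header.
        row_count = sum(1 for row in reader if row)
    return header, row_count
```

Called from the repository root as `describe_csv("data/diabetes.csv")`, it returns the column names and the number of records, or raises a clear error if the file is missing.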

3. Install Dependencies

pip install pandas numpy scikit-learn matplotlib seaborn

4. Run the ML Script

python src/sdg3_health_predictor.py

The script will print the evaluation metrics and display the Confusion Matrix plot.

📊 Model Results (Unseen Data)

The model was trained on 80% of the data and evaluated on the remaining 20%, which it never saw during training.
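
The 80/20 hold-out idea can be sketched with the standard library alone. The function name, seed, and stand-in records below are assumptions for illustration. The project script may well use scikit-learn's `train_test_split` with `test_size=0.2` instead, which serves the same purpose.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 patient records
train, test = train_test_split_rows(rows)
print(len(train), len(test))  # prints: 80 20
```

Shuffling before the cut matters: if the file happens to be ordered in some way, taking the last 20% unshuffled would give a test set that does not resemble the training data.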

| Metric | Score (Example Output) |
| --- | --- |
| Accuracy | 0.7825 |
| Precision | 0.7011 |
| Recall | 0.6180 |
| F1-Score | 0.6572 |
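
For readers new to these metrics, the sketch below shows how each one is derived from the four confusion-matrix counts. The counts are hypothetical and were picked for easy arithmetic. They are not the project's actual results, and the function name is an illustration rather than the script's API.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    # Precision: of the patients flagged as diabetic, how many really are.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly diabetic patients, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, NOT the project's confusion matrix.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a screening context, recall usually deserves the closest attention, since every false negative is a high-risk patient who goes unflagged.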

Confusion Matrix

The confusion matrix plot (assets/confusion_matrix.png) breaks the test-set predictions down into true positives, false positives, true negatives, and false negatives.

💡 Ethical Reflection and Social Impact

The project highlights:

  • Risks of data bias, as the dataset represents a limited population.
  • Importance of equity auditing to ensure fair performance across diverse demographic groups.
  • Necessity of responsible AI practices, transparency, and careful deployment in clinical environments.

These considerations align with SDG 3 by promoting safe, ethical, and equitable health technology development.
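
One simple form of the equity audit mentioned above is to compute recall separately for each demographic group and compare the results. The sketch below assumes each record carries a group label, the true outcome, and the model's prediction. The group names and records are made up for illustration and do not come from the dataset.

```python
from collections import defaultdict


def recall_by_group(records):
    """Return {group: recall} from (group, y_true, y_pred) records with 0/1 labels.

    Groups that contain no positive cases map to None.
    """
    true_positives = defaultdict(int)
    false_negatives = defaultdict(int)
    groups = set()
    for group, y_true, y_pred in records:
        groups.add(group)
        if y_true == 1:
            if y_pred == 1:
                true_positives[group] += 1
            else:
                false_negatives[group] += 1
    result = {}
    for group in sorted(groups):
        positives = true_positives[group] + false_negatives[group]
        result[group] = true_positives[group] / positives if positives else None
    return result


# Hypothetical audit records: (group, true label, predicted label).
audit_records = [
    ("under_40", 1, 1), ("under_40", 1, 0), ("under_40", 0, 0),
    ("40_plus", 1, 1), ("40_plus", 1, 1), ("40_plus", 1, 0), ("40_plus", 0, 1),
]
for group, recall in recall_by_group(audit_records).items():
    print(group, recall)
```

A large recall gap between groups would suggest the model misses more true cases in one population than another, which is worth investigating before any clinical use.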


ℹ️ Author & License

👤 Author: Augusto Mate
📧 Email: [email protected]
🐙 GitHub
🔗 LinkedIn

This project is licensed under the MIT License.
See the LICENSE file for details.


In every dataset lies a story, and in every prediction, a chance to change one.
HealthPredict is a small step toward a future where care begins with foresight.
