- Goal: Predict diabetes diagnosis (binary classification) and patient age (regression) using health-related factors.
- Dataset: Subset of a CDC dataset
- Smaller, cleaned version available within the UCI Machine Learning Repo
- License: CC0 Public Domain
- Models:
- Regression Models:
- Linear regression (baseline)
- Feedforward Neural Network
- Classification Models:
- Feedforward Neural Network
- Regression Models:
- Results:
- Regression: The FNN outperformed the linear regression baseline when predicting age, achieving lower MSE scores on the testing set.
- Classification: The FNN achieved an accuracy of 76% and decently balanced performance across both classes (F1-score of 0.77 for class 1 and an F1-score of 0.74 for class 0).
- Goal: Classify medical question-answer pairs into focus areas (multiclass classification) from trusted medical sources.
- Dataset: MedQuAD dataset from the research paper A Question-Entailment Approach to Question Answering
- License: CC BY 4.0
- Models:
- Feedforward Neural Network
- Specialized BERT Transformer (BiomedBERT)
- Results:
- The FNN achieved ~83% accuracy on the testing set
- The specialized BERT achieved ~98% accuracy on the testing set, a more balanced performance across all five classes (higher precision, recall, and F1-scores)
- The specialized BERT demonstrated better understanding of medical text
- Goal: Classify high-quality retinal fundus images for diabetes retinopathy (binary classification).
- Dataset: IDRiD dataset
- License: CC BY 4.0
- Models:
- Convolutional Neural Network
- Vision Transformer
- Results:
- The CNN achieved ~58% accuracy on the testing set
- The ViT achieved ~82% accuracy on the testing set
- The ViT showed better performance across both classes (higher precision, recall, and F1-scores)
To explore and run the code for this capstone project, we suggest creating an environment containing the required list of libraries and their versions, which is included in the requirements.txt
file.
The notebooks can run locally or on the cloud using Jupyter-compatible environments (recommended) like:
- Google Colab
- Kaggle Notebooks
- Paperspace Gradient
- Deepnote
💡 Tip: Enabling the GPU is highly recommended, especially for the image classification task, which involves classifying retinal fundus images.
All required libraries and their versions can be installed using:
pip install -r requirements.txt