This project is a deep learning-based solution for classifying eye images into two categories: Cataract and Normal. It includes data preprocessing, model training using transfer learning, evaluation, and deployment using FastAPI and Streamlit.
├── data/preproceseed_images
│ ├── test/ # Test images data
│ │ ├── cataract/
│ │ └── normal/
│ ├── train/ # Train images data
│ │ ├── cataract/
│ │ └── normal/
├── notebooks/
│ ├── datapreprocess.ipynb # EDA and data preprocessing
│ └── evaluation_test.ipynb # Model evaluation notebook
├── src/
│ ├── preprocessing.py # Data augmentation setup
│ ├── model.py # Model training script
│ └── ml_requirements.txt # Dependencies for model training
├── models/
│ └── best_model_vgg16_v3.h5 # Trained VGG-16 model
├── api/
│ ├── main.py # FastAPI app
│ ├── streamlit_app.py # Streamlit frontend
│ ├── requirements.txt # Dependencies for the API and frontend
│ └── sample_images/ # Sample test images
notebooks/data_preprocess.ipynb
- Training Set: 491 images (cataract, normal)
- Testing Set: 121 images (cataract, normal)
- Balanced across both classes, but relatively small for typical deep learning.
- Image Thresholding to segment foreground and background
- Edge detection to highlight medical features
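As a rough illustration of the two operations listed above, here is a dependency-free sketch on a toy grayscale image. The README does not say which thresholding or edge-detection method the notebook uses, so the fixed global threshold and the Sobel-style gradient below are assumptions chosen for simplicity; in practice a library such as OpenCV (`cv2.threshold`, `cv2.Canny`) would typically be used instead.

```python
def threshold(image, t):
    """Binary mask: 1 where the pixel is brighter than t, else 0."""
    return [[1 if px > t else 0 for px in row] for row in image]


def sobel_magnitude(image):
    """Approximate gradient magnitude (|gx| + |gy|); border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


# Toy 4x4 image: dark left half, bright right half (hypothetical data).
toy = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
mask = threshold(toy, 128)     # foreground/background split
edges = sobel_magnitude(toy)   # strong response along the vertical boundary
print(mask[0])                 # -> [0, 0, 1, 1]
print(edges[1])                # -> [0, 760, 760, 0]
```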
Applied with ImageDataGenerator:
- Rotation (±10°), shifts (10%), zoom (±10%)
- Brightness adjustment (±15%)
- Horizontal flip (valid in medical context)
- Rescale pixel values to [0, 1]
- Train/validation split ratio: 0.2
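The settings above map fairly directly onto `ImageDataGenerator` arguments. The sketch below is one plausible configuration, not necessarily what `src/preprocessing.py` contains: the names `AUGMENTATION_KWARGS` and `build_train_val_generators`, the batch size, and the 224x224 target size (borrowed from the API's preprocessing step) are assumptions.

```python
# Values taken from the augmentation list; names are illustrative only.
AUGMENTATION_KWARGS = {
    "rotation_range": 10,              # degrees, i.e. ±10°
    "width_shift_range": 0.1,          # fraction of image width
    "height_shift_range": 0.1,         # fraction of image height
    "zoom_range": 0.1,                 # zoom factor drawn from [0.9, 1.1]
    "brightness_range": (0.85, 1.15),  # ±15% brightness
    "horizontal_flip": True,
    "rescale": 1.0 / 255,              # pixel values to [0, 1]
    "validation_split": 0.2,
}


def build_train_val_generators(train_dir="data/preproceseed_images/train",
                               target_size=(224, 224), batch_size=32):
    """Return (train, validation) iterators over the class subfolders."""
    # Imported lazily so the settings can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
    common = dict(directory=train_dir, target_size=target_size,
                  batch_size=batch_size, class_mode="binary")
    train_gen = datagen.flow_from_directory(subset="training", shuffle=True, **common)
    val_gen = datagen.flow_from_directory(subset="validation", shuffle=False, **common)
    return train_gen, val_gen
```

Note that sharing one generator between both subsets also augments the validation images; a separate rescale-only generator with the same `validation_split` avoids that. `flow_from_directory` assigns class indices alphabetically by folder name, so here `cataract` would be 0 and `normal` 1.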
src/model.py- Base Model: VGG16 with transfer learning
- Fine-tuning: Last few layers, 10 additional epochs
- Optimizer: Adam, LR = 0.0001
- Callbacks: Early stopping, model checkpoint
- Label smoothing for handling noisy labels
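A minimal sketch of how this training setup could look in Keras is shown below. It is not the contents of `src/model.py`: the classification head, the label-smoothing value, the number of unfrozen layers, the early-stopping patience, and all function names are assumptions made for illustration. Only the optimizer, learning rate, epoch count, and checkpoint filename come from this README.

```python
LEARNING_RATE = 1e-4      # from the README
FINE_TUNE_EPOCHS = 10     # from the README
LABEL_SMOOTHING = 0.1     # assumed; the README gives no value
N_UNFROZEN_LAYERS = 4     # assumed reading of "last few layers"


def compile_model(model):
    import tensorflow as tf
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss=tf.keras.losses.BinaryCrossentropy(label_smoothing=LABEL_SMOOTHING),
        metrics=["accuracy"],
    )
    return model


def build_model(input_shape=(224, 224, 3)):
    """Frozen VGG16 base plus a small binary head (head layout is assumed)."""
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    base = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False  # phase 1: train only the new head
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ])
    return compile_model(model), base


def unfreeze_for_fine_tuning(model, base):
    """Phase 2: unfreeze only the last few layers of the base."""
    base.trainable = True
    for layer in base.layers[:-N_UNFROZEN_LAYERS]:
        layer.trainable = False
    return compile_model(model)  # recompile so the change takes effect


def build_callbacks(checkpoint_path="models/best_model_vgg16_v3.h5"):
    from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
    return [
        EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
        ModelCheckpoint(checkpoint_path, monitor="val_loss", save_best_only=True),
    ]
```

After the head has been trained, fine-tuning would then be something like `model.fit(train_gen, validation_data=val_gen, epochs=FINE_TUNE_EPOCHS, callbacks=build_callbacks())` on the model returned by `unfreeze_for_fine_tuning`, where `train_gen` and `val_gen` are whatever data iterators the project uses.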
Why VGG16?
- Pre-trained on ImageNet, has 138M params in total.
- Useful in medical use cases where data is limited or hard to label.
- ImageNet pre-training gives it low-level features such as edges, shapes, and textures, which often transfer to medical imaging (e.g. eye textures, cloudiness in cataract cases).
- VGG16, when used with transfer learning + data augmentation, performs well even with limited data.
To download the trained VGG-16 model for use with the API:
- Visit the Google Drive link
- Filename: best_model_vgg16_v3.h5
- Store it in the models/ folder so the API can load it.
notebooks/evaluation_test.ipynb
- Test Accuracy: 99.95%
- Precision: 92.19%
- Recall: 98.33%
- AUC: 95.04%
- Optimal Threshold: 0.65
  - F1-score: 0.9672
  - Precision: 0.9516
  - Recall: 0.9833

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Cataract | 0.98 | 0.95 | 0.97 | 61 |
| Normal | 0.95 | 0.98 | 0.97 | 60 |
| Accuracy | | | 0.97 | 121 |
| Macro Avg | 0.97 | 0.97 | 0.97 | 121 |
| Weighted Avg | 0.97 | 0.97 | 0.97 | 121 |
- Confusion Matrix: Very low false positive and false negative rates, which is especially important in medical diagnostics. Missing a cataract could be risky, so high recall is desirable.
- ROC Curve: The orange curve nearly touches the top-left corner, which means the model separates the two classes almost perfectly.
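The threshold-0.65 figures can be reproduced from confusion-matrix counts. The counts below are not stated in this README; they are inferred from the reported precision, recall, and class supports (60 Normal, 61 Cataract), and they only fit if Normal is treated as the positive class, so treat them as an assumption. The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Inferred counts at threshold 0.65 (assumption, positive class = Normal):
# 59 of 60 Normal images correct, 58 of 61 Cataract images correct.
metrics = binary_metrics(tp=59, fp=3, fn=1, tn=58)
print({name: round(value, 4) for name, value in metrics.items()})
# -> {'precision': 0.9516, 'recall': 0.9833, 'f1': 0.9672, 'accuracy': 0.9669}
```

Under the same inference, the Cataract row of the table follows by swapping roles: recall 58/61 ≈ 0.95 and precision 58/59 ≈ 0.98.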
api/main.py
- Loads model models/best_model_vgg16_v3.h5
- Preprocesses image (resize to 224x224, normalize)
- API Endpoint: POST /predict/
- Returns: { "prediction": "Cataract" or "Normal", "confidence": 97.98 }
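The preprocessing and response steps listed above could look roughly like the sketch below. It is not the code in `api/main.py`: the function names, the 0.5 default threshold, the division by 255 (mirroring the training rescale), and the convention that the sigmoid output is the probability of Normal (Keras indexes class folders alphabetically) are all assumptions.

```python
import io

IMG_SIZE = (224, 224)  # from the README


def preprocess(image_bytes):
    """Decode, resize to 224x224, scale to [0, 1], and add a batch axis."""
    # Lazy imports: Pillow and NumPy are only needed when an image is processed.
    import numpy as np
    from PIL import Image

    img = Image.open(io.BytesIO(image_bytes)).convert("RGB").resize(IMG_SIZE)
    arr = np.asarray(img, dtype="float32") / 255.0
    return arr[np.newaxis, ...]  # shape (1, 224, 224, 3)


def to_response(prob_normal, threshold=0.5):
    """Map a sigmoid output to the JSON payload returned by the endpoint."""
    if prob_normal >= threshold:
        label, confidence = "Normal", prob_normal
    else:
        label, confidence = "Cataract", 1.0 - prob_normal
    # Confidence is reported as a percentage with two decimals.
    return {"prediction": label, "confidence": round(confidence * 100, 2)}


print(to_response(0.9798))
# -> {'prediction': 'Normal', 'confidence': 97.98}
```

Inside the endpoint, the two pieces would be chained along the lines of `to_response(float(model.predict(preprocess(data))[0][0]))`, with `model` loaded once at startup.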
curl -X 'POST' \
'http://127.0.0.1:8000/predict/' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@image_253.png;type=image/png'
api/streamlit_app.py
- Upload image or test with sample images
- Interacts with FastAPI backend
- Displays prediction label and confidence %
cd api/
pip install -r requirements.txt
uvicorn main:app --reload
- Visit: http://127.0.0.1:8000/docs
- Click on the POST /predict/ endpoint.
- Click "Try it out".
- Use the "Choose File" button to upload your image (.jpg, .png, etc.).
- Click "Execute" to get the prediction with confidence.
In another terminal:
streamlit run streamlit_app.py
- Visit: http://localhost:8501
- Click the ‘Predict’ button for a sample image to see the results.
- Or upload an image of your own (for example, one from the api/sample_images/ folder), then click ‘Predict Uploaded image’ to see the predicted class with its confidence %.
- Medical Imaging - Thresholding Techniques
- For more detail on this project, see the attached Binary Cataract Classification Documentation.pdf.
This project demonstrates the power of transfer learning and careful preprocessing in tackling medical image classification tasks with limited data, backed by an interactive and robust API + UI pipeline.

