This project presents a hybrid machine learning pipeline for detecting Acute Lymphoblastic Leukemia (ALL) from blood smear images.
It combines:
- Fuzzy C-Means (FCM) segmentation
- EfficientNetB4 deep feature extraction
- GPU-accelerated XGBoost classification
- Interpretability via qualitative error analysis
### 🎯 Final Test Accuracy: 95.53%
## 📌 1. Dataset
- Blood Cell Cancer (ALL 4-Class): https://www.kaggle.com/datasets/mohammadamireshraghi/blood-cell-cancer-all-4class
- Used only for research, academic, and educational purposes.
The four original classes are merged into two binary labels:
- 0 → Benign
- 1 → Malignant (Early Pre-B, Pre-B, Pro-B)
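
As a rough illustration of the label conversion above, the mapping could look like the sketch below. The class-name strings are assumptions taken from this README, not from the dataset's actual folder names, and `to_binary_label` is a hypothetical helper.

```python
# Class names as written in this README; the dataset's folder names may differ (assumption).
BENIGN_CLASS = "Benign"
MALIGNANT_CLASSES = {"Early Pre-B", "Pre-B", "Pro-B"}


def to_binary_label(class_name: str) -> int:
    """Map one of the four original classes to 0 (benign) or 1 (malignant)."""
    name = class_name.strip()
    if name == BENIGN_CLASS:
        return 0
    if name in MALIGNANT_CLASSES:
        return 1
    raise ValueError(f"Unknown class name: {class_name!r}")


labels = [to_binary_label(c) for c in ["Benign", "Pre-B", "Early Pre-B", "Pro-B"]]
print(labels)  # [0, 1, 1, 1]
```

Raising on unknown names, rather than defaulting to a label, keeps a misnamed folder from silently ending up in the wrong class.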
## 📌 2. Pipeline Overview
Raw Image → FCM Segmentation → EfficientNetB4 Feature Extraction → XGBoost Classifier → Benign / Malignant Prediction
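
To give a feel for the FCM segmentation stage, here is a minimal, dependency-free sketch of Fuzzy C-Means on scalar pixel intensities. It is not the code in `src/fcm_segmentation.py`: the function name, the evenly spaced initial centers, the fuzzifier `m=2.0`, and the sample intensities are all assumptions made for illustration.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50):
    """Cluster scalar intensities with Fuzzy C-Means.

    Returns (centers, memberships); memberships[k][i] is the degree to which
    values[k] belongs to cluster i, and every row sums to 1.
    """
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly across the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    eps = 1e-9  # avoids division by zero when a value sits exactly on a center

    def compute_memberships(current_centers):
        rows = []
        for x in values:
            # Inverse-distance weights, normalized per sample.
            inv = [1.0 / ((abs(x - c) + eps) ** exponent) for c in current_centers]
            total = sum(inv)
            rows.append([w / total for w in inv])
        return rows

    for _ in range(n_iter):
        memberships = compute_memberships(centers)
        # Each center becomes the mean of the data weighted by membership ** m.
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            new_centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        centers = new_centers

    return centers, compute_memberships(centers)


# Hypothetical grayscale intensities: three dark (nucleus-like) and three bright pixels.
pixels = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, memberships = fuzzy_c_means(pixels)
hard_labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
print(hard_labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike k-means, each pixel keeps a soft membership in every cluster; taking the arg-max, as in the last lines, is one simple way to turn those memberships into a segmentation mask.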
- ✔ Unsupervised segmentation using Fuzzy C-Means
- ✔ Pretrained EfficientNetB4 for robust features
- ✔ XGBoost for efficient binary classification
- ✔ GPU acceleration
- ✔ Training & validation curves
- ✔ Confusion matrix + metrics
- ✔ TP / FP / TN / FN qualitative visualizations
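
This README does not say which parameters the project uses to enable GPU training, so the following is only a sketch of how it is commonly configured. XGBoost 2.0 and later select the GPU with `device="cuda"` together with `tree_method="hist"`, while older releases use `tree_method="gpu_hist"`. The helper name `xgboost_gpu_params` and the choice of `logloss` as the evaluation metric are assumptions.

```python
def xgboost_gpu_params(xgboost_version: str) -> dict:
    """Build a binary-classification parameter dict that requests GPU training."""
    major = int(xgboost_version.split(".")[0])
    params = {
        "objective": "binary:logistic",  # benign vs. malignant
        "eval_metric": "logloss",        # assumed metric for the training/validation curves
    }
    if major >= 2:
        # XGBoost >= 2.0: the device is chosen separately from the tree method.
        params.update({"tree_method": "hist", "device": "cuda"})
    else:
        # XGBoost 1.x: the GPU is selected through a dedicated tree method.
        params["tree_method"] = "gpu_hist"
    return params


print(xgboost_gpu_params("2.0.3"))
print(xgboost_gpu_params("1.7.6"))
```

The resulting dictionary can be passed as the `params` argument of `xgboost.train`; in practice the version string would come from `xgboost.__version__`.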
Install the dependencies with `pip install -r requirements.txt`, then open `segmentation_and_classification.ipynb`.
The notebook covers every step: segmentation, feature extraction, training, and evaluation.
The evaluation step prints a detailed classification report including:
- Precision
- Recall
- F1-score
- Support
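
For readers unfamiliar with these four quantities, the sketch below computes them per class from true and predicted labels. It is a hand-rolled illustration with made-up labels, not the project's evaluation code; the README does not say which function the notebook calls to produce its report.

```python
def binary_report(y_true, y_pred):
    """Return precision, recall, F1, and support for classes 0 and 1."""
    report = {}
    for cls in (0, 1):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Support is the number of true samples of this class.
        report[cls] = {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}
    return report


# Made-up labels for illustration only (0 = benign, 1 = malignant).
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 0]
report = binary_report(y_true, y_pred)
print(report[1])  # {'precision': 0.75, 'recall': 0.75, 'f1': 0.75, 'support': 4}
```

For a screening-style task, recall on the malignant class (how many malignant samples are caught) is often the number to watch alongside overall accuracy.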
Qualitative visualizations of predictions (TP, FP, FN, TN examples) are saved to:
- `results/qualitative_analysis.png`
- `results/segmentation_examples/`
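
One way to pick examples for such a figure is to bucket sample indices by outcome first. The helper below is a hypothetical sketch (the name `group_by_outcome` and the sample labels are not from the project) that treats label 1, malignant, as the positive class.

```python
def group_by_outcome(y_true, y_pred, positive=1):
    """Group sample indices into TP, FP, TN, and FN buckets."""
    groups = {"TP": [], "FP": [], "TN": [], "FN": []}
    for idx, (t, p) in enumerate(zip(y_true, y_pred)):
        if p == positive:
            key = "TP" if t == positive else "FP"
        else:
            key = "FN" if t == positive else "TN"
        groups[key].append(idx)
    return groups


# Made-up labels for illustration only (0 = benign, 1 = malignant).
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 0]
groups = group_by_outcome(y_true, y_pred)
print(groups)  # {'TP': [2, 3, 4], 'FP': [1], 'TN': [0], 'FN': [5]}
```

The indices in each bucket can then be used to look up the corresponding images and plot a few per category.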
ALL_Leukemia_Detection_Model/
│
├── results/
│   ├── qualitative_analysis.png
│   └── segmentation_examples/
│
├── src/
│   ├── fcm_segmentation.py
│   ├── feature_extraction.py
│   ├── xgboost_classifier.py
│   └── utils.py
│
├── segmentation_and_classification.ipynb
├── requirements.txt
├── LICENSE
└── README.md
Licensed under the MIT License (see the LICENSE file).
This project is for research and educational purposes only.
It is NOT a clinical diagnostic tool.
Tagore
Machine Learning Student
Open to internships & freelance ML work.