This project demonstrates the use of supervised machine learning models to classify EEG-derived features for detecting epilepsy. Using a balanced dataset from 198 subjects, the study evaluates Logistic Regression, Random Forest, and Support Vector Machine (SVM) classifiers to assess diagnostic performance.
- `epilepsy_detection_final.ipynb`: Main Jupyter notebook with code and analysis
- `Epileptic_featured_data.csv`: Dataset with 40 extracted EEG features per subject
- `plots/`: Contains confusion matrices, feature importance, and AUC comparison charts
| Model | Accuracy | AUC |
|---|---|---|
| Logistic Regression | 1.00 | 1.00 |
| Random Forest | 1.00 | 1.00 |
| Support Vector Machine | 1.00 | 1.00 |
| Reduced RF (Top 10) | 1.00 | 1.00 |
All four models reached an accuracy and AUC of 1.00. Feature importance analysis showed that the 10 highest-ranked features are sufficient to maintain this accuracy: a Random Forest restricted to those 10 features (Reduced RF) scored the same, which improves interpretability and speed.
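The comparison above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it uses synthetic data from `make_classification` in place of the real CSV, and the 80/20 stratified split, the scaling step, the hyperparameters, and the helper `positive_class_scores` are all assumptions. Scores on synthetic data will not match the table.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data: 198 samples, 40 features, two classes.
X, y = make_classification(n_samples=198, n_features=40, n_informative=10,
                           random_state=0)

# Assumed 80/20 stratified split; the notebook's split may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def positive_class_scores(model, X):
    """Return a continuous score for class 1 (hypothetical helper for AUC)."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(X)[:, 1]
    return model.decision_function(X)


models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = (
        accuracy_score(y_test, model.predict(X_test)),
        roc_auc_score(y_test, positive_class_scores(model, X_test)),
    )

# Rank features by the fitted forest's impurity-based importance, keep the top 10.
top10 = np.argsort(models["Random Forest"].feature_importances_)[::-1][:10]
reduced_rf = RandomForestClassifier(n_estimators=200, random_state=0)
reduced_rf.fit(X_train[:, top10], y_train)
results["Reduced RF (Top 10)"] = (
    accuracy_score(y_test, reduced_rf.predict(X_test[:, top10])),
    roc_auc_score(y_test, positive_class_scores(reduced_rf, X_test[:, top10])),
)

for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.2f}, AUC={auc:.2f}")
```

Note that in this sketch the top 10 features are chosen from the training split only, so the held-out rows play no part in feature selection.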
Plots include:
- Confusion matrices for each model
- AUC comparison bar chart
- Top 10 EEG feature importances
- Clone or download this repository
- Open `epilepsy_detection_final.ipynb` in Jupyter Notebook or JupyterLab
- Run all cells
Required Python packages: `pandas`, `numpy`, `scikit-learn`, `matplotlib`, `seaborn`
- Source: Public EEG dataset (Kaggle)
- 198 subjects (99 with epilepsy, 99 without)
- 40 EEG features extracted from time- and frequency-domain transformations
- Label: `stat` (1 = epilepsy, 0 = non-epilepsy)
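A minimal sketch of loading the dataset and separating features from the label is shown below. The helper `load_features` is hypothetical, and it assumes every column other than `stat` is a feature; if the CSV carries an ID or index column, that column would need to be dropped as well.

```python
import pandas as pd


def load_features(csv_path, label_col="stat"):
    """Split the feature table into X (features) and y (0/1 labels)."""
    df = pd.read_csv(csv_path)
    y = df[label_col].astype(int)
    # Assumption: all remaining columns are EEG features.
    X = df.drop(columns=[label_col])
    return X, y


# Example usage (requires the CSV from this repository):
# X, y = load_features("Epileptic_featured_data.csv")
# print(X.shape)           # number of subjects and features
# print(y.value_counts())  # class balance check
```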
Ritu Nagar
MSc Health Data Science and Statistics
University of Plymouth
This project is for educational and research purposes only.