kaggle-epileptic-seizure-recognition

Epileptic seizure detection is a crucial task in neurological research, as timely and accurate identification of seizures can significantly improve patient care. This project explores different feature extraction techniques, feature selection strategies and classification models to develop an epileptic seizure signal classifier using EEG data.

Two feature selection approaches are evaluated: an empirical strategy, which optimizes classification performance directly, and a correlation-separability approach, which selects features based on statistical properties without relying on a predefined model. Additionally, we compare these feature selection strategies with dimensionality reduction techniques and analyze the impact of PCA across different classifiers.

Results show that SVM with an RBF kernel, combined with backward selection, achieves the best performance, reaching an F1-score of 0.985 with low variance in validation metrics. Naive Bayes performs best with the correlation-separability feature set, highlighting how different models benefit from different feature selection strategies.
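As an illustration of the correlation-separability idea described above, here is a minimal, model-free sketch. It is not taken from the project notebooks: the Fisher-style separability score, the greedy redundancy filter, the `max_corr` threshold and the function names are all assumptions chosen for the example.

```python
import numpy as np

def fisher_score(X, y):
    """Per-feature separability: squared mean gap over summed class variances."""
    X0, X1 = X[y == 0], X[y == 1]
    gap = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    spread = X0.var(axis=0) + X1.var(axis=0) + 1e-12  # avoid division by zero
    return gap / spread

def select_features(X, y, max_corr=0.9):
    """Greedily keep the most separable features, skipping any feature whose
    absolute correlation with an already kept feature exceeds max_corr."""
    order = np.argsort(fisher_score(X, y))[::-1]   # most separable first
    corr = np.abs(np.corrcoef(X, rowvar=False))    # feature-by-feature matrix
    kept = []
    for j in order:
        if all(corr[j, k] <= max_corr for k in kept):
            kept.append(int(j))
    return kept

# Synthetic data (hypothetical): feature 0 is informative, feature 1 is a
# near duplicate of feature 0, and feature 2 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 200)
f0 = y * 3.0 + rng.standard_normal(400)
X = np.column_stack([
    f0,
    f0 + 0.01 * rng.standard_normal(400),
    rng.standard_normal(400),
])
kept = select_features(X, y)
print(kept)  # one of the two redundant features, followed by the noise feature
```

Because the ranking uses only class statistics and pairwise correlations, no classifier is trained during selection, which is the property that distinguishes this kind of approach from the empirical strategy.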

Problem Context

Epilepsy is a neurological disorder characterized by recurrent seizures resulting from abnormal, excessive neuronal activity in the brain. These seizures manifest due to synchronized electrical discharges within neuronal networks, disrupting normal brain function.

Electroencephalography (EEG) is a fundamental tool in diagnosing and studying epilepsy. It records the brain's electrical activity via electrodes placed on the scalp, capturing waveforms that reflect the underlying neuronal dynamics. In healthy individuals, EEG signals display organized patterns with characteristic waveforms corresponding to different brain states, such as alpha (8–12 Hz) and beta (12–30 Hz) rhythms. However, in individuals with epilepsy, EEG signals exhibit distinctive abnormalities, including sharp spikes, spike-and-wave complexes, and high-frequency oscillations, indicative of disorganized neuronal activity.

Figure 1: Typical EEG patterns in an epileptic seizure.

Accurate identification of these epileptiform patterns is crucial for effective diagnosis and management of epilepsy. However, manual interpretation of EEG recordings is time-consuming and prone to subjective bias. This underscores the need for automated, reliable methods to detect and classify epileptic seizures. This project addresses this need by implementing pattern recognition techniques for epileptic seizure detection, aiming to enhance diagnostic precision and inform therapeutic strategies.

Dataset

The dataset used in this study is sourced from Kaggle and is originally derived from the UCI Machine Learning Repository.

Data Structure

The dataset consists of EEG recordings from 500 individuals, each with a 23.6-second brain activity recording sampled into 4097 data points. Each data point represents the EEG signal at a specific time.

To enhance usability, the original data was segmented into 1-second chunks, each containing 178 time points. This resulted in a total of 11,500 samples (500 individuals × 23 segments per individual). The dataset is formatted as a CSV file where:

Columns 1-178 contain EEG signal values.
Column 179 (y) represents the class label.
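The segmentation arithmetic can be checked with a short sketch. Note that 23 × 178 = 4094, so 3 of the 4097 points per recording do not fit into a segment; the sketch assumes they are simply dropped from the end, which is an assumption rather than a documented detail of the dataset. The recordings here are synthetic and the function name is hypothetical.

```python
import numpy as np

N_SEGMENTS = 23    # segments per recording, as stated above
SEGMENT_LEN = 178  # time points per segment, as stated above

def segment_recording(signal: np.ndarray) -> np.ndarray:
    """Split one 1-D recording into an (N_SEGMENTS, SEGMENT_LEN) array."""
    usable = N_SEGMENTS * SEGMENT_LEN  # 4094 of the 4097 points
    # Assumption: the leftover points at the end are discarded.
    return signal[:usable].reshape(N_SEGMENTS, SEGMENT_LEN)

# Synthetic stand-in for 500 recordings of 4097 points each.
rng = np.random.default_rng(0)
recordings = rng.standard_normal((500, 4097))
segments = np.concatenate([segment_recording(r) for r in recordings])
print(segments.shape)  # (11500, 178)
```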

Class Labels

The dataset is originally multi-class, with labels representing different brain states:

1 – Seizure activity (epileptic seizure).
2 – EEG recorded from the brain region affected by a tumor.
3 – EEG recorded from a healthy brain region in a tumor-affected patient.
4 – EEG recorded with eyes closed.
5 – EEG recorded with eyes open.

For seizure detection tasks, this dataset is often used for binary classification, where class 1 (seizure activity) is distinguished from all other classes.
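A minimal sketch of that binary relabeling, using a small hand-written label array instead of the real `y` column (the function name `to_binary` is hypothetical):

```python
import numpy as np

def to_binary(labels) -> np.ndarray:
    """Map class 1 (seizure) to 1 and classes 2-5 to 0."""
    labels = np.asarray(labels)
    return (labels == 1).astype(int)

y = np.array([1, 2, 3, 4, 5, 1])
y_bin = to_binary(y)
print(y_bin)  # [1 0 0 0 0 1]
```

Merging four classes into the negative class makes the positive class a minority, which is one reason to report a metric such as the F1-score rather than accuracy alone.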

For more details, refer to the Kaggle dataset page.

Organization of this Repository

This repository is structured into a series of notebooks, each focusing on a key stage of the epileptic seizure detection pipeline. The organization is as follows:

0_initial_analysis_and_preprocessing.ipynb: This notebook performs a brief exploratory data analysis (EDA): it checks for missing values and class imbalance and visualizes sample EEG signals.
1_feature_engineering.ipynb: This notebook extracts features from the EEG signals to improve model performance, including time-domain and frequency-domain features and other signal processing techniques.
2_dimensionality_reduction.ipynb: Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are applied to reduce the feature space while preserving essential information.
3_feature_selection.ipynb: Unlike dimensionality reduction, which transforms the feature space, this step selects the most relevant of the existing features for seizure classification. Several feature selection techniques are explored to improve model efficiency and interpretability.
4_models_and_eval.ipynb: This notebook implements different classifiers, evaluates their performance using appropriate metrics, and compares the approaches to identify the most effective models and feature sets.
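To give a flavor of the feature engineering stage, the sketch below computes a few time-domain and frequency-domain features for one segment. The exact feature set used in the notebooks is not listed here, so the features chosen (mean, standard deviation, line length, alpha and beta band power) and the sampling rate `FS` are assumptions for illustration; the band limits follow the alpha and beta ranges given in the Problem Context section.

```python
import numpy as np

FS = 178.0  # assumed sampling rate in Hz (178 points per roughly 1 s segment)

def band_power(segment, fs, low, high):
    """Mean spectral power of `segment` in the band [low, high) Hz."""
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segment)) ** 2
    mask = (freqs >= low) & (freqs < high)
    return power[mask].mean()

def extract_features(segment):
    """Return a small dict of time-domain and frequency-domain features."""
    segment = np.asarray(segment, dtype=float)
    return {
        "mean": segment.mean(),
        "std": segment.std(),
        "line_length": np.abs(np.diff(segment)).sum(),
        "alpha_power": band_power(segment, FS, 8, 12),
        "beta_power": band_power(segment, FS, 12, 30),
    }

# Synthetic check: a 10 Hz sine should carry far more alpha than beta power.
t = np.arange(178) / FS
features = extract_features(np.sin(2 * np.pi * 10 * t))
print(features["alpha_power"] > features["beta_power"])  # True
```

Applying `extract_features` to every row of the segment matrix would yield a compact feature table that the later dimensionality reduction, feature selection and modeling stages could consume.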

