The objective of this project was to apply two clustering algorithms, K-Means and DBSCAN, to a dataset that is normally distributed around a mean, and to analyze the results.
- Performed Exploratory Data Analysis (EDA) with heatmaps, boxplots, and distribution plots.
- Applied K-Means clustering and plotted the distortion against the number of clusters (elbow curve).
- Applied DBSCAN clustering and tuned its parameters (`eps`, `min_samples`).
- Compared results from both algorithms.
- The dataset resembled a single spherical distribution with no distinct natural clusters.
- K-Means divided it into "pizza slice" shaped regions.
- DBSCAN mostly found a single cluster, with the number of points flagged as outliers varying with `eps` and `min_samples`.
- Conclusion: The dataset has no inherent cluster structure, so the partitions produced by clustering are not meaningful here.
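
The behavior described above can be reproduced with a minimal sketch. It does not use the project's data. As an assumption, it generates a synthetic two-dimensional Gaussian blob as a stand-in, and the `k` range, `eps` values, and `min_samples` value are illustrative choices rather than the settings used in the project.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

# Assumption: a synthetic stand-in for the project's dataset,
# 1000 points drawn from one spherical 2-D normal distribution.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))

# K-Means: record the distortion (scikit-learn's `inertia_`) for each k.
# Plotting `inertias` against `ks` gives the elbow curve.
ks = list(range(1, 9))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in ks
]


def dbscan_summary(data, eps, min_samples):
    """Return (number of clusters, number of noise points) for one DBSCAN run."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(data)
    # DBSCAN labels noise points as -1; every other label is a cluster id.
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise = int(np.sum(labels == -1))
    return n_clusters, n_noise


# DBSCAN: vary eps (illustrative values) with a fixed min_samples.
eps_values = [0.3, 0.5, 0.8]
summaries = {eps: dbscan_summary(X, eps, min_samples=5) for eps in eps_values}

for k, inertia in zip(ks, inertias):
    print(f"k={k}: distortion={inertia:.1f}")
for eps, (n_clusters, n_noise) in summaries.items():
    print(f"eps={eps}: clusters={n_clusters}, noise points={n_noise}")
```

On data like this, the distortion keeps dropping as `k` grows without a sharp elbow, and the DBSCAN noise count shrinks as `eps` increases. Both are consistent with the absence of natural clusters.
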
This project highlights that not all datasets are suitable for clustering.
Recognizing when clustering fails is just as important as recognizing when it succeeds.
- Python, NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn
- Google Colab



