This project focuses on predicting customer churn, that is, determining whether a customer will leave a company’s service.
Such predictive modeling helps businesses (especially in telecom, banking, and SaaS industries) to identify at-risk customers and take early retention actions.
- Name: Telco Customer Churn Dataset
- Source: Kaggle Dataset
- Size: 7032 records and 21 features
- Content: Customer information, service usage, contract types, payment details, and churn status.
- Inspected and handled missing values
- Removed irrelevant columns (`customerID`)
- Converted categorical variables using `LabelEncoder`
- Scaled numerical features as needed
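The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the tiny DataFrame is made up, the choice to drop rows with missing values and the use of `StandardScaler` are assumptions, and only `customerID` and `LabelEncoder` come from the list above.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy sample that mimics a few Telco-style columns (not real data).
df = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C", "0004-D"],
    "Contract": ["Month-to-month", "Two year", "One year", "Month-to-month"],
    "tenure": [1, 60, 24, 5],
    "TotalCharges": ["29.85", "5000.10", " ", "150.00"],  # blank string = missing
    "Churn": ["Yes", "No", "No", "Yes"],
})

# 1. Handle missing values: coerce unparsable entries to NaN, then drop those rows
#    (dropping is an assumption; imputation would be another option).
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
df = df.dropna(subset=["TotalCharges"]).reset_index(drop=True)

# 2. Remove the identifier column, which carries no predictive signal.
df = df.drop(columns=["customerID"])

# 3. Label-encode each categorical column, keeping one encoder per column
#    so the mapping can be inverted later.
cat_cols = ["Contract", "Churn"]
encoders = {}
for col in cat_cols:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

# 4. Scale numerical features to zero mean and unit variance.
num_cols = ["tenure", "TotalCharges"]
df[num_cols] = StandardScaler().fit_transform(df[num_cols])

print(df)
```

Keeping the fitted encoders in a dictionary makes it possible to call `inverse_transform` later and recover the original category names.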
- Visualized distributions (`countplot`, `boxplot`, `violinplot`)
- Checked correlations between variables (`corr()`)
- Explored patterns between churn and service types
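A small sketch of this kind of exploration is shown below. The data is a made-up sample, the column names are assumptions chosen to resemble the dataset, and the `Agg` backend is used only so the snippet runs without a display; in a notebook the plots would render inline.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy sample; the real analysis would use the full dataset.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "One year",
                 "Month-to-month", "Month-to-month", "Two year"],
    "tenure": [1, 60, 24, 5, 3, 70],
    "MonthlyCharges": [70.0, 25.0, 55.0, 90.0, 80.0, 20.0],
    "Churn": ["Yes", "No", "No", "Yes", "Yes", "No"],
})

# Distribution plots: churn counts per contract type, tenure split by churn.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.countplot(data=df, x="Contract", hue="Churn", ax=axes[0])
sns.boxplot(data=df, x="Churn", y="tenure", ax=axes[1])
fig.tight_layout()

# Correlations among numeric columns, with churn mapped to 0/1.
numeric = df.assign(Churn=df["Churn"].map({"No": 0, "Yes": 1}))
corr = numeric[["tenure", "MonthlyCharges", "Churn"]].corr()

# Churn rate per contract type, one way to look at churn versus service options.
churn_by_contract = numeric.groupby("Contract")["Churn"].mean()

print(corr.round(2))
print(churn_by_contract)
```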
The following classification models were built and evaluated:
- Logistic Regression
- Support Vector Machine (SVM)
- Decision Tree
- Random Forest
- AdaBoost
- Gradient Boosting
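As a rough sketch, the six models listed above could be fitted and scored in one loop. The data here is synthetic (`make_classification` stands in for the preprocessed churn features), and the 80/20 stratified split and the `random_state` values are assumptions made for reproducibility.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed churn data (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Default hyperparameters throughout; random_state is set only for repeatability.
models = {
    "Logistic Regression": LogisticRegression(),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each model and record its accuracy on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```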
Hyperparameter tuning was applied using GridSearchCV for the Decision Tree (as an example). The remaining models were trained with their default hyperparameters.
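A grid search over a Decision Tree might look like the sketch below. The parameter grid, the 5-fold cross-validation, and the synthetic data are all assumptions for illustration; the grid actually used in the notebook is not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hypothetical search space: 3 * 2 * 2 = 12 candidate settings.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_split": [2, 10],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation per candidate
    scoring="accuracy",
)
search.fit(X, y)

best_tree = search.best_estimator_  # refitted on all of X, y with the best settings
print(search.best_params_)
print(round(search.best_score_, 3))
```

Because `refit=True` is the default, `best_tree` is ready to use for prediction without a separate training step.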
The models were compared using multiple performance metrics:
For numerical analysis:
- classification_report, which includes:
  - Accuracy
  - Precision
  - Recall
  - F1-score

For visual analysis:
- ConfusionMatrixDisplay
A summary table and visual comparison were created to display the results clearly.
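One way to assemble such a summary table is to collect the metrics per model into a pandas DataFrame. The sketch below uses synthetic data and only two of the models to stay short; the column names and the sort order are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed churn data (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=1),
    "Gradient Boosting": GradientBoostingClassifier(random_state=1),
}

# One row of metrics per model.
rows = []
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    rows.append({
        "Model": name,
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred),
        "Recall": recall_score(y_test, y_pred),
        "F1": f1_score(y_test, y_pred),
    })

# Best-performing model first.
summary = (
    pd.DataFrame(rows)
    .set_index("Model")
    .sort_values("Accuracy", ascending=False)
)
print(summary.round(3))
```

Calling `summary.plot.bar()` on the resulting table would give a grouped bar chart for a quick visual comparison.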
The Gradient Boosting and AdaBoost classifiers achieved the highest accuracy. Of the two, AdaBoost reached approximately 81% accuracy on the test set.
Top influential features:
- Contract Type
- TechSupport Availability
- Customer Tenure
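How this ranking was derived is not stated here. One common approach with tree ensembles is to read the fitted model's `feature_importances_` attribute, sketched below on synthetic data. The column names `signal`, `noise_a`, and `noise_b` are purely illustrative, and the choice of a Random Forest is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the target depends only on the "signal" column.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "signal": rng.normal(size=400),
    "noise_a": rng.normal(size=400),
    "noise_b": rng.normal(size=400),
})
y = (X["signal"] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, labelled by column and sorted from high to low.
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances.round(3))
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a reasonable cross-check.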
- pandas
- matplotlib
- seaborn
- scikit-learn
- joblib
- warnings
git clone github
cd Churn_Prediction
pip install -r requirements.txt
jupyter notebook notebooks/Churn_Prediction.ipynb
Run all cells to train and evaluate the model.
Author: Ali
Field: Data Science & Machine Learning Student
Email: ali.hz87980@gmail.com
GitHub: ali-119