SCT_DS_03

🏦 Bank Marketing Decision Tree Classifier

📋 Task Description

Build a decision tree classifier to predict whether a customer will purchase a product or service based on their demographic and behavioral data.

🔍 How was it done?

This project uses the Bank Marketing dataset from the UCI Machine Learning Repository, which contains 41,188 customer records. The decision tree classifier was built in the following steps:

- Data Loading & Exploration: Loaded the dataset and analyzed basic statistics
- Data Preprocessing: Encoded categorical variables (job, marital status, education, etc.) using Label Encoding
- Train-Test Split: Split data into 80% training and 20% testing sets
- Model Building: Created a Decision Tree Classifier with max_depth=5 to prevent overfitting
- Evaluation: Assessed model performance using accuracy, confusion matrix, and classification report
- Feature Analysis: Identified the most important features influencing customer decisions

Python libraries including pandas, scikit-learn, matplotlib, and seaborn were used for data processing, modeling, and visualization.
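
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the `make_toy_bank_data` and `encode_categoricals` helpers, the synthetic rows, and `random_state=42` are all hypothetical and only stand in for the real CSV and the author's settings.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier


def make_toy_bank_data(n_rows=500, seed=0):
    """Hypothetical stand-in for the real dataset: a few columns, synthetic values."""
    rng = np.random.default_rng(seed)
    duration = rng.integers(10, 1500, size=n_rows)
    df = pd.DataFrame({
        "age": rng.integers(18, 90, size=n_rows),
        "job": rng.choice(["admin.", "technician", "services"], size=n_rows),
        "marital": rng.choice(["married", "single", "divorced"], size=n_rows),
        "duration": duration,
    })
    # Synthetic rule so the tree has something to learn: long calls -> "yes".
    df["y"] = np.where(duration > 900, "yes", "no")
    return df


def encode_categoricals(df):
    """Label-encode every non-numeric column and return a new DataFrame."""
    encoded = df.copy()
    for col in encoded.columns:
        if not pd.api.types.is_numeric_dtype(encoded[col]):
            encoded[col] = LabelEncoder().fit_transform(encoded[col])
    return encoded


df = encode_categoricals(make_toy_bank_data())
X = df.drop(columns="y")
y = df["y"]

# 80% training, 20% testing, as described in the steps.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_depth=5 matches the depth limit described above.
model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy on toy data: {test_accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic rule in the toy data and says nothing about the real dataset.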

📊 Dataset Source

UCI Machine Learning Repository - Bank Marketing Dataset

URL: https://archive.ics.uci.edu/dataset/222/bank+marketing
Records: 41,188 customers
Attributes: 20 features including age, job, marital status, education, contact type, campaign details, and economic indicators
Target: Whether the client subscribed to a term deposit (yes/no)
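
As a hedged loading sketch: the UCI CSV files for this dataset are semicolon-delimited, so `sep=";"` is needed when reading them with pandas. The `load_bank_data` helper and the sample rows below are hypothetical (the rows are made up for illustration and show only a subset of columns).

```python
import io

import pandas as pd

# Illustrative rows only (not copied from the dataset), with a subset of columns.
SAMPLE_CSV = '''"age";"job";"marital";"duration";"euribor3m";"y"
41;"technician";"married";210;4.86;"no"
29;"admin.";"single";655;1.25;"yes"
53;"services";"divorced";98;4.96;"no"
'''


def load_bank_data(source):
    """Read a semicolon-delimited bank marketing file (path or file-like object)."""
    return pd.read_csv(source, sep=";")


df = load_bank_data(io.StringIO(SAMPLE_CSV))
print(df.shape)                              # (rows, columns)
print(df["y"].value_counts(normalize=True))  # class balance of the target
```

With the real data, the same helper would take a file path instead, for example `load_bank_data("bank-additional-full.csv")`, assuming that is the file the notebook reads.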

💡 Key Findings

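Test Accuracy: ~90% on the held-out test set. Because 88.7% of customers did not subscribe, this figure should be read alongside the per-class metrics in the classification report
Class Imbalance: Only 11.3% of customers subscribed to term deposits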
Top Important Features:
- Duration of last contact
- Economic indicators (euribor3m, emp.var.rate)
- Number of previous contacts
- Customer age
Decision Tree Depth: 5 levels effectively capture patterns without overfitting
Model Generalization: Similar training and testing accuracy indicates good generalization
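
To put the accuracy figure in context, the class counts reported in the Results Summary give the accuracy of a trivial model that always predicts "no". A small arithmetic sketch (variable names are illustrative):

```python
# Class counts as reported in the Results Summary.
subscribed = 4_640
not_subscribed = 36_548
total = subscribed + not_subscribed

positive_rate = subscribed / total
# Accuracy of a model that always predicts the majority class ("no").
majority_baseline = max(subscribed, not_subscribed) / total

print(f"Positive rate: {positive_rate:.1%}")               # 11.3%
print(f"Majority-class baseline: {majority_baseline:.1%}")  # 88.7%
```

A test accuracy of about 90% is therefore only slightly above this baseline, which is why precision and recall for the "yes" class are worth checking.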

🎨 Visualizations

Confusion Matrix

Shows the model's prediction accuracy across both classes (subscribed vs not subscribed)
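
A minimal sketch of how such a matrix can be computed with scikit-learn. The labels below are made up for illustration (1 = subscribed, 0 = not subscribed) and are not the project's results.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels for illustration: 1 = subscribed, 0 = not subscribed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]

# Rows are true classes, columns are predicted classes, in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(cm)
print(f"Recall for the 'yes' class: {tp / (tp + fn):.2f}")
print(classification_report(y_true, y_pred, target_names=["no", "yes"]))
```

In a notebook, the matrix could then be drawn with `seaborn.heatmap(cm, annot=True, fmt="d")`.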

Feature Importance

Identifies which customer attributes most influence subscription decisions
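
As a rough illustration, a fitted scikit-learn tree exposes impurity-based importances through `feature_importances_`. The data below is synthetic and the target depends only on `duration` by construction, so the ranking shown is a property of the toy data, not of the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_rows = 400

# Synthetic features; column names are borrowed from the dataset for readability.
X = pd.DataFrame({
    "duration": rng.integers(10, 1500, size=n_rows),
    "age": rng.integers(18, 90, size=n_rows),
    "campaign": rng.integers(1, 10, size=n_rows),
})
# Toy target that depends only on call duration.
y = (X["duration"] > 900).astype(int)

model = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X, y)

# Importances are normalized to sum to 1; sort so the strongest feature comes first.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```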

Decision Tree Structure

Visual representation of the decision-making process at each node
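
One way to inspect the learned rules without plotting is scikit-learn's `export_text`, sketched here on a tiny made-up dataset with a single hypothetical `duration` feature.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny made-up dataset: short calls -> class 0, long calls -> class 1.
X = [[100], [200], [300], [1000], [1100], [1200]]
y = [0, 0, 0, 1, 1, 1]

model = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X, y)

# Text rendering of the split rules, one line per node.
rules = export_text(model, feature_names=["duration"])
print(rules)
```

For a graphical version in a notebook, `sklearn.tree.plot_tree(model, feature_names=[...], filled=True)` draws the same structure with matplotlib.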

🔧 Tech Stack

Python 3.8+
Pandas - Data manipulation
Scikit-learn - Machine learning model
Matplotlib - Data visualization
Seaborn - Enhanced visualizations
NumPy - Numerical operations
Jupyter Notebook - Interactive analysis

📈 Results Summary

| Metric | Value | Description |
| --- | --- | --- |
| Total Records | 41,188 | Complete dataset size |
| Training Set | 32,950 (80%) | Data used for training |
| Testing Set | 8,238 (20%) | Data used for evaluation |
| Test Accuracy | ~90% | Model performance |
| Subscribed (Yes) | 4,640 (11.3%) | Positive class |
| Not Subscribed (No) | 36,548 (88.7%) | Negative class |
| Tree Depth | 5 levels | Complexity control |
| Important Features | Duration, Economic indicators | Top predictors |
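
The split sizes in the summary follow from the 80/20 ratio. A short arithmetic check, assuming the test-set size is rounded up (which is how scikit-learn's `train_test_split` treats a fractional `test_size`):

```python
import math

total_records = 41_188
test_fraction = 0.2

# Test size is rounded up; the training set takes the remainder.
n_test = math.ceil(total_records * test_fraction)
n_train = total_records - n_test

print(n_train, n_test)  # 32950 8238
```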

