gaurav510610/Employee-Attrition-EDA-ML-PowerBI


Employee Attrition Analysis & Prediction

HR Analytics | IBM HR Dataset

Project Overview

This project analyzes employee attrition patterns to identify key factors associated with employees leaving an organization. The analysis is performed using Python for exploratory data analysis and machine learning, and insights are presented through an interactive Power BI dashboard.

The focus of the project is to derive clear, data-driven insights that can help HR teams understand attrition drivers and support informed workforce decisions.


Tools & Technologies

  • Python: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
  • Jupyter Notebook: Data analysis & modeling
  • Power BI: Interactive dashboards & KPIs
  • GitHub: Version control & project sharing

Dataset

  • Source: IBM HR Analytics Employee Attrition Dataset (Kaggle)
  • Records: 1,470 employees
  • Target Variable: Attrition (Yes / No)

Analysis Workflow

1. Data Cleaning & Preprocessing

  • Removed irrelevant columns
  • Created attrition flag and meaningful buckets (age, salary, experience, tenure)
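
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up, the column names `Age`, `MonthlyIncome`, `Attrition`, and `EmployeeCount` are assumed to match the public Kaggle file, and the derived names (`AttritionFlag`, `AgeGroup`, `SalaryBucket`) and all bucket edges are hypothetical.

```python
import pandas as pd

# Tiny made-up stand-in for the HR data; the real dataset has 1,470 rows.
df = pd.DataFrame({
    "Age": [22, 29, 35, 41, 58],
    "MonthlyIncome": [2100, 3400, 5200, 9800, 16500],
    "Attrition": ["Yes", "No", "No", "Yes", "No"],
    "EmployeeCount": [1, 1, 1, 1, 1],  # same value on every row
})

# Drop columns that hold a single value for every row (no information).
constant_cols = [c for c in df.columns if df[c].nunique() == 1]
df = df.drop(columns=constant_cols)

# Binary attrition flag: 1 = left, 0 = stayed.
df["AttritionFlag"] = (df["Attrition"] == "Yes").astype(int)

# Bucket edges below are illustrative assumptions, not the project's cut points.
# pd.cut uses right-closed intervals by default, e.g. (17, 25].
df["AgeGroup"] = pd.cut(
    df["Age"],
    bins=[17, 25, 35, 45, 60],
    labels=["18-25", "26-35", "36-45", "46-60"],
)
df["SalaryBucket"] = pd.cut(
    df["MonthlyIncome"],
    bins=[0, 3000, 6000, 10000, 20000],
    labels=["<3K", "3K-6K", "6K-10K", "10K+"],
)

print(df)
```

Keeping the original `Attrition` column next to the numeric flag makes it easy to use the text label in charts and the flag in rate calculations.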

2. Exploratory Data Analysis (EDA)

  • Demographic analysis (Age, Gender, Marital Status, Education)
  • Compensation analysis (Monthly Income, Salary Buckets)
  • Job & role analysis (Department, Job Role, Job Level)
  • Experience & engagement analysis (Experience, Years at Company, Satisfaction, Work-Life Balance)
  • Work conditions (Overtime, Business Travel, Distance from Home)
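
Most of the EDA cuts listed above reduce to the same calculation: headcount and attrition rate per category. A minimal sketch, assuming `Attrition` holds "Yes"/"No" values; the helper name `attrition_rate_by` and the sample rows are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "OverTime": ["Yes", "Yes", "No", "No", "No", "Yes"],
    "Attrition": ["Yes", "No", "No", "No", "Yes", "Yes"],
})

def attrition_rate_by(frame, column):
    """Return headcount, leavers, and attrition rate (%) per value of `column`."""
    summary = (
        frame.assign(left=frame["Attrition"].eq("Yes"))
        .groupby(column)["left"]
        .agg(employees="count", left="sum")
    )
    summary["attrition_rate_pct"] = (
        100 * summary["left"] / summary["employees"]
    ).round(1)
    return summary

rates = attrition_rate_by(df, "OverTime")
print(rates)
```

The same helper could be called with any categorical or bucketed column, for example a department or age-group column, to produce the other breakdowns.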

3. Feature Engineering

  • Bucket creation for continuous variables
  • Encoding categorical variables
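
One common way to encode categorical variables is to map binary columns to 0/1 and one-hot encode the rest. The sketch below assumes that approach; the README does not say which encoding the notebook uses, and the sample rows are made up.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Department": ["Sales", "Research & Development", "Sales"],
    "OverTime": ["Yes", "No", "No"],
    "MonthlyIncome": [2100, 5200, 9800],
})

# Binary Yes/No column -> 1/0.
df["OverTime"] = df["OverTime"].map({"Yes": 1, "No": 0})

# Multi-valued column -> indicator columns. drop_first=True removes one
# category per feature, which avoids a redundant column in linear models.
encoded = pd.get_dummies(df, columns=["Department"], drop_first=True, dtype=int)

print(encoded)
```

Dropping the first level matters mainly for Logistic Regression; tree-based models such as Random Forest work with or without it.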

4. Predictive Modeling

  • Logistic Regression
  • Random Forest Classifier
  • Model evaluation using Accuracy, ROC-AUC, Precision & Recall
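
The modeling step can be sketched as below. This is an assumption-laden illustration rather than the notebook's code: it uses synthetic data from `make_classification` in place of the encoded HR features, an assumed class balance of about 16% positives, an assumed 75/25 stratified split, and hyperparameters chosen only for the example. The printed scores therefore say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the encoded feature matrix.
X, y = make_classification(n_samples=1470, n_features=12, n_informative=6,
                           weights=[0.84, 0.16], random_state=42)

# Stratify so train and test keep the same share of positives.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    # Scaling helps Logistic Regression converge; class_weight="balanced"
    # is one (assumed) way to favor recall on the minority class.
    "Logistic Regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

Reporting precision and recall alongside accuracy is useful here because, with an imbalanced target, a model can score high accuracy while missing most employees who actually leave.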

5. Visualization

  • 3-page Power BI dashboard with slicers and KPIs

Key Insights

  • Attrition is highest among younger employees and those early in their careers (freshers)
  • Employees in the lowest salary buckets show markedly higher attrition
  • Employees working overtime have much higher attrition risk
  • Entry-level job roles and lower job levels experience more attrition
  • Higher job involvement, satisfaction, and work-life balance are associated with lower attrition
  • Employees with frequent business travel are more likely to leave

Machine Learning Summary

Exploratory machine learning models were built to assess the feasibility of predicting employee attrition from historical HR data.

Logistic Regression

  • Accuracy: ~77%
  • ROC-AUC: ~0.80
  • Demonstrated strong recall for attrition cases, making it useful for identifying employees at higher risk of leaving.

Random Forest

  • Accuracy: ~85%
  • Highlighted important drivers such as Monthly Income, Age, Experience, Overtime, and Satisfaction levels.
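
Drivers like these are typically read from a fitted forest's `feature_importances_` attribute. The sketch below assumes that is how the ranking was obtained, which the README does not confirm. The data is synthetic and the feature names are placeholders attached to random columns, so the resulting order is meaningless and only the mechanics carry over.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names attached to synthetic columns (assumption for readability).
feature_names = ["MonthlyIncome", "Age", "TotalWorkingYears",
                 "OverTime", "JobSatisfaction"]

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances; they are non-negative and sum to 1.
importances = (
    pd.Series(forest.feature_importances_, index=feature_names)
    .sort_values(ascending=False)
)
print(importances.round(3))
```

Impurity-based importances tend to favor continuous and high-cardinality features, so a ranking like this is best treated as a starting point for investigation rather than a measure of causal effect.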

Power BI Dashboard

The Power BI dashboard consists of three pages:

1. Executive Summary

  • Total Employees
  • Employees Left vs Stayed
  • Attrition Rate (%)
  • Attrition by Age Group
  • Attrition by Salary Range
  • Attrition by Overtime

2. Workforce & Job Analysis

  • Attrition by Department
  • Attrition by Job Role
  • Attrition by Job Level
  • Attrition by Gender & Marital Status

3. Experience & Engagement Analysis

  • Attrition by Experience Bucket
  • Attrition by Years at Company
  • Attrition by Job Satisfaction
  • Attrition by Job Involvement
  • Attrition by Work-Life Balance

Power BI Dashboard Overview

Executive Summary

[Screenshot: Executive Summary]

Workforce & Job Analysis

[Screenshot: Workforce Analysis]

Experience, Engagement & Work Conditions

[Screenshot: Experience & Engagement]


Repository Structure

Employee-Attrition-EDA-ML-PowerBI/
├── Data/ # Raw & cleaned datasets
├── notebooks/ # Jupyter notebook (EDA + ML)
├── power_bi/ # Power BI dashboard (.pbix)
├── Screenshots/ # Dashboard Screenshots
└── README.md

Business Value

The insights from this project can help HR teams:

  • Identify employee segments with higher attrition risk
  • Understand the impact of compensation, workload, and engagement factors
  • Support data-informed retention and workforce planning decisions

Author

Gaurav Singh
Data Analyst | Python | SQL | Power BI | Machine Learning

LinkedIn: https://www.linkedin.com/in/gaurav-singh-604492340/

Email: gaurav510610@gmail.com
