Skip to content

vishantrathi/world-layoffs-data-cleaning-sql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

World Layoffs Data Cleaning (MySQL Project)

Project Overview

In this project, I cleaned and transformed a real-world layoffs dataset using MySQL. The raw dataset contained duplicate records, inconsistent text formatting, null values, and incorrect date formats.

The goal was to prepare the dataset for analysis by making it clean, structured, and ready for further use in dashboards or data analysis.


Tools Used

  • MySQL Workbench
  • MySQL 8+
  • SQL (CTE, Window Functions, Joins)
  • CSV Dataset

Dataset Information

The dataset includes the following columns:

  • Company
  • Location
  • Industry
  • Total Laid Off
  • Percentage Laid Off
  • Date
  • Stage
  • Country
  • Funds Raised (Millions)

The original raw dataset is stored inside the /data folder.


Data Cleaning Steps

  1. Created staging tables to avoid modifying the original dataset.

  2. Removed duplicate records using the ROW_NUMBER() window function and deleted rows where row_num > 1.

  3. Standardized text data:

    • Trimmed extra spaces from company names
    • Standardized industry values (e.g., "Crypto%" to "Crypto")
    • Cleaned country names (removed trailing periods like "United States.")
  4. Converted the date column from text format to proper DATE format using STR_TO_DATE().

  5. Handled null and blank values:

    • Updated missing industry values using self joins
    • Removed rows where both total_laid_off and percentage_laid_off were NULL
  6. Dropped helper columns used during cleaning.


SQL Concepts Used

  • Common Table Expressions (CTE)
  • Window Functions (ROW_NUMBER)
  • Self Joins
  • String Functions (TRIM, LIKE)
  • Date Conversion (STR_TO_DATE)
  • NULL Handling

Final Outcome

After cleaning, the dataset is:

  • Free from duplicate records
  • Properly formatted
  • Standardized
  • Ready for data analysis or visualization

Your Name Aspiring Data Analyst | SQL | Data Cleaning | Analytics

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors