VibeHarboe/Exploratory-Data-Analysis-in-SQL

Exploratory Data Analysis in SQL 🚀

Welcome to the Exploratory Data Analysis (EDA) in SQL repository! 🎯 This project showcases essential SQL skills and techniques learned through DataCamp's Exploratory Data Analysis in SQL course. 📚

Project Overview 📌

This repository demonstrates key concepts, methodologies, and practical applications of exploratory data analysis using PostgreSQL. You'll find structured SQL scripts, comprehensive documentation, and examples of real-world analyses. 🔍

Repository Structure 🗂️

Exploratory-Data-Analysis-in-SQL/
├── LICENSE
├── README.md
├── certificate/
│   └── README.md
├── data/
│   ├── README.md
│   ├── erdiagram.png
│   ├── ev311.csv
│   ├── fortune.csv
│   ├── sql_eda_dbcreate.sql.txt
│   └── stackexchange.csv
├── docs/
│   ├── README.md
│   ├── data-types-and-casting.md
│   ├── date-truncation-and-time-series.md
│   ├── eda-overview.md
│   ├── generate-series-and-binning.md
│   ├── grouping-and-aggregation.md
│   ├── outlier-handling-and-percentiles.md
│   ├── summary-functions.md
│   ├── text-cleaning-and-standardization.md
│   └── time-difference-and-lag.md
├── sql/
│   ├── 01_Database_Structure_and_Schema_Exploration.sql
│   ├── 02_Numeric_Summary_and_Distributions.sql
│   ├── 03_Text_and_Categorical_Analysis.sql
│   ├── 04_Event_Timing_and_Duration.sql
│   └── README.md
└── visuals/
    ├── README.md
    └── Data-distribution-example.png

Key SQL Topics 🌟

1. Database Structure 🗃️

  • Understanding tables, primary keys, foreign keys, and constraints.
  • Exploring the database with information_schema queries that list tables, columns, and constraints.
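
As a rough illustration of this kind of schema exploration, the sketch below queries the standard information_schema views. It assumes the course tables were loaded into the default public schema, which this README does not state; adjust the schema name if yours differs.

```sql
-- List every column in the (assumed) public schema with its declared type.
SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;

-- Show which columns take part in primary key and foreign key constraints.
SELECT tc.table_name, tc.constraint_type, kcu.column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
  ON tc.constraint_name = kcu.constraint_name
 AND tc.table_schema = kcu.table_schema
WHERE tc.table_schema = 'public'
  AND tc.constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY')
ORDER BY tc.table_name, tc.constraint_type, kcu.ordinal_position;
```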

2. Numeric Data Analysis 📊

  • Aggregation and summary statistics: AVG, MIN, MAX, SUM.
  • Variance, standard deviation, correlation, and binning techniques.
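
A minimal sketch of these summaries, using an inline VALUES list so it runs without any of the repository's tables. The column names revenue and profit and the figures themselves are made up for illustration and are not taken from the course data.

```sql
-- Summary statistics over a small, hypothetical sample.
WITH sample(revenue, profit) AS (
    VALUES (100.0, 10.0), (250.0, 30.0), (400.0, 35.0), (900.0, 120.0)
)
SELECT count(*)             AS n,
       min(revenue)         AS min_revenue,
       max(revenue)         AS max_revenue,
       avg(revenue)         AS avg_revenue,
       sum(revenue)         AS total_revenue,
       var_samp(revenue)    AS variance,
       stddev_samp(revenue) AS std_dev,
       corr(revenue, profit) AS revenue_profit_corr
FROM sample;

-- Binning: build the bin edges with generate_series, then count values per bin.
-- The LEFT JOIN keeps empty bins in the output with a count of zero.
WITH sample(revenue) AS (
    VALUES (100.0), (250.0), (400.0), (900.0)
),
bins AS (
    SELECT generate_series(0, 750, 250)    AS lower,
           generate_series(250, 1000, 250) AS upper
)
SELECT b.lower, b.upper, count(s.revenue) AS n
FROM bins AS b
LEFT JOIN sample AS s
  ON s.revenue >= b.lower AND s.revenue < b.upper
GROUP BY b.lower, b.upper
ORDER BY b.lower;
```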

3. Text Data Analysis 📝

  • Character types (char, varchar, text), case conversion (UPPER, LOWER).
  • Splitting, trimming, and concatenating text data.
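
The following sketch shows how these text functions might be combined to standardize a messy field. The street column and its values are hypothetical and stand in for whatever free-text column is being cleaned.

```sql
-- Trim, normalize case, split, and concatenate a hypothetical text column.
WITH raw(street) AS (
    VALUES ('  Main St '), ('MAIN ST'), ('main st, Apt 4')
)
SELECT street                                 AS original,
       lower(trim(street))                    AS cleaned,       -- strip spaces, lowercase
       upper(trim(street))                    AS shouted,       -- strip spaces, uppercase
       split_part(trim(street), ',', 1)       AS before_comma,  -- text before the first comma
       concat(trim(street), ' [', length(trim(street)), ' chars]') AS labeled
FROM raw;
```

Grouping on the cleaned value rather than the original is what lets variants such as '  Main St ' and 'MAIN ST' be counted together.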

4. Date and Time Analysis 📅

  • Handling dates, timestamps, and intervals.
  • Aggregation by date/time, date_trunc, extract, and generating date/time series.
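
As one possible illustration, the sketch below counts records per month and joins them to a generated month series so that months with no records still appear. The requests data and its column names (date_created, date_completed) are invented for the example.

```sql
-- Monthly counts and average completion time over a hypothetical request log.
WITH requests(date_created, date_completed) AS (
    VALUES (TIMESTAMP '2017-01-03 09:15', TIMESTAMP '2017-01-04 12:00'),
           (TIMESTAMP '2017-01-20 14:30', TIMESTAMP '2017-01-20 18:45'),
           (TIMESTAMP '2017-03-11 08:00', TIMESTAMP '2017-03-15 08:00')
),
months AS (
    -- One row per month, including months that have no requests.
    SELECT generate_series(TIMESTAMP '2017-01-01',
                           TIMESTAMP '2017-04-01',
                           INTERVAL '1 month') AS month
)
SELECT m.month,
       count(r.date_created)                   AS requests_created,
       avg(r.date_completed - r.date_created)  AS avg_time_to_complete
FROM months AS m
LEFT JOIN requests AS r
  ON date_trunc('month', r.date_created) = m.month
GROUP BY m.month
ORDER BY m.month;

-- extract pulls a single field out of a timestamp, for example the hour of day.
SELECT extract(hour FROM TIMESTAMP '2017-01-03 09:15') AS hour_of_day;
```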

Practical Examples 💡

  • SQL scripts in the /sql directory illustrate typical EDA tasks such as:

    • Summarizing data distributions and identifying outliers.
    • Cleaning text fields for consistent analysis.
    • Aggregating data by time intervals to identify trends and patterns.
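
To give a flavor of the outlier task, here is a small sketch that flags values outside 1.5 times the interquartile range. The 1.5 × IQR rule is a common convention chosen for this example and is not necessarily the rule used in the repository's scripts; the sample table and its values are also made up.

```sql
-- Flag outliers with the (assumed) 1.5 * IQR rule on a hypothetical sample.
WITH sample(id, duration_hours) AS (
    VALUES (1, 2.0), (2, 3.5), (3, 4.0), (4, 4.5),
           (5, 5.0), (6, 5.5), (7, 6.0), (8, 90.0)
),
bounds AS (
    SELECT percentile_cont(0.25) WITHIN GROUP (ORDER BY duration_hours) AS q1,
           percentile_cont(0.75) WITHIN GROUP (ORDER BY duration_hours) AS q3
    FROM sample
)
SELECT s.id, s.duration_hours
FROM sample AS s
CROSS JOIN bounds AS b
WHERE s.duration_hours < b.q1 - 1.5 * (b.q3 - b.q1)
   OR s.duration_hours > b.q3 + 1.5 * (b.q3 - b.q1)
ORDER BY s.id;
```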

Certification 🎓

Find the course certificate in the /certificate directory, confirming successful completion of DataCamp's Exploratory Data Analysis in SQL.

Documentation 📖

Detailed markdown files in /docs provide clear explanations and examples for each SQL concept, making the repository a valuable reference:

  • Data types and casting
  • Summary functions
  • Grouping and aggregation
  • Text cleaning and standardization
  • Time differences and lagging
  • Series generation and binning
  • Outlier handling and percentiles
  • Date truncation and time-series analysis
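
As a small taste of the time-difference topic, the sketch below uses the lag window function to compute the gap between consecutive events. The events list is hypothetical and the query is only an illustration, not an excerpt from the /docs files.

```sql
-- Gap between each event and the one before it, using lag().
WITH events(event_time) AS (
    VALUES (TIMESTAMP '2017-01-01 08:00'),
           (TIMESTAMP '2017-01-01 09:30'),
           (TIMESTAMP '2017-01-02 07:45')
)
SELECT event_time,
       lag(event_time) OVER (ORDER BY event_time)              AS previous_event,
       event_time - lag(event_time) OVER (ORDER BY event_time) AS gap
FROM events
ORDER BY event_time;
```

The first row has no predecessor, so its previous_event and gap are NULL.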

Data & Visuals 📈

  • The ER diagram (erdiagram.png), the source CSV files, and the database creation script are stored in the /data directory.
  • Visual examples of data distributions and analyses are in the /visuals directory.

License 📜

This project is licensed under the MIT License. See the LICENSE file for details.
