Welcome to the Exploratory Data Analysis (EDA) in SQL repository! 🎯 This project showcases essential SQL skills and techniques learned through DataCamp's Exploratory Data Analysis in SQL course. 📚
This repository demonstrates key concepts, methodologies, and practical applications of exploratory data analysis using PostgreSQL. You'll find structured SQL scripts, comprehensive documentation, and examples of real-world analyses. 🔍
Exploratory-Data-Analysis-in-SQL/
├── LICENSE
├── README.md
├── certificate/
│ └── README.md
├── data/
│ ├── README.md
│ ├── erdiagram.png
│ ├── ev311.csv
│ ├── fortune.csv
│ ├── sql_eda_dbcreate.sql.txt
│ └── stackexchange.csv
├── docs/
│ ├── README.md
│ ├── data-types-and-casting.md
│ ├── date-truncation-and-time-series.md
│ ├── eda-overview.md
│ ├── generate-series-and-binning.md
│ ├── grouping-and-aggregation.md
│ ├── outlier-handling-and-percentiles.md
│ ├── summary-functions.md
│ ├── text-cleaning-and-standardization.md
│ └── time-difference-and-lag.md
├── sql/
│ ├── 01_Database_Structure_and_Schema_Exploration.sql
│ ├── 02_Numeric_Summary_and_Distributions.sql
│ ├── 03_Text_and_Categorical_Analysis.sql
│ ├── 04_Event_Timing_and_Duration.sql
│ └── README.md
└── visuals/
  ├── README.md
  └── Data-distribution-example.png
- Understanding tables, primary keys, foreign keys, and constraints.
- Essential commands for database exploration: `information_schema` queries.
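
As a minimal sketch of the schema exploration described above, the queries below read column and key metadata from `information_schema`. They assume the course tables live in the default `public` schema, which this README does not state.

```sql
-- List every column of every table in the (assumed) public schema.
SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;

-- List primary key and foreign key constraints with the columns they cover.
SELECT tc.table_name,
       tc.constraint_name,
       tc.constraint_type,
       kcu.column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
  ON tc.constraint_name = kcu.constraint_name
 AND tc.table_schema = kcu.table_schema
WHERE tc.table_schema = 'public'
  AND tc.constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY')
ORDER BY tc.table_name, tc.constraint_name, kcu.ordinal_position;
```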
- Aggregation and summary statistics: `AVG`, `MIN`, `MAX`, `SUM`.
- Variance, standard deviation, correlation, and binning techniques.
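
The summary functions and binning listed above could look roughly like this. The `sample_values` data and its `revenue` and `profit` columns are hypothetical, inlined so the queries run on any PostgreSQL database without the course tables.

```sql
-- Summary statistics over a small, made-up data set.
WITH sample_values(revenue, profit) AS (
  VALUES (100.0, 10.0), (250.0, 30.0), (400.0, 35.0), (900.0, 120.0)
)
SELECT AVG(revenue)          AS avg_revenue,
       MIN(revenue)          AS min_revenue,
       MAX(revenue)          AS max_revenue,
       SUM(revenue)          AS total_revenue,
       VAR_SAMP(revenue)     AS variance_revenue,
       STDDEV_SAMP(revenue)  AS stddev_revenue,
       CORR(revenue, profit) AS revenue_profit_corr
FROM sample_values;

-- Binning: build bin edges with generate_series, then count values per bin.
WITH sample_values(revenue) AS (
  VALUES (100), (250), (400), (900)
),
bins AS (
  SELECT generate_series(0, 750, 250)    AS bin_lower,
         generate_series(250, 1000, 250) AS bin_upper
)
SELECT bin_lower, bin_upper, COUNT(revenue) AS n_values
FROM bins
LEFT JOIN sample_values
  ON revenue >= bin_lower
 AND revenue <  bin_upper
GROUP BY bin_lower, bin_upper
ORDER BY bin_lower;
```

The `LEFT JOIN` keeps bins that contain no values, so gaps in the distribution stay visible in the result.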
- Character types (`char`, `varchar`, `text`), case conversion (`UPPER`, `LOWER`).
- Splitting, trimming, and concatenating text data.
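
A small sketch of the text operations listed above, using a hypothetical inline `raw_streets` data set rather than the course tables:

```sql
-- Standardize inconsistent street names (made-up examples).
WITH raw_streets(street) AS (
  VALUES ('  Main St '), ('main st'), ('MAIN ST.')
)
SELECT street                                   AS original,
       LOWER(RTRIM(TRIM(street), '.'))          AS cleaned,      -- trim spaces, drop trailing period, lowercase
       SPLIT_PART(TRIM(street), ' ', 1)         AS first_word,   -- split on a space, keep the first piece
       UPPER(TRIM(street)) || ' (standardized)' AS labelled      -- concatenate with ||
FROM raw_streets;
```

Grouping by the `cleaned` expression instead of the raw column would count all three spellings as one category.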
- Handling dates, timestamps, and intervals.
- Aggregation by date/time, `date_trunc`, `extract`, and generating date/time series.
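
One way the date functions above fit together is sketched below. The `events` data and its `created` column are hypothetical and inlined for illustration.

```sql
-- Count events per month, including months with no events.
WITH events(created) AS (
  VALUES ('2024-01-03 09:15'::timestamp),
         ('2024-01-20 17:40'::timestamp),
         ('2024-03-08 11:05'::timestamp)
),
months AS (
  -- One row per month start across the range of interest.
  SELECT generate_series('2024-01-01'::timestamp,
                         '2024-03-01'::timestamp,
                         INTERVAL '1 month') AS month_start
)
SELECT month_start,
       COUNT(created) AS n_events
FROM months
LEFT JOIN events
  ON date_trunc('month', created) = month_start
GROUP BY month_start
ORDER BY month_start;

-- Pull individual fields out of a timestamp with EXTRACT.
SELECT EXTRACT(DOW  FROM '2024-01-03 09:15'::timestamp) AS day_of_week,  -- 0 = Sunday
       EXTRACT(HOUR FROM '2024-01-03 09:15'::timestamp) AS hour_of_day;
```

Joining against a generated series, rather than grouping the events alone, keeps empty periods in the output as zero counts.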
SQL scripts in the `/sql` directory illustrate typical EDA tasks such as:
- Summarizing data distributions and identifying outliers.
- Cleaning text fields for consistent analysis.
- Aggregating data by time intervals to identify trends and patterns.
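
As an illustration of the outlier task, the sketch below flags values outside 1.5 times the interquartile range. This rule is a common convention assumed here, not necessarily the approach taken in the repository's scripts, and the `measurements` data is made up.

```sql
-- Flag values that fall outside 1.5 * IQR of the quartiles.
WITH measurements(val) AS (
  VALUES (12.0), (14.0), (15.0), (15.5), (16.0), (17.0), (95.0)
),
quartiles AS (
  SELECT percentile_cont(0.25) WITHIN GROUP (ORDER BY val) AS q1,
         percentile_cont(0.75) WITHIN GROUP (ORDER BY val) AS q3
  FROM measurements
)
SELECT m.val
FROM measurements AS m
CROSS JOIN quartiles AS q
WHERE m.val < q.q1 - 1.5 * (q.q3 - q.q1)
   OR m.val > q.q3 + 1.5 * (q.q3 - q.q1)
ORDER BY m.val;
```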
Find the course certificate in the /certificate directory, confirming successful completion of DataCamp's Exploratory Data Analysis in SQL.
Detailed markdown files in /docs provide clear explanations and examples for each SQL concept, making the repository a valuable reference:
- Data types and casting
- Summary functions
- Grouping and aggregation
- Text cleaning and formatting
- Time differences and lagging
- Series generation and binning
- Outlier handling and percentiles
- Date truncation and time-series analysis
- ER diagrams and relevant images are stored in the `/data` directory.
- Visual examples of data distributions and analyses are in the `/visuals` directory.
This project is licensed under the MIT License. See the LICENSE file for details.