coder77ai/E-Commerce-Data-Analysis-SQL-Project-for-Customer-Behavior-Revenue-Optimization

Customer & Order Analysis SQL Project

A comprehensive SQL project demonstrating advanced SQL concepts including Joins, CTEs (Common Table Expressions), Window Functions, and Aggregations to answer key business questions about customers and orders.

⚡ Quick Start

Want to start immediately without setup? Use free public datasets!

  1. Google BigQuery (Recommended - No setup!):

    • Go to BigQuery Console
    • Open bigquery-public-data.thelook_ecommerce dataset
    • Run queries from 05_bigquery_examples.sql
    • Free tier: 1 TB of query processing per month
  2. Kaggle Datasets: Download e-commerce data and adapt queries (see 00_free_datasets_guide.md)

  3. Mode Analytics: Free SQL tutorial with pre-loaded datasets (see 00_free_datasets_guide.md)

Or use the included sample data; see the "Getting Started" section below.

📋 Project Overview

This project analyzes customer behavior, order patterns, and revenue trends using SQL. It includes sample data and queries that demonstrate various SQL techniques commonly used in data analysis.

🗂️ Project Structure

  • 00_free_datasets_guide.md - Guide to free datasets (Kaggle, Mode Analytics, BigQuery)
  • 01_schema.sql - Database schema creation (tables, indexes)
  • 02_sample_data.sql - Sample data insertion scripts
  • 03_analysis_queries.sql - SQL queries demonstrating core concepts (clean, commented)
  • 04_business_questions.sql - Business-focused analysis queries (clean, commented)
  • 05_bigquery_examples.sql - Queries adapted for Google BigQuery public datasets
  • README.md - This file

🗄️ Database Schema

The project uses four main tables:

  1. customers - Customer information

    • customer_id, first_name, last_name, email, registration_date, city, country
  2. products - Product catalog

    • product_id, product_name, category, price, cost
  3. orders - Order headers

    • order_id, customer_id, order_date, status
  4. order_items - Order line items

    • order_item_id, order_id, product_id, quantity, unit_price
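As a rough illustration of how these four tables relate, the DDL below is a sketch only. The table and column names come from the list above, but every data type, key, and constraint is an assumption; 01_schema.sql remains the authoritative definition.

```sql
-- Hypothetical DDL: names are from the README, types and constraints are assumed.
CREATE TABLE customers (
    customer_id       INTEGER PRIMARY KEY,
    first_name        VARCHAR(50) NOT NULL,
    last_name         VARCHAR(50) NOT NULL,
    email             VARCHAR(100) UNIQUE,
    registration_date DATE,
    city              VARCHAR(50),
    country           VARCHAR(50)
);

CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    category     VARCHAR(50),
    price        NUMERIC(10, 2),  -- list price (assumed meaning)
    cost         NUMERIC(10, 2)   -- unit cost (assumed meaning)
);

CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
    order_date  DATE NOT NULL,
    status      VARCHAR(20)
);

CREATE TABLE order_items (
    order_item_id INTEGER PRIMARY KEY,
    order_id      INTEGER NOT NULL REFERENCES orders (order_id),
    product_id    INTEGER NOT NULL REFERENCES products (product_id),
    quantity      INTEGER NOT NULL,
    unit_price    NUMERIC(10, 2) NOT NULL  -- price charged on this line
);
```

The later sketches in this README assume these types, in particular that order_date is a DATE and that unit_price and cost are NUMERIC.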

🔧 SQL Concepts Demonstrated

1. Joins

  • INNER JOIN - Get orders with customer details
  • LEFT JOIN - Include customers with no orders
  • Multiple Joins - Combine data from multiple tables
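A minimal sketch of the LEFT JOIN case, assuming the customers and orders tables described in the schema section. It is illustrative and not copied from 03_analysis_queries.sql.

```sql
-- Count orders per customer, keeping customers who have never ordered.
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    COUNT(o.order_id) AS order_count  -- counts only matched rows, so 0 when there are no orders
FROM customers AS c
LEFT JOIN orders AS o
    ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY order_count DESC, c.customer_id;
```

Changing LEFT JOIN to INNER JOIN would drop every customer without an order from the result.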

2. CTEs (Common Table Expressions)

  • Simple CTEs for readability
  • Multiple CTEs chained together
  • CTEs with aggregations and calculations
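The sketch below chains two CTEs: the first computes a total per order, the second aggregates those totals per customer. It assumes the schema described above and that unit_price is a NUMERIC column; the CTE names are illustrative.

```sql
WITH order_totals AS (
    -- Step 1: one row per order with its total value
    SELECT
        o.order_id,
        o.customer_id,
        SUM(oi.quantity * oi.unit_price) AS order_total
    FROM orders AS o
    JOIN order_items AS oi
        ON oi.order_id = o.order_id
    GROUP BY o.order_id, o.customer_id
),
customer_totals AS (
    -- Step 2: one row per customer, built on top of step 1
    SELECT
        customer_id,
        COUNT(*)         AS order_count,
        AVG(order_total) AS avg_order_value
    FROM order_totals
    GROUP BY customer_id
)
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    ct.order_count,
    ROUND(ct.avg_order_value, 2) AS avg_order_value
FROM customer_totals AS ct
JOIN customers AS c
    ON c.customer_id = ct.customer_id
ORDER BY avg_order_value DESC;
```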

3. Window Functions

  • ROW_NUMBER() - Number orders sequentially within each customer
  • RANK() & DENSE_RANK() - Product sales rankings
  • LAG() & LEAD() - Compare values across rows
  • PARTITION BY - Calculate averages within groups
  • PERCENT_RANK() & CUME_DIST() - Distribution analysis
  • Running totals - Cumulative calculations
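As a small illustration of several of these functions together, the query below numbers each customer's orders, looks back at the previous order date, and attaches a per-customer order count. It assumes the orders table from the schema section; the window name w is arbitrary.

```sql
SELECT
    o.customer_id,
    o.order_id,
    o.order_date,
    ROW_NUMBER() OVER w                          AS order_seq,            -- 1, 2, 3, ... per customer
    LAG(o.order_date) OVER w                     AS previous_order_date,  -- NULL for a customer's first order
    COUNT(*) OVER (PARTITION BY o.customer_id)   AS orders_by_customer    -- same value on every row of a customer
FROM orders AS o
WINDOW w AS (PARTITION BY o.customer_id ORDER BY o.order_date, o.order_id)
ORDER BY o.customer_id, order_seq;
```

Including order_id in the window ordering keeps the numbering deterministic when a customer places two orders on the same date.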

4. Aggregations

  • SUM, AVG, COUNT, MIN, MAX - Basic aggregations
  • GROUP BY - Group data by categories
  • HAVING - Filter aggregated results
  • Conditional aggregations - CASE expressions inside aggregate functions
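The following sketch combines GROUP BY, HAVING, and a conditional aggregate in one query. It assumes the products and order_items tables described above; the revenue threshold of 100 and the "two or more units" rule are arbitrary values chosen for illustration.

```sql
SELECT
    p.category,
    COUNT(DISTINCT oi.order_id)                           AS orders_with_category,
    SUM(oi.quantity)                                      AS units_sold,
    SUM(oi.quantity * oi.unit_price)                      AS revenue,
    SUM(CASE WHEN oi.quantity >= 2 THEN 1 ELSE 0 END)     AS multi_unit_lines  -- conditional aggregation
FROM order_items AS oi
JOIN products AS p
    ON p.product_id = oi.product_id
GROUP BY p.category
HAVING SUM(oi.quantity * oi.unit_price) > 100  -- illustrative threshold; filters groups, not rows
ORDER BY revenue DESC;
```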

📊 Business Questions Answered

Top Customers

  • Top 10 customers by total revenue
  • Top customers by order frequency
  • Top customers by average order value
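One way the first question might be answered is sketched below, assuming the schema described above. It is not necessarily the query used in 04_business_questions.sql, and it counts every order regardless of status because the README does not list the status values.

```sql
-- Top 10 customers by total revenue (all order statuses included).
SELECT
    c.customer_id,
    c.first_name || ' ' || c.last_name   AS customer_name,
    COUNT(DISTINCT o.order_id)           AS order_count,
    SUM(oi.quantity * oi.unit_price)     AS total_revenue
FROM customers AS c
JOIN orders AS o
    ON o.customer_id = c.customer_id
JOIN order_items AS oi
    ON oi.order_id = o.order_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_revenue DESC
LIMIT 10;
```

COUNT(DISTINCT ...) is needed here because joining to order_items produces one row per line item, not per order.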

Customer Retention

  • Monthly cohort retention rates
  • Repeat customer rate analysis
  • Time between orders (retention patterns)
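A minimal sketch of the repeat customer rate, assuming the orders table described above and defining a repeat customer as anyone with more than one order. That definition is an assumption; the project's own query may use a different one.

```sql
WITH orders_per_customer AS (
    SELECT
        customer_id,
        COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
)
SELECT
    COUNT(*)                                             AS customers_with_orders,
    SUM(CASE WHEN order_count > 1 THEN 1 ELSE 0 END)     AS repeat_customers,
    ROUND(
        100.0 * SUM(CASE WHEN order_count > 1 THEN 1 ELSE 0 END)
        / NULLIF(COUNT(*), 0),  -- avoids division by zero on an empty table
        1
    )                                                    AS repeat_rate_pct
FROM orders_per_customer;
```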

Monthly Revenue

  • Monthly revenue trends
  • Month-over-month growth rates
  • Revenue by product category
  • Cumulative revenue (YTD)
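Month-over-month growth and year-to-date revenue can both be derived from a monthly total with window functions. The sketch below assumes the schema described above and PostgreSQL's DATE_TRUNC; the column aliases are illustrative.

```sql
WITH monthly_revenue AS (
    SELECT
        DATE_TRUNC('month', o.order_date)   AS revenue_month,
        SUM(oi.quantity * oi.unit_price)    AS revenue
    FROM orders AS o
    JOIN order_items AS oi
        ON oi.order_id = o.order_id
    GROUP BY DATE_TRUNC('month', o.order_date)
)
SELECT
    revenue_month,
    revenue,
    LAG(revenue) OVER (ORDER BY revenue_month) AS previous_revenue,
    ROUND(
        100.0 * (revenue - LAG(revenue) OVER (ORDER BY revenue_month))
        / NULLIF(LAG(revenue) OVER (ORDER BY revenue_month), 0),
        1
    ) AS mom_growth_pct,  -- NULL for the first month, which has no predecessor
    SUM(revenue) OVER (
        PARTITION BY DATE_TRUNC('year', revenue_month)
        ORDER BY revenue_month
    ) AS ytd_revenue      -- running total that restarts each calendar year
FROM monthly_revenue
ORDER BY revenue_month;
```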

Additional Insights

  • Product performance analysis
  • Customer acquisition analysis
  • Profit margin calculations
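For the profit margin calculation, a sketch under two assumptions: products.cost is the cost of one unit, and margin is defined as gross profit divided by revenue. Neither is confirmed by the README.

```sql
SELECT
    p.product_id,
    p.product_name,
    SUM(oi.quantity * oi.unit_price)               AS revenue,
    SUM(oi.quantity * (oi.unit_price - p.cost))    AS gross_profit,
    ROUND(
        100.0 * SUM(oi.quantity * (oi.unit_price - p.cost))
        / NULLIF(SUM(oi.quantity * oi.unit_price), 0),  -- guards against zero revenue
        1
    )                                              AS margin_pct
FROM products AS p
JOIN order_items AS oi
    ON oi.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY gross_profit DESC;
```

Products that have never been sold do not appear, because the inner join requires at least one order line.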

🚀 Getting Started

Option 1: Use Sample Data (Local Database)

Prerequisites

  • SQL database system (PostgreSQL, MySQL, SQL Server, SQLite, etc.)
  • SQL client or command-line tool

Setup Instructions

  1. Create the database schema:

    -- Run 01_schema.sql to create tables
  2. Insert sample data:

    -- Run 02_sample_data.sql to populate tables
  3. Run analysis queries:

    -- Run 03_analysis_queries.sql for concept demonstrations
    -- Run 04_business_questions.sql for business insights

Option 2: Use Free Public Datasets (No Setup Required!)

Google BigQuery (Recommended - No Setup!)

  1. Go to BigQuery Console
  2. Open bigquery-public-data.thelook_ecommerce dataset
  3. Run queries from 05_bigquery_examples.sql
  4. Free tier: 1 TB of query processing per month
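As a first query to try in the BigQuery console, the sketch below computes monthly order counts and revenue. It is written in BigQuery Standard SQL and assumes the order_items table exposes order_id, created_at, and sale_price columns; check the dataset's schema tab before relying on those names. It is not taken from 05_bigquery_examples.sql.

```sql
-- BigQuery Standard SQL; column names are assumptions to verify against the dataset schema.
SELECT
  DATE_TRUNC(DATE(created_at), MONTH)   AS revenue_month,
  COUNT(DISTINCT order_id)              AS orders,
  ROUND(SUM(sale_price), 2)             AS revenue
FROM `bigquery-public-data.thelook_ecommerce.order_items`
GROUP BY revenue_month
ORDER BY revenue_month;
```

Note the differences from the PostgreSQL dialect used elsewhere in this project: the date part comes second in DATE_TRUNC and is an unquoted keyword, and table names are wrapped in backticks.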

Kaggle Datasets

  1. Download e-commerce datasets from Kaggle
  2. Import CSV files into your database
  3. Adapt queries from 03_analysis_queries.sql to match your dataset

Mode Analytics

  1. Sign up for a free Mode Analytics account
  2. Access pre-loaded tutorial datasets
  3. Adapt queries to Mode's dataset structure

See 00_free_datasets_guide.md for detailed instructions!

Database Compatibility

The SQL syntax is written for PostgreSQL. For other databases, you may need to adjust:

  • MySQL: Replace DATE_TRUNC('month', date) with DATE_FORMAT(date, '%Y-%m-01'). MySQL's DATEDIFF takes only two arguments, so write DATEDIFF(date2, date1) instead of DATEDIFF(day, date1, date2)
  • SQL Server: Replace DATE_TRUNC('month', date) with DATETRUNC(month, date) (available in SQL Server 2022 and later). DATEDIFF(day, date1, date2) is native SQL Server syntax and needs no change
  • SQLite: Replace DATE_TRUNC with strftime('%Y-%m', date) || '-01' and DATEDIFF with julianday(date2) - julianday(date1)
  • String concatenation: Replace || with CONCAT() for MySQL
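To make the dialect differences concrete, the sketch below shows one month-bucketing and day-difference query in PostgreSQL, with rough equivalents for the other engines as comments. It assumes orders.order_date is a DATE column (stored as ISO-8601 text in SQLite); the aliases are illustrative.

```sql
-- PostgreSQL (the dialect this project targets)
SELECT
    o.order_id,
    DATE_TRUNC('month', o.order_date)   AS order_month,
    CURRENT_DATE - o.order_date         AS days_since_order  -- subtracting two dates yields whole days
FROM orders AS o;

-- MySQL:
--   SELECT order_id, DATE_FORMAT(order_date, '%Y-%m-01'), DATEDIFF(CURRENT_DATE, order_date) FROM orders;
-- SQL Server 2022 or later:
--   SELECT order_id, DATETRUNC(month, order_date), DATEDIFF(day, order_date, GETDATE()) FROM orders;
-- SQLite (julianday differences can be fractional):
--   SELECT order_id, strftime('%Y-%m-01', order_date), julianday('now') - julianday(order_date) FROM orders;
```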

📈 Sample Queries

Find Top Customers

-- See Q1 in 04_business_questions.sql

Calculate Monthly Revenue

-- See Q7 in 04_business_questions.sql

Customer Retention Analysis

-- See Q4 in 04_business_questions.sql

🎯 Learning Objectives

After working through this project, you will understand:

  • How to use different types of JOINs effectively
  • When and how to use CTEs for complex queries
  • How window functions can provide powerful analytical capabilities
  • How to aggregate data for business insights
  • How to answer common business questions with SQL

📝 Notes

  • The sample data includes 12 customers, 10 products, and 25 orders
  • Data spans from January 2023 to July 2023
  • All queries are designed to be educational and demonstrate best practices

🔍 Key Features

✅ Multiple JOIN types demonstrated
✅ CTEs for complex query organization
✅ Comprehensive window function examples
✅ Various aggregation techniques
✅ Real-world business question solutions
✅ Clean, well-commented SQL queries (every query includes purpose and explanation)
✅ Free dataset integration (Kaggle, Mode Analytics, Google BigQuery)
✅ BigQuery-specific examples included
