This repository contains the dbt (Data Build Tool) transformation layer for a company's e-commerce data pipeline. It takes raw transactional data and transforms it into a structured, analytics-ready format following industry-standard data modeling practices.
The Cow-Jacket-Automation project serves as the processing engine of an e-commerce data stack. It focuses on converting raw ingestion tables into a clean star schema, enabling business intelligence and advanced analytics.
By utilizing dbt, this project ensures that all data transformations are version-controlled, documented, and rigorously tested before reaching the production warehouse.
- Business Logic: Calculating loyalty points and order metrics.
- Reliability: Enforcing data integrity with custom singular tests.
- Testing: Writing tests to check that the data conforms to the company's required data standards.
This layer is stored as a view in the data warehouse and acts as the entry point, with key features like:
- Selecting required columns
- Generic tests in the _Schema.yml file
- Table documentation in the _source.yml file
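As an illustration of the column selection described above, a model in this layer might look like the sketch below. The file name, the source name `raw`, the table `orders`, and the column names are hypothetical and are not taken from this repository; only the `source()` and `config()` calls are standard dbt.

```sql
-- models/staging/stg_orders.sql (hypothetical file name)
-- Stored as a view, as described for this layer.
{{ config(materialized='view') }}

select
    -- Only the columns needed downstream; names are assumed for illustration.
    order_id,
    customer_id,
    order_date,
    total_amount
from {{ source('raw', 'orders') }}  -- assumed source defined in _source.yml
```

Keeping the selection explicit (rather than `select *`) means a new or renamed column in the raw table does not silently change downstream models.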
This layer contains data transformations such as joins. It is stored as views. Key features include:
- int_customer_order_histroy.sql: Tracks each customer's order history over time.
- int_customer_order_summary.sql: Summarizes the orders (i.e., order_items) made by customers.
- int_sumary_cust_loyalty.sql: Investigates customer loyalty, i.e., the sum of loyalty_points acquired over time.
- int_revenue_categorization.sql: Shows the distribution of revenue generated by the products ordered.
- int_loyalty_points_source_grouping.sql: Investigates the sources of customers' loyalty points (Referred, Promotion, Ordered), with the lowest points displayed at the top and the highest at the bottom.
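To make the loyalty summary concrete, here is a minimal sketch of what a model like int_sumary_cust_loyalty.sql could contain. It is not the repository's actual query: the upstream model name `stg_loyalty_points` and the `customer_id` column are assumptions; `loyalty_points` is the column named above.

```sql
-- Sketch of an intermediate model that sums loyalty points per customer.
-- Stored as a view, matching the description of this layer.
{{ config(materialized='view') }}

select
    customer_id,                                  -- assumed key column
    sum(loyalty_points) as total_loyalty_points   -- points acquired over time
from {{ ref('stg_loyalty_points') }}              -- hypothetical upstream model
group by customer_id
```

Using `ref()` rather than a hard-coded table name lets dbt infer the dependency graph and build the upstream model first.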
This is the "Gold" layer, where business-ready data is stored. Data is stored as tables.
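A model in this layer can be materialized as a table with a `config()` block at the top of the file. The sketch below assumes a hypothetical file name and simply builds on int_customer_order_summary from the previous layer; the real models may select and join differently.

```sql
-- models/marts/customer_orders.sql (hypothetical file name)
-- materialized='table' makes dbt build a physical table instead of a view.
{{ config(materialized='table') }}

select *
from {{ ref('int_customer_order_summary') }}
```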
- Custom Data Quality Tests: Includes specialized singular tests, such as assert_positive_numerical_values, which prevent negative values in critical financial columns (e.g., price, quantity, or points).
- Modular SQL: Leveraging Jinja templates and macros to keep the codebase DRY (Don't Repeat Yourself) and maintainable.
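A singular test in dbt is a SQL file whose query returns the rows that violate a rule; the test passes when the query returns no rows. The sketch below shows how a test like assert_positive_numerical_values might be written. The model name `stg_order_items` and the `order_id` column are assumptions, and the actual test in this repository may check different models or columns.

```sql
-- tests/assert_positive_numerical_values.sql (contents assumed for illustration)
-- Returns every row with a negative value; any returned row fails the test.
select
    order_id,   -- assumed identifier column
    price,
    quantity
from {{ ref('stg_order_items') }}  -- hypothetical model name
where price < 0
   or quantity < 0
```

Running `dbt test` executes this query along with the generic tests declared in the YAML files.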
- dbt-core (installed via pip)
- A supported data warehouse (e.g., Snowflake, BigQuery, or PostgreSQL) with the required databases and schemas already created
- Clone the Repo:
    git clone https://github.com/Human-Gechi/Cow-Jacket-Automation-Using-DBT.git
    cd Cow-Jacket-Automation-Using-DBT