Commit bcc6737

changes to create demo project for dbt-model-erd
1 parent d19a36c commit bcc6737

33 files changed, +2324 -0 lines changed

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -134,6 +134,8 @@ dbt_modules/
 logs/
 #profiles.yml
 dbt_packages/
+*.duckdb
+*.duckdb.wal
 
 # Passwords
 *.pem
@@ -600,3 +602,6 @@ package-lock.json
 .node_install*
 temp/
 artifacts/
+
+# Claude Code
+.claude

README.md

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@
 - [dbt Materialize with Kafka](https://www.entechlog.com/blog/data/how-to-setup-dbt-for-materialize-db)
 - [dbt Materialize with Redpanda](https://www.entechlog.com/blog/data/how-to-setup-dbt-for-materialize-db-with-streaming-data-from-redpanda)
 
+## Examples
+- [dbt-model-erd Example](dbt-erd/) - Comprehensive example demonstrating automatic ERD generation for dbt models
+
 ## Notes
 ### Time sync issue fix
 
dbt-erd/Dockerfile

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    git \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /usr/app/dbt

# Copy requirements file
COPY requirements.txt /tmp/requirements.txt

# Install dbt and related packages
RUN pip install --no-cache-dir -r /tmp/requirements.txt

# Copy dbt project files
COPY . /usr/app/dbt/

# Set environment variable for dbt profiles
ENV DBT_PROFILES_DIR=/usr/app/dbt

# Expose port for dbt docs
EXPOSE 8080

# Default command
CMD ["bash"]

dbt-erd/README.md

Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
# dbt-model-erd Example

A comprehensive example demonstrating how to use [dbt-model-erd](https://github.com/entechlog/dbt-model-erd) to automatically generate Entity Relationship Diagrams (ERDs) for your dbt models.

## 📋 What This Example Demonstrates

This example showcases:

- **Automatic ERD Generation**: Generate interactive ERD diagrams from dbt model relationships
- **Best Practices**: Follow standard data warehouse patterns (staging → prep → dw layers)
- **Real-World Schema**: E-commerce data model with customers, products, orders, and sales
- **Relationship Detection**: Automatically detect fact-dimension relationships using `ref()` statements
- **Interactive Diagrams**: View diagrams as interactive HTML files using Mermaid.js

## 🏗️ Project Structure

```
dbt-erd/
├── models/
│   ├── prep/                     # Staging/preparation layer
│   │   ├── dim/
│   │   │   ├── prep__dim_customer.sql
│   │   │   └── prep__dim_product.sql
│   │   └── fact/
│   │       └── prep__fact_order_items.sql
│   └── dw/                       # Data warehouse layer
│       ├── dim/
│       │   ├── dim_customer.sql
│       │   ├── dim_product.sql
│       │   ├── dim_date.sql      # Generated using date spine
│       │   └── schema.yml
│       └── fact/
│           ├── fact_orders.sql   # Aggregated from order items
│           ├── fact_sales.sql    # Line-item level sales
│           └── schema.yml
├── seeds/
│   ├── seed_customers.csv
│   ├── seed_products.csv
│   └── seed_order_items.csv
├── assets/img/                   # Generated ERD diagrams
└── dbt_project.yml
```

## 🎯 Data Model Overview

### Dimension Tables
- **dim_customer**: Customer master data with segments (Standard, Premium, VIP)
- **dim_product**: Product catalog with pricing, costs, and margin calculations
- **dim_date**: Date dimension from 1900-2100 generated using date spine logic

### Fact Tables
- **fact_orders**: Order-level aggregations (order totals, discounts, item counts)
- **fact_sales**: Line-item level sales with profit analysis

## 🚀 Quick Start (One Command!)

### Prerequisites
- [Docker](https://www.docker.com/get-started) installed (that's it!)

### Run Everything

```bash
# Clone the repo
git clone https://github.com/entechlog/dbt-examples.git
cd dbt-examples/dbt-erd

# Start the demo (builds, runs models, generates docs & ERDs, serves docs)
docker-compose up
```

That's it! The container will automatically:
1. ✓ Install dbt packages
2. ✓ Load seed data (customers, products, orders)
3. ✓ Run all dbt models (prep → dw layers)
4. ✓ Run dbt tests
5. ✓ Generate ERD diagrams
6. ✓ Generate dbt documentation
7. ✓ Start dbt docs server on http://localhost:8080

### View the Results

**dbt Documentation**
- Open http://localhost:8080 in your browser
- Explore lineage graph, model definitions, and column descriptions

**ERD Diagrams**
- Open `assets/img/models/dw/fact/fact_orders_model.html`
- Open `assets/img/models/dw/fact/fact_sales_model.html`
- View relationships between fact and dimension tables

### Stop the Demo

```bash
# Press Ctrl+C in the terminal, then:
docker-compose down
```

## 📊 Example ERD Diagrams

### fact_sales ERD
Shows relationships between:
- `dim_date` and `fact_sales` (via `sale_date_id`)
- `dim_product` and `fact_sales` (via `product_id`)
- `dim_customer` and `fact_sales` (via `customer_id`)

### fact_orders ERD
Shows relationships between:
- `dim_date` and `fact_orders` (via `order_date_id`)

## ⚙️ How dbt-model-erd Works

1. **Scans SQL Files**: Parses your dbt models to find `ref()` statements
2. **Reads Schema Files**: Extracts column definitions and relationships from `schema.yml`
3. **Detects Relationships**: Identifies foreign key relationships through:
   - Column naming patterns (e.g., `*_id`, `*_key`)
   - Relationship tests in schema.yml
4. **Generates Mermaid**: Creates Mermaid ER diagrams with proper cardinality
5. **Creates HTML**: Wraps diagrams in interactive HTML with Mermaid.js
6. **Updates Documentation**: Adds diagram links to your schema.yml files
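
The first and fourth steps above can be illustrated with a minimal Python sketch. This is not dbt-model-erd's actual implementation: the regular expression, the `find_refs` and `to_mermaid` helpers, the sample SQL, and the `dim_` prefix filter are all assumptions made for illustration. Only the `ref()` call shape and the Mermaid `erDiagram` relationship syntax are standard.

```python
import re

# Hypothetical pattern: matches {{ ref('model') }} and {{ ref("model") }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> list[str]:
    """Return referenced model names in first-seen order, without duplicates."""
    seen: list[str] = []
    for name in REF_PATTERN.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def to_mermaid(fact: str, dims: list[str]) -> str:
    """Render one fact table and its dimensions as a Mermaid erDiagram."""
    lines = ["erDiagram"]
    for dim in dims:
        # "||--o{" reads as: one dimension row relates to zero or more fact rows.
        lines.append(f"    {dim} ||--o{{ {fact} : has")
    return "\n".join(lines)


# Illustrative SQL only; the real fact_sales.sql in this commit may differ.
sample_sql = """
select *
from {{ ref('prep__fact_order_items') }} as f
left join {{ ref('dim_product') }} as p on f.product_id = p.product_id
left join {{ ref("dim_customer") }} as c on f.customer_id = c.customer_id
"""

refs = find_refs(sample_sql)
dims = [r for r in refs if r.startswith("dim_")]
diagram = to_mermaid("fact_sales", dims)
print(diagram)
```

Treating every `dim_`-prefixed reference as a dimension is a naming-convention shortcut; a real tool would more likely combine it with the relationship tests in `schema.yml`.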

## 🔧 Configuration Options

Create a custom `erd_config.yml`:

```yaml
# Mermaid theme
theme: default  # Options: default, neutral, forest, dark

# Diagram direction
direction: LR  # LR (left-right) or TB (top-bottom)

# Column display
show_all_columns: true
max_columns: 10

# Output paths
output_dir: assets/img
mermaid_extension: .mmd
html_extension: .html
```

## 💡 How It Works

This example uses **DuckDB** - an embedded database that requires no separate installation or server. Everything runs inside Docker:

- **No database setup needed** - DuckDB is file-based (like SQLite)
- **No configuration** - profiles.yml points to local DuckDB file
- **Fully isolated** - Runs in Docker container
- **Real dbt workflow** - Actual seeds, models, tests, docs generation
- **ERD generation** - Automatic relationship detection from schema.yml

## 📦 Key Features Demonstrated

### 1. **Layered Architecture**
- **Prep Layer**: Standardization, deduplication using CTEs
- **DW Layer**: Final dimensional models with business logic

### 2. **Date Dimension Pattern**
```sql
{{ dbt_utils.date_spine(
    datepart="day",
    start_date="cast('1900-01-01' as date)",
    end_date="cast('2100-12-31' as date)"
) }}
```
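
To make the date spine idea concrete, here is a small Python sketch that produces one row per day over a half-open range. The `date_spine` function is a hypothetical stand-in, not the macro's code. It assumes the behavior described in the dbt_utils documentation, where the spine includes `start_date` but not `end_date`; that is worth confirming against the installed package version.

```python
from datetime import date, timedelta


def date_spine(start: date, end: date) -> list[date]:
    """Return one date per day from start up to, but not including, end."""
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days)]


# A one-week range keeps the output easy to check by eye.
spine = date_spine(date(2024, 1, 1), date(2024, 1, 8))
print(spine[0], spine[-1], len(spine))
```

If the exclusive end holds, a spine that should cover all of 2100 would need an `end_date` of `2101-01-01`.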

### 3. **Surrogate Key Generation**
```sql
{{ dbt_utils.generate_surrogate_key(['customer_key']) }} AS customer_id
```
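
Conceptually, a surrogate key of this kind is a hash over the input columns. The Python sketch below shows the general idea under stated assumptions: it joins the values with `-`, substitutes a placeholder for nulls, and takes an MD5 digest. The `surrogate_key` function and `NULL_PLACEHOLDER` constant are illustrative names. The separator and placeholder are recalled from dbt_utils defaults and may differ by version.

```python
import hashlib
from typing import Optional, Sequence

# Assumed stand-in for NULL inputs; check the dbt_utils source for the real value.
NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"


def surrogate_key(values: Sequence[Optional[object]]) -> str:
    """Hash the stringified values into a 32-character hexadecimal key."""
    parts = [NULL_PLACEHOLDER if v is None else str(v) for v in values]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


# "C001" is a made-up customer_key value.
key = surrogate_key(["C001"])
print(key)
```

Python's `str()` will not always match how a warehouse casts numbers or dates to text, so keys from this sketch should not be expected to equal the ones dbt produces.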

### 4. **Relationship Tests**
```yaml
- name: product_id
  tests:
    - relationships:
        to: ref('dim_product')
        field: product_id
```
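
A relationship test like the one above carries enough information to draw an ERD edge. The following Python sketch walks an already-parsed `schema.yml` structure and collects those edges. It is a guess at the general approach, not dbt-model-erd's code: the `schema` dictionary (including the `sales_id` column), the `relationship_edges` helper, and the printed Mermaid line are all assumptions.

```python
import re

# Hypothetical result of parsing a schema.yml file into Python objects.
schema = {
    "models": [
        {
            "name": "fact_sales",
            "columns": [
                {"name": "sales_id"},
                {
                    "name": "product_id",
                    "tests": [
                        "not_null",
                        {"relationships": {"to": "ref('dim_product')", "field": "product_id"}},
                    ],
                },
            ],
        }
    ]
}

# Pulls the model name out of a string such as "ref('dim_product')".
REF_TARGET = re.compile(r"ref\(\s*['\"]([^'\"]+)['\"]\s*\)")


def relationship_edges(parsed: dict) -> list[tuple[str, str, str, str]]:
    """Collect (child_model, child_column, parent_model, parent_column) tuples."""
    edges = []
    for model in parsed.get("models", []):
        for column in model.get("columns", []):
            for test in column.get("tests", []):
                # Tests are either plain strings ("not_null") or single-key dicts.
                if not isinstance(test, dict) or "relationships" not in test:
                    continue
                rel = test["relationships"]
                match = REF_TARGET.search(rel["to"])
                if match:
                    edges.append((model["name"], column["name"], match.group(1), rel["field"]))
    return edges


edges = relationship_edges(schema)
for child, child_col, parent, parent_col in edges:
    print(f"{parent} ||--o{{ {child} : {child_col}")
```

Newer dbt versions may nest test arguments differently or use a `data_tests` key, so a production parser would need to handle more shapes than this sketch does.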

## 🔗 Links & Resources

- **dbt-model-erd Repository**: https://github.com/entechlog/dbt-model-erd
- **Main dbt Examples**: https://github.com/entechlog/dbt-examples
- **dbt Documentation**: https://docs.getdbt.com/
- **Mermaid.js**: https://mermaid.js.org/

## 🎓 What You'll Learn

This example demonstrates:

1. **dbt-model-erd Usage** - How to automatically generate ERDs from dbt models
2. **DuckDB with dbt** - Using an embedded database for local development
3. **Docker Workflows** - One-command setup for reproducible environments
4. **Data Modeling Patterns** - Dimensional modeling (facts & dimensions)
5. **dbt Best Practices** - Layered architecture, testing, documentation

## 📝 Next Steps

1. **Extend the Model**: Add more dimensions (stores, sales reps, regions)
2. **Add More Facts**: Create shipping, inventory, or return fact tables
3. **Customize Diagrams**: Modify erd_config.yml for different themes
4. **Integrate CI/CD**: Add ERD generation to your deployment pipeline
5. **Embed in Docs**: Include diagrams in dbt documentation
6. **Publish to Pages**: Add to your existing GitHub Pages site as a subfolder

## 🤝 Contributing

Found an issue or have a suggestion? Please open an issue in the [dbt-model-erd repository](https://github.com/entechlog/dbt-model-erd/issues).

## 📄 License

This example is part of the dbt-examples repository and follows the same license.

---

**Generated with** ❤️ **using [dbt-model-erd](https://github.com/entechlog/dbt-model-erd)**

dbt-erd/assets/.gitkeep

Whitespace-only changes.
