Commit 053b2c3

Author: roller100 (BearingNode)
Address PR OpenLineage#186 review feedback
1 parent 444e0ac commit 053b2c3

33 files changed: 369 additions, 961 deletions

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -170,10 +170,22 @@ ignored/
 bin/
 
 # OpenLineage event files generated during local testing
+openlineage_events.json
 openlineage_events.jsonl
+*/openlineage_events.json
 */openlineage_events.jsonl
+**/events/openlineage_events.json
 **/events/openlineage_events.jsonl
 
+# Test output files (keep directory structure, ignore contents)
+producer/dbt/test_output/*
+!producer/dbt/test_output/.gitkeep
+
+# Auto-generated report files (generated by CI/CD)
+*_producer_report.json
+*_consumer_report.json
+generated-files/report.json
+
 # Virtual environments
 venv/
 test_venv/

producer/dbt/README.md

Lines changed: 23 additions & 100 deletions
@@ -118,25 +118,16 @@ The GitHub Actions workflow:
 
 ---
 
-### Running Tests Locally (Development & Debugging)
+### Local Debugging (Optional)
 
-**Use this approach for iterative development, debugging, and testing changes before pushing to GitHub.**
+**For development debugging, you may optionally run PostgreSQL locally. The standard test environment is GitHub Actions.**
 
-Local testing provides:
-- Faster feedback loops for development
-- Direct access to event files and logs
-- Ability to inspect database state
-- Control over specific test scenarios
+If you need to debug event generation locally:
 
-#### Prerequisites
-
-1. **Start PostgreSQL Container**:
+1. **Start PostgreSQL (Optional)**:
 ```bash
-# From the producer/dbt/ directory
-docker-compose up -d
-
-# Verify container is healthy
-docker-compose ps
+# Quick one-liner for debugging
+docker run -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:15-alpine
 ```
 
 2. **Install Python Dependencies**:
@@ -159,95 +150,27 @@ Local testing provides:
 pip install openlineage-dbt
 ```
 
-5. **Verify dbt Connection**:
+3. **Run Test Scenario**:
 ```bash
-cd runner/
-dbt debug
-cd ..
-```
-
-#### Local Execution Options
-
-**Option 1: Using the Test Runner CLI (Recommended)**
-
-The test runner CLI provides the same orchestration used in GitHub Actions:
-
-```bash
-# Run a specific scenario
-python test_runner/cli.py run-scenario \
---scenario csv_to_postgres_local \
---output-dir ./test_output/$(date +%s)
-
-# List available scenarios
-python test_runner/cli.py list-scenarios
-```
-
-**Option 2: Direct dbt-ol Execution (For debugging)**
-
-For fine-grained control and debugging, run `dbt-ol` commands directly:
-
-```bash
-cd runner/
-
-# Generate events for seed operation
-dbt-ol seed
-
-# Generate events for model execution
-dbt-ol run
-
-# Generate events for tests
-dbt-ol test
-
-# Inspect generated events
-cat ../events/openlineage_events.jsonl | jq '.'
-```
+# Using the test runner CLI (same as GitHub Actions uses)
+python test_runner/cli.py run-scenario \
+--scenario csv_to_postgres_local \
+--output-dir ./test_output/$(date +%s)
 
-**Option 3: Legacy Shell Script (Deprecated)**
-
-The `run_dbt_tests.sh` script is deprecated but still available:
-
-```bash
-./run_dbt_tests.sh \
---openlineage-directory /path/to/OpenLineage \
---producer-output-events-dir ./events \
---openlineage-release 2-0-2 \
---report-path ./dbt_report.json
-```
-
-#### Local vs. GitHub Actions: Key Differences
-
-| Aspect | Local Testing | GitHub Actions |
-|--------|---------------|----------------|
-| **Database** | Docker Compose (manual start) | PostgreSQL service container (auto-provisioned) |
-| **Environment** | Uses local environment variables from `profiles.yml` | Uses workflow-defined environment variables |
-| **Event Output** | Writes to `events/openlineage_events.jsonl` by default | Writes to temporary directory defined by workflow |
-| **Validation** | Manual inspection or via test runner CLI | Automated validation against OpenLineage schemas |
-| **Use Case** | Development, debugging, local verification | CI/CD, PR validation, compatibility reporting |
-| **Cleanup** | Manual (`docker-compose down -v`) | Automatic container cleanup |
-
-#### Cleaning Up Local Environment
-
-```bash
-# Stop PostgreSQL container
-docker-compose down
-
-# Remove PostgreSQL data volume (clean slate)
-docker-compose down -v
-
-# Remove generated event files
-rm -rf events/*.jsonl test_output/
-```
-
----
-
-### Command-Line Arguments (Legacy Script)
+# List available scenarios
+python test_runner/cli.py list-scenarios
+```
 
-For the deprecated `run_dbt_tests.sh` script:
+4. **Inspect Generated Events**:
+```bash
+# View events
+cat events/openlineage_events.jsonl | jq '.'
+
+# Or check test output directory
+ls -la test_output/
+```
 
-- `--openlineage-directory` (**Required**): Path to a local clone of the OpenLineage repository
-- `--producer-output-events-dir`: Directory for generated OpenLineage events (Default: `events/`)
-- `--openlineage-release`: OpenLineage release version to validate against (Default: `2-0-2`)
-- `--report-path`: Path for the final JSON test report (Default: `../dbt_producer_report.json`)
+**Note**: Local debugging is entirely optional. All official validation happens in GitHub Actions with PostgreSQL service containers. The test runner CLI (`cli.py`) is the same code used by CI/CD, ensuring consistency.
 
 ## Important dbt Integration Notes
253176

producer/dbt/docker-compose.yml

Lines changed: 0 additions & 23 deletions
This file was deleted.
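
Reviewer note: deleting `docker-compose.yml` also drops the README's "Verify container is healthy" step (`docker-compose ps`), and the replacement `docker run` one-liner has no readiness check of its own. One way to avoid racing the container is to poll the published port before running a scenario. The sketch below is an assumption-laden illustration, not code from this repository: `wait_for_port` is a hypothetical helper, and `localhost:5432` is taken from the `-p 5432:5432` flag in the README. A successful TCP connection only shows that something is listening; it does not prove PostgreSQL has finished initializing.

```python
import socket
import time


def wait_for_port(host="localhost", port=5432, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it checks reachability only, not database readiness.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError (e.g. connection refused) on failure
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # back off briefly before the next attempt
    return False
```

For a stricter check, PostgreSQL ships the `pg_isready` utility, which reports whether the server is accepting connections.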
