| 1 | +# Copilot Instructions for Dataform Reservation Package |
| 2 | + |
| 3 | +This document captures key learnings, debugging strategies, and architectural nuances discovered during the development of the `@masthead-data/dataform-package`. |
| 4 | + |
| 5 | +## How to Debug |
| 6 | + |
| 7 | +### 1. Tracing Compilation |
| 8 | +Dataform executes JavaScript during the compilation phase. To trace what's happening: |
| 9 | +- Use `console.error()` for debug logs. This ensures logs go to `stderr` and don't corrupt the JSON output redirected to a file. |
| 10 | +- Avoid `console.log()` inside Dataform definitions if you plan to pipe the output to a JSON parser, as it may inject plain text into the JSON stream. |
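A minimal sketch of a stderr-only debug helper that could sit in a definitions file while tracing compilation. The names `formatDebug` and `debugLog` and the `[reservation-debug]` tag are hypothetical and not part of the package.

```javascript
// Hypothetical helpers; neither name is exported by the package.
function formatDebug(label, value) {
  // JSON.stringify keeps objects readable on a single line.
  return `[reservation-debug] ${label}: ${JSON.stringify(value)}`;
}

function debugLog(label, value) {
  // console.error writes to stderr, so stdout stays valid JSON
  // when `dataform compile --json` is redirected to a file.
  console.error(formatDebug(label, value));
}

debugLog("action", { name: "my_table", type: "table" });
```

Keeping the formatting separate from the write makes the helper easy to unit test without capturing stderr.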
| 11 | + |
| 12 | +### 2. Inspecting the Graph |
| 13 | +To see the final state of all actions: |
| 14 | +```bash |
| 15 | +cd test-project |
| 16 | +npx @dataform/cli compile --json > compiled.json |
| 17 | +``` |
| 18 | +Inspect the `tables`, `operations`, and `assertions` arrays in the resulting JSON. Check `preOps` and `queries` for the injected `SET @@reservation` statements. |
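The inspection described above can be sketched as a small function over the parsed JSON. It relies on the `tables`, `operations`, and `assertions` arrays and the `preOps` and `queries` fields named here; `target.name` and the assertion `query` field are assumptions about the compiled output, and `summarizeReservations` is a hypothetical name.

```javascript
// Sketch only: `target.name` and `assertion.query` are assumed field names.
const RESERVATION_PREFIX = "SET @@reservation";

function actionName(action) {
  return action.target && action.target.name ? action.target.name : "(unknown)";
}

function startsWithReservation(statements) {
  // True only when the first statement in the list is the reservation SET.
  const first =
    Array.isArray(statements) && statements.length > 0 ? String(statements[0]) : "";
  return first.trim().startsWith(RESERVATION_PREFIX);
}

function summarizeReservations(graph) {
  const rows = [];
  for (const table of graph.tables || []) {
    rows.push({
      kind: "table",
      name: actionName(table),
      hasReservation: startsWithReservation(table.preOps),
    });
  }
  for (const operation of graph.operations || []) {
    rows.push({
      kind: "operation",
      name: actionName(operation),
      hasReservation: startsWithReservation(operation.queries),
    });
  }
  for (const assertion of graph.assertions || []) {
    // Assertions are expected to carry no reservation statement at all.
    rows.push({
      kind: "assertion",
      name: actionName(assertion),
      hasReservation: String(assertion.query || "").includes(RESERVATION_PREFIX),
    });
  }
  return rows;
}
```

To use it, parse `compiled.json` with `JSON.parse` after reading it with Node's `fs.readFileSync`, pass the result to `summarizeReservations`, and print the rows with `console.table`.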
| 19 | + |
| 20 | +### 3. Verification Script |
| 21 | +Use the provided verification script to check invariants: |
| 22 | +```bash |
| 23 | +node scripts/verify_compilation.js |
| 24 | +``` |
| 25 | +This script validates that reservations are prepended and that assertions are skipped. |
| 26 | + |
| 27 | +## Testing Configuration |
| 28 | + |
| 29 | +### Local Integration Testing |
| 30 | +The `test-project` is configured to use the local version of the package. In `test-project/package.json`: |
| 31 | +```json |
| 32 | +"dependencies": { |
| 33 | + "@masthead-data/dataform-package": "file:../" |
| 34 | +} |
| 35 | +``` |
| 36 | +**Note:** `npm ci` or `npm install` in the `test-project` may cache the local package. If changes to `index.js` are not reflected, force an update (for example, remove `test-project/node_modules` and reinstall) or avoid `npm ci` during rapid iteration. |
| 37 | + |
| 38 | +### Running Tests |
| 39 | + |
| 40 | +#### Matrix Testing (Default) |
| 41 | +Run from the root to test all supported versions: |
| 42 | +```bash |
| 43 | +npm test |
| 44 | +``` |
| 45 | +This automatically runs matrix tests across v2.4.2 and latest v3.X.X versions, managing config file conflicts. |
| 46 | + |
| 47 | +#### Single Version (Fast Iteration) |
| 48 | +For rapid development on the current version: |
| 49 | +```bash |
| 50 | +npm run test:single |
| 51 | +``` |
| 52 | +This runs: |
| 53 | +1. `jest`: Unit tests for helper functions |
| 54 | +2. `dataform compile`: Generates the actual project graph |
| 55 | +3. `verify_compilation.js`: In-depth JSON inspection |
| 56 | + |
| 57 | +#### Specific Version |
| 58 | +Test a single Dataform version: |
| 59 | +```bash |
| 60 | +npm test -- 2.4.2 |
| 61 | +``` |
| 62 | + |
| 63 | +**Note:** Matrix tests handle `dataform.json` (v2) vs `workflow_settings.yaml` (v3) conflicts automatically with cleanup traps. |
| 64 | + |
| 65 | +**CI Integration:** GitHub Actions runs matrix tests on every PR. |
| 66 | + |
| 67 | +## Package Architecture |
| 68 | + |
| 69 | +### Exported Methods |
| 70 | +1. **`autoAssignActions(config)`** - Primary method: global monkeypatch of `publish()`, `operate()`, `assert()` and `sqlxAction()` |
| 71 | +2. **`createReservationSetter(config)`** - Secondary method: returns a function for manual per-file application |
| 72 | +3. **`getActionName(ctx)`** - Utility: extracts action names from Dataform contexts |
| 73 | + |
| 74 | +### Key Implementation Details |
| 75 | +- **Monkeypatching Strategy:** Intercepts the global methods immediately after the config is loaded. Name the file with a leading underscore (for example, `_reservations.js`) so that it runs first. |
| 76 | +- **Config Preprocessing:** Converts `actions` arrays to Sets for O(1) lookup performance |
| 77 | +- **Builder Modification:** Always modify `contextablePreOps`/`contextableQueries` on builders, not proto objects |
| 78 | +- **Assertions:** Explicitly skipped to avoid SQL syntax errors in BigQuery |
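The config preprocessing step can be sketched as follows. The config shape (`reservation` plus an `actions` array) is an assumption made for illustration, and this `findReservation` is a stand-in that mirrors the described behavior, not the package's actual implementation.

```javascript
// Assumed config shape: [{ reservation: string, actions: string[] }].
function preprocessConfig(config) {
  // Replace each `actions` array with a Set so membership checks are O(1).
  return config.map((entry) => ({
    ...entry,
    actions: new Set(entry.actions || []),
  }));
}

function findReservation(preprocessed, actionName) {
  // Linear scan over the entries; each check inside is a Set lookup.
  for (const entry of preprocessed) {
    if (entry.actions.has(actionName)) {
      return entry.reservation;
    }
  }
  return null; // No entry lists this action.
}

const config = preprocessConfig([
  {
    reservation: "projects/p/locations/US/reservations/r1",
    actions: ["p.dataset.table_a", "p.dataset.table_b"],
  },
]);
const match = findReservation(config, "p.dataset.table_b");
```

With this split, the cost per action grows with the number of config entries rather than with the number of action names listed inside them.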
| 79 | + |
| 80 | +## Hard-Learned Dataform Nuances |
| 81 | + |
| 82 | +### 1. Builder vs Proto |
| 83 | +Dataform distinguishes between **Action Builders** (the objects returned by `publish()`, `operate()`, etc.) and the final **Proto Objects** (the serialized state). |
| 84 | +- **Modification Point:** To ensure persistence, modifications should be made to `action.contextablePreOps` or `action.contextableQueries` on the **Builder**. If you only modify `proto.preOps`, Dataform's internal resolution logic might overwrite your changes during the final compilation phase. |
| 85 | + |
| 86 | +### 2. SQLX Pre-operations |
| 87 | +In `.sqlx` files, `pre_operations { ... }` blocks are internal to Dataform. When monkeypatching, we must ensure our reservation statement is **prepended** (using `.unshift()`) so it executes before any user-defined variables or temporary functions. |
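A minimal sketch of the prepend step, run against a stand-in object rather than a real Dataform builder. It assumes `contextablePreOps` is a plain array on the builder; `prependPreOp` and the reservation path are hypothetical.

```javascript
// Hypothetical helper; assumes `contextablePreOps` is an array on the builder.
function prependPreOp(builder, statement) {
  if (!Array.isArray(builder.contextablePreOps)) {
    builder.contextablePreOps = [];
  }
  // unshift() puts the statement first, ahead of user-defined pre-operations.
  builder.contextablePreOps.unshift(statement);
  return builder;
}

// Stand-in for a builder that already has a user-defined pre-operation.
const fakeBuilder = {
  contextablePreOps: ["CREATE TEMP FUNCTION one() AS (1);"],
};

prependPreOp(
  fakeBuilder,
  "SET @@reservation='projects/p/locations/US/reservations/r';"
);
```

Using `push()` instead would place the statement after the user's pre-operations, so anything they run would execute before the reservation is set.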
| 88 | + |
| 89 | +### 3. The `queries()` method |
| 90 | +For `operations`, the SQL is often set via `.queries(["SQL"])`. This method can be called multiple times or late in the script. We monkeypatch this method on the builder instance to wrap the user's input, ensuring the reservation is always at the top of the list, regardless of when `queries()` is called. |
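The wrapping idea can be sketched against a stand-in builder. This is an illustration under assumptions, not the package's code: `wrapQueries` is a hypothetical name, and the sketch assumes `queries()` accepts a string, an array of strings, or a function of the context that returns either.

```javascript
// Hypothetical helper that wraps `queries()` on a single builder instance.
function wrapQueries(builder, reservationStatement) {
  const originalQueries = builder.queries.bind(builder);

  builder.queries = (input) => {
    if (typeof input === "function") {
      // Defer resolution: prepend once the user's function has produced SQL.
      return originalQueries((ctx) => [reservationStatement].concat(input(ctx)));
    }
    // concat() accepts both a single string and an array of strings.
    return originalQueries([reservationStatement].concat(input));
  };

  return builder;
}

// Stand-in builder that records whatever `queries()` receives.
const fakeOperation = {
  stored: null,
  queries(value) {
    this.stored = value;
    return this;
  },
};

wrapQueries(fakeOperation, "SET @@reservation='r';");
fakeOperation.queries(["SELECT 1", "SELECT 2"]);
```

Because the wrapper replaces the method itself, the reservation ends up first no matter how late in the script `queries()` is called.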
| 91 | + |
| 92 | +### 4. Assertions |
| 93 | +Assertions in Dataform are strict: each one expects a single `SELECT` statement. Because Dataform often wraps assertions in subqueries or views, prepending a `SET` statement causes a syntax error in BigQuery. This package therefore skips assertions explicitly. |
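The skip rule can be sketched as a guard in front of the prepend step. The action type strings and the function names here are hypothetical and chosen only for illustration.

```javascript
// Hypothetical guard; the type strings are illustrative assumptions.
function shouldApplyReservation(actionType) {
  return actionType !== "assertion";
}

function withReservation(actionType, statements, reservationStatement) {
  if (!shouldApplyReservation(actionType)) {
    // Assertions keep their single SELECT untouched.
    return statements.slice();
  }
  return [reservationStatement, ...statements];
}

const operationSql = withReservation(
  "operation",
  ["DELETE FROM t WHERE TRUE"],
  "SET @@reservation='r';"
);
const assertionSql = withReservation(
  "assertion",
  ["SELECT * FROM t WHERE id IS NULL"],
  "SET @@reservation='r';"
);
```

Both branches return a new array, so the caller's original statement list is never mutated.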
| 94 | + |
| 95 | +## Release Process |
| 96 | + |
| 97 | +1. Update `CHANGELOG.md` with version and changes |
| 98 | +2. Bump version in `package.json` and `README.md` |
| 99 | +3. Run `npm test` to verify matrix tests pass |
| 100 | +4. Commit and push to branch |
| 101 | +5. Create PR, ensure CI passes |
| 102 | +6. Merge to main |
| 103 | +7. Tag release: `npm run release --version=x.y.z` |
| 104 | + |
| 105 | +## Known Limitations & Future Work |
| 106 | + |
| 107 | +**Performance:** `findReservation` uses a linear scan, which is acceptable for typical project sizes (<1000 actions). |