|
| 1 | +# TASK-094: Oracle Parity — ALTER TABLE preserves clock history |
| 2 | + |
| 3 | +## Status |
| 4 | +- [ ] Planned |
| 5 | +- [ ] Assigned |
| 6 | +- [x] In Progress |
| 7 | +- [ ] Blocked (reason: ...) |
| 8 | +- [ ] Complete |
| 9 | + |
| 10 | +## Priority |
| 11 | +high |
| 12 | + |
| 13 | +## Assigned To |
| 14 | +(unassigned) |
| 15 | + |
| 16 | +## Parent Docs / Cross-links |
| 17 | +- Rust alter logic: `core/rs/core/src/alter.rs` |
| 18 | +- Zig alter logic: `zig/src/schema_alter.zig` |
| 19 | +- Existing alter tests: `zig/harness/test-alter.sh` |
| 20 | +- Gap backlog: `research/zig-cr/92-gap-backlog.md` |
| 21 | + |
| 22 | +## Description |
| 23 | +Verify that `crsql_begin_alter` / `crsql_commit_alter` preserves existing clock history and correctly backfills new columns. |
| 24 | + |
| 25 | +This is an **oracle test**: Schema evolution is critical for long-lived databases. If Zig loses clock history during ALTER or fails to backfill new columns, data will be lost or sync will break. |
| 26 | + |
| 27 | +## Files to Modify |
| 28 | +- `zig/harness/test-alter-parity.sh` (new or extend `test-alter.sh`) |
| 29 | +- `zig/harness/test-parity.sh` (wire into suite) |
| 30 | +- `research/zig-cr/92-gap-backlog.md` |
| 31 | + |
| 32 | +## Acceptance Criteria |
| 33 | +- [x] Test creates CRR table, inserts data, records clock state. |
| 34 | +- [x] Test performs ALTER operations via `crsql_begin_alter`/`crsql_commit_alter`: |
| 35 | + 1. ADD COLUMN (nullable) |
| 36 | + 2. ADD COLUMN with DEFAULT |
| 37 | + 3. DROP COLUMN |
| 38 | + 4. ADD INDEX |
| 39 | + 5. DROP INDEX |
| 40 | +- [x] After each ALTER: |
| 41 | + - Existing clock entries are preserved (same col_version, db_version for unchanged columns) |
| 42 | + - New columns have clock entries backfilled (col_version = 1, current db_version) |
| 43 | + - Dropped columns have clock entries removed |
| 44 | +- [ ] Clock state matches exactly between implementations. **DIVERGENCE FOUND - see notes** |
| 45 | +- [x] Test covers edge cases: |
| 46 | + - ALTER on empty table |
| 47 | + - ALTER on table with 1000+ rows (batching behavior) |
| 48 | + - Multiple ALTERs in sequence |
| 49 | + - ALTER that adds column then immediately updates it |
| 50 | +- [x] Test fails if clock history diverges. |
| 51 | + |
| 52 | +## Progress Log |
| 53 | +### 2025-12-18 |
| 54 | +- Task created from oracle-based parity test suite. |
| 55 | + |
| 56 | +### 2025-12-20 |
| 57 | +- Implemented comprehensive test suite in `zig/harness/test-alter-parity.sh` |
| 58 | +- Wired test into `zig/harness/test-parity.sh` |
| 59 | +- **Key Divergence Found**: Zig backfills clock entries for new columns, Rust does NOT |
| 60 | + |
| 61 | +## Completion Notes |
| 62 | + |
| 63 | +### Test Results: PASSED: 16, FAILED: 3 |
| 64 | + |
| 65 | +### Files Modified |
| 66 | +1. `zig/harness/test-alter-parity.sh` - Comprehensive ALTER TABLE parity test (rewritten) |
| 67 | +2. `zig/harness/test-parity.sh` - Wired in test-alter-parity.sh |
| 68 | + |
| 69 | +### Test Command |
| 70 | +```bash |
| 71 | +cd /Users/tom/Developer/effect-native/cr-sqlite/zig/harness |
| 72 | +bash test-alter-parity.sh |
| 73 | +``` |
| 74 | + |
| 75 | +### Key Findings |
| 76 | + |
| 77 | +#### Behavioral Divergences (Zig vs Rust/C Oracle) |
| 78 | + |
| 79 | +1. **ADD COLUMN Backfill Behavior** (DIVERGENCE): |
| 80 | + - **Rust**: Does NOT create clock entries for new columns after ADD COLUMN |
| 81 | + - **Zig**: DOES backfill clock entries (col_version=1) for all existing rows |
| 82 | + - This is a fundamental design difference that affects sync behavior |
| 83 | + |
| 84 | +2. **DROP COLUMN Clock Cleanup** (PASS): |
| 85 | + - Both implementations correctly remove clock entries for dropped columns |
| 86 | + |
| 87 | +3. **INDEX Operations** (PASS): |
| 88 | + - ADD INDEX and DROP INDEX preserve clock state in both implementations |
| 89 | + |
| 90 | +4. **Existing History Preservation** (PASS): |
| 91 | + - Both implementations preserve existing col_version/db_version during ALTER |
| 92 | + |
| 93 | +5. **UPDATE After ADD COLUMN** (DIVERGENCE): |
| 94 | + - Rust: Creates clock entry only when column is updated (lazy) |
| 95 | + - Zig: Already has backfilled entry, UPDATE increments col_version |
| 96 | + |
| 97 | +#### Clock Table Schema Difference |
| 98 | +- Rust uses `key` column for primary key |
| 99 | +- Zig uses `pk` column for primary key |
| 100 | +- Tests normalize this by aliasing `key AS pk` for comparison |
| 101 | + |
| 102 | +### Implications |
| 103 | + |
| 104 | +The backfill divergence means: |
| 105 | +- **Zig**: After ADD COLUMN, all rows appear "changed" in crsql_changes for the new column |
| 106 | +- **Rust**: After ADD COLUMN, new column only appears in crsql_changes when explicitly updated |
| 107 | + |
| 108 | +This affects sync scenarios where: |
| 109 | +- A peer adds a column and syncs to another peer |
| 110 | +- Zig will send backfill entries for all rows |
| 111 | +- Rust will only send entries for rows that were explicitly updated |
| 112 | + |
| 113 | +### Recommendation |
| 114 | +This divergence should be documented and a decision made on which behavior is correct: |
| 115 | +1. Zig's eager backfill ensures all peers have consistent NULL/DEFAULT values tracked |
| 116 | +2. Rust's lazy approach reduces sync traffic but may cause version inconsistencies |
| 117 | + |
| 118 | +### Note on Local Extension |
| 119 | +The `lib/crsqlite.dylib` in the repo appears to have a bug with `crsql_commit_alter` ("failed compacting tables post alteration"). Tests use `nix run github:subtleGradient/sqlite-cr` which has a working cr-sqlite version. |
0 commit comments