You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
<summary>TASK-184: Resurrection via sentinel (25/25 pass, was 11/12)</summary>
97
+
98
+
```text
99
+
Test 1: Resurrection of Live Row via Sentinel - 5/5 PASS
100
+
Test 2: Resurrection of Dead Row via Sentinel - 4/4 PASS (was failing)
101
+
Test 3: Update on Live Row via Column - 6/6 PASS (was failing)
102
+
...
103
+
Results: 25 passed, 0 failed, 0 skipped
104
+
All resurrection parity tests PASSED
105
+
```
106
+
107
+
**Root Cause**: The sentinel merge path in `changes_vtab.zig` handled resurrection (odd CL) by only zeroing column clocks, but didn't check if the base table row actually existed. For tombstoned rows, the base table row was deleted.
108
+
109
+
**Fix**: Added `insertRowForSentinelResurrection()` in `merge_insert.zig`. Modified sentinel handling in `changes_vtab.zig` to check `rowExistsInBaseTable()` before zeroing clocks, and if row doesn't exist, call `insertRowForSentinelResurrection()` first.
110
+
</details>
111
+
112
+
<details>
113
+
<summary>TASK-185: Spurious sentinel on merge (6/6 pass, was 5/6)</summary>
114
+
115
+
```text
116
+
PASS: No sentinels created on INSERT (both = 0)
117
+
PASS: Sentinels created on DELETE (both = 800)
118
+
PASS: No sentinels created on INSERT OR REPLACE (both = 0)
119
+
PASS: No sentinels created on merge (all sites = 0) <-- was failing
120
+
PASS: Sentinels propagated correctly (Site B matches Site A = 3)
121
+
PASS: Sentinel structure matches (sampled first 5 entries)
122
+
Sentinel Parity Summary: 6 passed, 0 failed
123
+
All sentinel parity tests PASSED!
124
+
```
125
+
126
+
**Root Cause**: In `changes_vtab.zig`, the `changesUpdate` function unconditionally created sentinel entries when inserting new rows during merge. Should only create sentinels when merging an actual sentinel (cid='-1').
127
+
128
+
**Fix**: Removed unconditional sentinel creation for new rows in `changes_vtab.zig`. Added sentinel CL update for existing row column merges when cl > local_cl.
129
+
</details>
130
+
131
+
<details>
132
+
<summary>TASK-187: Star topology convergence (6/6 pass, was 5/6)</summary>
133
+
134
+
```text
135
+
Test 1: Discord corrosion (3-node) - PASS
136
+
Test 2: Extended 4-node discord - PASS
137
+
Test 3: Star topology - PASS <-- was failing
138
+
139
+
Results: 6 passed, 0 failed, 0 skipped
140
+
All multi-node sync parity tests PASSED
141
+
```
142
+
143
+
**Root Cause**: Three bugs in `merge_insert.zig`:
144
+
1.`setWinnerClock` and `setWinnerClockCached` always stored `site_id = 0` instead of preserving remote site_id
145
+
2.`getLocalColVersion` filtered `WHERE site_id = 0` which ignored remote changes
146
+
3. For tables with `a PRIMARY KEY NOT NULL` (not `INTEGER PRIMARY KEY`), operations used `WHERE rowid = ?` but rowid ≠ PK value
147
+
148
+
**Fix**: Fixed all three issues in `merge_insert.zig` - preserve site_id, remove incorrect filter, use PK column instead of rowid.
Copy file name to clipboardExpand all lines: .tasks/done/TASK-184-fix-resurrection-via-sentinel.md
+23-2Lines changed: 23 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,7 @@
4
4
Fix Zig implementation to resurrect tombstoned rows when receiving a sentinel with higher CL.
5
5
6
6
## Status
7
-
- State: triage
7
+
- State: done
8
8
- Priority: HIGH (sync incompatibility)
9
9
- Discovered: Round 62 (TASK-161 test suite)
10
10
@@ -49,6 +49,27 @@ The sentinel merge path in `zig/src/merge_insert.zig` likely doesn't handle the
49
49
50
50
## Progress Log
51
51
- 2025-12-22: Created from Round 62 divergence discovery.
52
+
- 2025-12-22: Implemented fix - resurrection via sentinel now works.
52
53
53
54
## Completion Notes
54
-
(Empty until done.)
55
+
**Root Cause**: The sentinel merge path in `changes_vtab.zig` (lines 1901-1914) handled resurrection (odd CL) by only zeroing column clocks, but didn't check if the base table row actually existed. For tombstoned rows, the base table row was deleted, so `zeroClockOnResurrect` ran on a non-existent row.
56
+
57
+
**Fix Applied**:
58
+
1. Added `insertRowForSentinelResurrection()` function in `merge_insert.zig` - inserts a row with just the PK value (non-PK columns NULL) using the `__crsql_key` lookup.
59
+
2. Modified sentinel handling in `changes_vtab.zig` to check `rowExistsInBaseTable()` before zeroing clocks. If row doesn't exist, calls `insertRowForSentinelResurrection()` first.
60
+
61
+
**Files Modified**:
62
+
-`zig/src/merge_insert.zig`: Added `insertRowForSentinelResurrection()` function (lines 659-703)
63
+
-`zig/src/changes_vtab.zig`: Added row existence check + resurrection insert in sentinel handler (lines 1901-1920)
64
+
65
+
**Test Results**:
66
+
-`test-resurrection-parity.sh` Test 2 (dead via sentinel): **PASS** ✅
**Note**: Test 3c (`test-resurrection-parity.sh`) fails - this is a separate issue where column updates don't create/update the sentinel clock. This is out of scope for TASK-184.
Copy file name to clipboardExpand all lines: .tasks/done/TASK-185-fix-spurious-sentinel-on-merge.md
+33-2Lines changed: 33 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,7 @@
4
4
Fix Zig implementation to NOT create sentinel entries when syncing INSERT changes to a new site.
5
5
6
6
## Status
7
-
- State: triage
7
+
- State: done
8
8
- Priority: HIGH (wire format divergence)
9
9
- Discovered: Round 62 (TASK-166 test suite)
10
10
@@ -51,6 +51,37 @@ The `changesUpdate` path in `zig/src/changes_vtab.zig` creates sentinels when in
51
51
52
52
## Progress Log
53
53
- 2025-12-22: Created from Round 62 divergence discovery.
54
+
- 2025-12-22: Fixed. Root cause confirmed.
54
55
55
56
## Completion Notes
56
-
(Empty until done.)
57
+
**Date**: 2025-12-22
58
+
59
+
### Root Cause
60
+
In `zig/src/changes_vtab.zig`, the `changesUpdate` function unconditionally created sentinel entries (cid='-1') when inserting NEW rows during merge operations. This was incorrect - sentinels should only be created when:
61
+
1. A DELETE operation is being merged (incoming cid='-1'), OR
62
+
2. Passing through an existing sentinel from source
63
+
64
+
### Changes Made
65
+
**File**: `zig/src/changes_vtab.zig`
66
+
67
+
1.**Lines 1767-1778**: Removed unconditional sentinel creation when merging new rows. Added comment explaining the correct behavior:
68
+
- Sentinels should NOT be created for regular column changes
69
+
- This matches Rust/C behavior
70
+
71
+
2.**Lines 2131-2144** (after edit ~2147-2160): Added code to update sentinel CL when merging column updates on existing rows where incoming CL > local CL. This ensures that when Site B receives a column update from Site A (where the row has been through INSERT->DELETE->INSERT), the sentinel CL is correctly advanced.
72
+
73
+
3.**Lines 1983-1996** (after edit ~1997-2010): Added code to update sentinel CL during resurrection via column update. When a tombstoned row is resurrected by a column update, the sentinel CL must also be updated to track the incoming causal length.
# TASK-187 — Fix star topology sync convergence failure
2
+
3
+
## Goal
4
+
Fix Zig implementation to correctly converge in hub-and-spoke (star) topology sync patterns.
5
+
6
+
## Status
7
+
- State: COMPLETE
8
+
- Priority: HIGH (multi-node sync broken)
9
+
- Discovered: Round 62 (TASK-179 test suite)
10
+
11
+
## Problem
12
+
In a hub-and-spoke topology where all nodes sync only through a central hub, Zig nodes fail to converge to the same state. Rust/C converges correctly.
13
+
14
+
**Test failure from `test-multinode-sync.sh`:**
15
+
```
16
+
Test 3: Star topology - FAIL
17
+
18
+
Hub data: 1|updated_by_b|c2|c3|c4
19
+
3|d1|d2|d3|d4
20
+
B data: 1|updated_by_b|c2|c3|c4
21
+
3|d1|d2|d3|d4
22
+
C data: 1|c1|c2|c3|c4 <-- wrong! didn't get B's update
23
+
3|d1|d2|d3|d4
24
+
D data: 2|d1|d2|d3|d4 <-- completely wrong rows
25
+
3|updated_by_b|b2|b3|b4
26
+
27
+
FAIL: Hub and C diverged
28
+
FAIL: Hub and D diverged
29
+
```
30
+
31
+
## Scenario
32
+
1. Create 4 databases: Hub, B, C, D
33
+
2. B, C, D each create unique rows
34
+
3. All sync to Hub (Hub has all rows)
35
+
4. Hub syncs back to all spokes (all should have same data)
36
+
5. B updates row 1, C deletes row 2
37
+
6. Sync through Hub again
38
+
7.**Expected**: All 4 nodes converge to same state
39
+
8.**Actual (Zig)**: C didn't get B's update, D has completely wrong data
40
+
41
+
## Root Cause Analysis
42
+
43
+
**Three bugs found and fixed:**
44
+
45
+
### Bug 1: site_id not preserved when relaying changes through hub
46
+
- Location: `zig/src/merge_insert.zig` in `setWinnerClock` and `setWinnerClockCached`
47
+
- Problem: The code was always storing `site_id = 0` (local) instead of preserving the original remote site_id
48
+
- Fix: Added lookup via `site_identity.getOrCreateSiteOrdinal()` to convert the 16-byte site_id blob to an ordinal and store it correctly
49
+
50
+
### Bug 2: `getLocalColVersion` filtering incorrectly by site_id
51
+
- Location: `zig/src/merge_insert.zig` in `getLocalColVersion` and `getLocalColVersionCached`
52
+
- Problem: Was filtering with `WHERE site_id = 0` which ignored remote changes
53
+
- Fix: Removed the site_id filter since clock table has unique constraint on (key, col_name)
54
+
55
+
### Bug 3 (MAIN BUG): Using `rowid` instead of PK column for table operations
56
+
- Location: `zig/src/merge_insert.zig` in `updateBaseTableColumn`, `rowExistsInBaseTable`, `deleteFromBaseTable`, and cached variants
57
+
- Problem: For tables with `a PRIMARY KEY NOT NULL` (not `INTEGER PRIMARY KEY`), SQLite's rowid is NOT aliased to the PK column. The code was using `WHERE rowid = ?` but rowid ≠ PK value.
58
+
- Example: Row with a=2 inserted first gets rowid=1, row with a=1 inserted second gets rowid=2. Using `WHERE rowid=1` updates the wrong row!
59
+
- Fix: Changed all operations to use `WHERE "pk_column" = ?` instead of `WHERE rowid = ?`
60
+
61
+
## Files Modified
62
+
-`zig/src/merge_insert.zig` — Main changes for site_id handling and rowid→PK column fix
63
+
64
+
## Acceptance Criteria
65
+
1. ✅ `bash zig/harness/test-multinode-sync.sh` — Test 3 (star topology) passes
66
+
2. ✅ All other multinode tests still pass
67
+
3. ✅ No regressions in `test-sentinel-parity.sh`
68
+
4. ✅ No regressions in `test-resurrection-parity.sh`
69
+
5. ✅ No regressions in `test-fract-parity.sh`
70
+
71
+
## Parent Docs / Cross-links
72
+
- Test: `zig/harness/test-multinode-sync.sh` (Test 3: Star topology)
- 2025-12-22: Fixed Bug 3 - rowid vs PK column mismatch in updateBaseTableColumn, rowExistsInBaseTable, deleteFromBaseTable
81
+
- 2025-12-22: All tests pass
82
+
83
+
## Completion Notes
84
+
- Fixed: 2025-12-22
85
+
- Root cause was primarily Bug 3: SQLite's rowid is only aliased to the PK column for `INTEGER PRIMARY KEY`. For other PK types like `PRIMARY KEY NOT NULL`, rowid is independent. The fix queries using actual PK column names.
86
+
- All multinode sync tests now pass with parity to Rust/C oracle.
0 commit comments