Skip to content

Commit 432a72a

Browse files
triage: add 7 follow-up tasks for Rust/C pks schema migration
Created comprehensive task cards for remaining work after TASK-147 partial completion (compound PK bug fix + findPkFromBlob refactor). NEW TASKS: - TASK-149: Refactor insertIntoPksTableAndGetPk for new schema Priority: HIGH - blocks sync INSERT operations - TASK-150: Eliminate base_rowid dependency from base table operations Priority: HIGH - blocks sync UPDATE/DELETE operations Functions: updateBaseTableColumn, deleteFromBaseTable, rowExistsInBaseTable - TASK-151: Update TableMergeStmts cached statement variants Priority: MEDIUM - performance optimization after main refactoring - TASK-152: Update tombstone handling to use clock table sentinels Priority: MEDIUM - correctness for DELETE sync Remove base_rowid=NULL logic, use col_name='-1' sentinel markers - TASK-153: Sweep codebase for remaining old schema references Priority: MEDIUM - cleanup task, grep for base_rowid/pks blob - TASK-154: Fix sync parity test failures Priority: HIGH - validation of schema migration work Current: 9 rows_impacted tests failing, sync operations broken - TASK-155: Review insertIntoBaseTable for new schema compatibility Priority: HIGH - part of sync INSERT path, may cascade from other tasks All tasks link back to TASK-147 parent and include: - Files to Modify (tight scope) - Acceptance Criteria (testable) - Cross-links to related work - Clear dependencies (TASK-149→150→152→154) NEXT STEP: Execute TASK-149 (insertIntoPksTableAndGetPk) to unblock sync. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
1 parent 3b9a984 commit 432a72a

7 files changed

+564
-0
lines changed
Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
# TASK-149 — Refactor insertIntoPksTableAndGetPk for new pks schema
2+
3+
## Goal
4+
Adapt `insertIntoPksTableAndGetPk` and related pks table insert functions to work with the new Rust/C-compatible schema where PK column values are stored directly instead of as a packed blob.
5+
6+
## Status
7+
- State: triage
8+
- Priority: high (blocks sync operations)
9+
10+
## Context
11+
The sync INSERT path currently fails because `insertIntoPksTableAndGetPk()` tries to insert into the old schema:
12+
13+
OLD SCHEMA:
14+
```sql
15+
INSERT INTO table__crsql_pks (base_rowid, pks) VALUES (?, ?)
16+
```
17+
18+
NEW SCHEMA (no base_rowid, no pks blob):
19+
```sql
20+
INSERT INTO table__crsql_pks (__crsql_key, pk_col1, pk_col2, ...) VALUES (NULL, ?, ?)
21+
-- __crsql_key is auto-increment, pk_col values come from unpacked blob
22+
```
23+
24+
Triggering issue:
25+
- `findPkFromBlob` was refactored (commit 3b9a984d) and now correctly queries new schema
26+
- When it returns `NoRows` for non-existent entry, changes_vtab calls INSERT path
27+
- INSERT path still uses old functions that reference `base_rowid` and `pks` columns
28+
29+
## Files to Modify
30+
- `zig/src/merge_insert.zig`:
31+
- `insertIntoPksTable()` (line ~658)
32+
- `insertIntoPksTableAndGetPk()` (line ~671)
33+
- `TableMergeStmts.sql_insert_pks` buffer and statement (line ~135)
34+
35+
## Acceptance Criteria
36+
1. `insertIntoPksTableAndGetPk()` must:
37+
- Unpack pk_blob into individual PK column values
38+
- Build dynamic INSERT with column names: `INSERT INTO pks (__crsql_key, "col1", "col2") VALUES (NULL, ?, ?)`
39+
- Bind unpacked values in PK order
40+
- Return the auto-generated `__crsql_key` via `last_insert_rowid()`
41+
- NOT reference `base_rowid` or `pks` columns
42+
43+
2. Remove `base_rowid` parameter from function signatures:
44+
- OLD: `insertIntoPksTableAndGetPk(db, table, base_rowid, pks_blob, len)`
45+
- NEW: `insertIntoPksTableAndGetPk(db, table, pks_blob, len)`
46+
47+
3. Update `TableMergeStmts.sql_insert_pks` to be dynamic per-table (or mark deprecated)
48+
49+
4. Test case passes:
50+
```sql
51+
CREATE TABLE foo (a INT NOT NULL PRIMARY KEY, b TEXT);
52+
SELECT crsql_as_crr('foo');
53+
INSERT INTO crsql_changes VALUES ('foo', X'0901', 'b', 'hello', 1, 1, X'01...10', 1, 0);
54+
SELECT * FROM foo; -- row with a=1, b='hello'
55+
SELECT * FROM foo__crsql_pks; -- __crsql_key=1, a=1
56+
```
57+
58+
## Parent Docs / Cross-links
59+
- Parent: TASK-147 (Rust/C pks schema migration - done)
60+
- Related: commit 3b9a984d (findPkFromBlob refactor)
61+
- Upstream: `research/zig-cr/92-gap-backlog.md` (schema compatibility)
62+
- Blocks: TASK-154 (sync parity test fixes)
63+
64+
## Progress Log
65+
- 2025-12-21: Created from TASK-147 refactoring work. Compound PK bug fixed, findPkFromBlob refactored, this is next critical blocker.
66+
67+
## Completion Notes
68+
(Empty until done.)
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
# TASK-150 — Eliminate base_rowid dependency from base table operations
2+
3+
## Goal
4+
Refactor all merge_insert.zig functions that operate on the base table to use PK column values from the pks table instead of relying on a stored `base_rowid` column (which no longer exists in the new schema).
5+
6+
## Status
7+
- State: triage
8+
- Priority: high (blocks sync UPDATE/DELETE operations)
9+
10+
## Context
11+
Multiple functions in merge_insert.zig follow this pattern:
12+
1. Given `pk` (__crsql_key from pks table)
13+
2. Look up `base_rowid` from pks table: `SELECT base_rowid FROM pks WHERE pk = ?`
14+
3. Operate on base table using rowid: `UPDATE table SET col = ? WHERE rowid = base_rowid`
15+
16+
This breaks with new schema because:
17+
- The pks table has no `base_rowid` column
18+
- Instead, it has PK column values directly: `(__crsql_key, pk_col1, pk_col2, ...)`
19+
- Base table operations must use PK columns: `UPDATE table SET col = ? WHERE pk_col1 = ? AND pk_col2 = ?`
20+
21+
Affected functions discovered in Round 49 (TASK-119) and Round 58:
22+
- `getBaseRowidFromPk()` - entire function obsolete
23+
- `updateBaseTableColumn()` - uses getBaseRowidFromPk
24+
- `deleteFromBaseTable()` - uses getBaseRowidFromPk
25+
- `rowExistsInBaseTable()` - uses getBaseRowidFromPk
26+
- Cached variants: `*Cached()` versions of above
27+
28+
## Files to Modify
29+
- `zig/src/merge_insert.zig`:
30+
- `getBaseRowidFromPk()` (line ~385) - DELETE or convert to `getPkValuesFromKey()`
31+
- `updateBaseTableColumn()` (line ~250)
32+
- `deleteFromBaseTable()` (line ~415)
33+
- `rowExistsInBaseTable()` (line ~457)
34+
- `updateBaseTableColumnCached()` (line ~TBD)
35+
- `deleteFromBaseTableCached()` (line ~985)
36+
- `rowExistsInBaseTableCached()` (line ~968)
37+
- `TableMergeStmts` cached statement buffers (line ~74-78)
38+
39+
## Acceptance Criteria
40+
1. NEW helper function `getPkValuesFromKey()`:
41+
```zig
42+
fn getPkValuesFromKey(db, table_name, __crsql_key, allocator) ![]codec.Value
43+
// SELECT pk_col1, pk_col2, ... FROM table__crsql_pks WHERE __crsql_key = ?
44+
// Returns unpacked PK column values in order
45+
```
46+
47+
2. `updateBaseTableColumn()` refactored:
48+
- Call `getPkValuesFromKey(__crsql_key)` to get PK values
49+
- Build SQL: `UPDATE table SET col = ? WHERE "pk1" = ? AND "pk2" = ?`
50+
- Bind column value + all PK values
51+
- Remove all references to `base_rowid`
52+
53+
3. `deleteFromBaseTable()` refactored:
54+
- Get PK values from pks table
55+
- Build SQL: `DELETE FROM table WHERE "pk1" = ? AND "pk2" = ?`
56+
- Remove pks tombstoning logic (handled by clock table in new schema)
57+
58+
4. `rowExistsInBaseTable()` refactored:
59+
- Get PK values from pks table
60+
- Build SQL: `SELECT 1 FROM table WHERE "pk1" = ? AND "pk2" = ? LIMIT 1`
61+
62+
5. Test compound PK update:
63+
```sql
64+
CREATE TABLE foo(a INT NOT NULL, b INT NOT NULL, c TEXT, PRIMARY KEY(a,b));
65+
SELECT crsql_as_crr('foo');
66+
-- Sync insert row
67+
INSERT INTO crsql_changes VALUES ('foo', X'090109 02', 'c', 'hello', 1, 1, X'01...', 1, 0);
68+
-- Sync update column
69+
INSERT INTO crsql_changes VALUES ('foo', X'0901
70+
0902', 'c', 'world', 2, 2, X'01...', 2, 0);
71+
SELECT c FROM foo WHERE a=1 AND b=2; -- 'world'
72+
```
73+
74+
## Parent Docs / Cross-links
75+
- Parent: TASK-147 (Rust/C pks schema migration)
76+
- Related: TASK-119 (Round 49 - similar bug with pk vs base_rowid confusion)
77+
- Depends on: TASK-149 (insertIntoPksTableAndGetPk must work first for INSERT path)
78+
- Upstream: `research/zig-cr/92-gap-backlog.md`
79+
- Blocks: TASK-154 (sync parity tests)
80+
81+
## Progress Log
82+
- 2025-12-21: Created from TASK-147 work. Root cause identified: base_rowid column no longer exists, need to query PK values from pks and use in WHERE clauses.
83+
84+
## Completion Notes
85+
(Empty until done.)
Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
# TASK-151 — Update TableMergeStmts cached statement variants for new schema
2+
3+
## Goal
4+
Refactor the `TableMergeStmts` statement cache and all `*Cached()` function variants to work with the new pks schema (no base_rowid, no pks blob, PK columns stored directly).
5+
6+
## Status
7+
- State: triage
8+
- Priority: medium (performance optimization; uncached variants work first)
9+
10+
## Context
11+
`TableMergeStmts` provides per-table statement caching to avoid thousands of prepare/finalize cycles during sync. For a 1000-change sync, this reduces ~4000+ prepares to ~4 per table (significant perf win).
12+
13+
Current cached statements affected by schema change:
14+
```zig
15+
struct TableMergeStmts {
16+
find_pk_stmt: ?*sqlite3_stmt = null, // SELECT pk FROM pks WHERE pks = ?
17+
row_exists_base_stmt: ?*sqlite3_stmt = null, // SELECT 1 FROM table WHERE rowid = ?
18+
delete_base_stmt: ?*sqlite3_stmt = null, // DELETE FROM table WHERE rowid = ?
19+
// ... others
20+
sql_find_pk: [512]u8, // SQL buffer for find_pk
21+
sql_insert_pks: [1024]u8, // SQL buffer for insert_pks
22+
}
23+
```
24+
25+
Problem: These SQL buffers and statements are built once per table and expect old schema (pks blob, base_rowid). They need dynamic construction per PK column count.
26+
27+
## Files to Modify
28+
- `zig/src/merge_insert.zig`:
29+
- `TableMergeStmts` struct definition (line ~51)
30+
- `TableMergeStmts.init()` (line ~100)
31+
- `findPkFromBlobCached()` (line ~947)
32+
- `rowExistsInBaseTableCached()` (line ~968)
33+
- `deleteFromBaseTableCached()` (line ~985)
34+
- `updateBaseTableColumnCached()` (if exists)
35+
36+
## Acceptance Criteria
37+
1. Option A (simple): Mark cached variants as TODO and use uncached:
38+
- Document that caching is temporarily disabled pending schema stabilization
39+
- All `*Cached()` functions call uncached variants
40+
- Remove stale SQL buffers from `TableMergeStmts`
41+
42+
2. Option B (optimal): Implement full caching for new schema:
43+
- `TableMergeStmts.init()` takes `TableInfo` parameter
44+
- Dynamically build SQL with PK column count:
45+
- `find_pk`: `SELECT __crsql_key FROM pks WHERE col1=? AND col2=?`
46+
- `row_exists`: `SELECT 1 FROM table WHERE col1=? AND col2=?`
47+
- `delete`: `DELETE FROM table WHERE col1=? AND col2=?`
48+
- Store prepared statements (still cached, but schema-aware)
49+
50+
3. Performance regression test:
51+
- Sync 1000 changes to same table
52+
- Verify statement reuse (check prepare count in profiling)
53+
- Compare to baseline (should be ~same if caching works)
54+
55+
4. Choose Option A for MVP (unblock sync), Option B for optimization pass
56+
57+
## Parent Docs / Cross-links
58+
- Parent: TASK-147 (Rust/C pks schema migration)
59+
- Depends on: TASK-149 (insertIntoPksTableAndGetPk)
60+
- Depends on: TASK-150 (base table ops refactor)
61+
- Related: `zig/src/merge_insert.zig:7-17` (statement caching design doc)
62+
- Upstream: `research/zig-cr/92-gap-backlog.md`
63+
64+
## Progress Log
65+
- 2025-12-21: Created from TASK-147 work. Statement caching is perf-critical but can be temporarily disabled to unblock schema migration.
66+
67+
## Completion Notes
68+
(Empty until done.)
Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,91 @@
1+
# TASK-152 — Update tombstone handling to use clock table sentinels (no base_rowid = NULL)
2+
3+
## Goal
4+
Remove `base_rowid = NULL` tombstone logic from merge_insert.zig and update to use clock table sentinel markers for deletion tracking, matching Rust/C implementation.
5+
6+
## Status
7+
- State: triage
8+
- Priority: medium (correctness for DELETE sync)
9+
10+
## Context
11+
OLD TOMBSTONE MECHANISM (Zig with base_rowid):
12+
```sql
13+
-- Mark row as deleted (tombstone)
14+
UPDATE table__crsql_pks SET base_rowid = NULL WHERE pk = ?;
15+
16+
-- Check if tombstoned
17+
SELECT base_rowid FROM table__crsql_pks WHERE pk = ?;
18+
-- base_rowid IS NULL → tombstoned
19+
```
20+
21+
NEW TOMBSTONE MECHANISM (Rust/C via clock sentinels):
22+
- Pks table row is NEVER removed or marked NULL
23+
- Instead, clock table sentinel with `col_name = '-1'` tracks deletion:
24+
```sql
25+
-- Sentinel with even col_version = tombstone (deleted)
26+
INSERT INTO clock (key, col_name, col_version, ...) VALUES (pk, '-1', 2, ...);
27+
28+
-- Sentinel with odd col_version = creation marker (exists)
29+
INSERT INTO clock (key, col_name, col_version, ...) VALUES (pk, '-1', 1, ...);
30+
```
31+
32+
Verified in testing:
33+
```
34+
Before DELETE: clock has (key=1, col_name='c', col_version=1)
35+
After DELETE: clock has (key=1, col_name='-1', col_version=2)
36+
PKS table still has (key=1, a=1, b=2) unchanged
37+
```
38+
39+
Current code locations with old tombstone logic:
40+
- `deleteFromBaseTable()` - sets `base_rowid = NULL` after delete
41+
- `getBaseRowidFromPk()` - checks `base_rowid IS NULL`
42+
- `rowExistsInBaseTable()` - relies on getBaseRowidFromPk tombstone check
43+
44+
## Files to Modify
45+
- `zig/src/merge_insert.zig`:
46+
- `deleteFromBaseTable()` (line ~415) - remove pks UPDATE
47+
- `deleteFromBaseTableCached()` (line ~985) - same
48+
- `getBaseRowidFromPk()` (line ~385) - remove NULL check (or delete function entirely)
49+
- `rowExistsInBaseTable()` (line ~457) - check clock sentinel instead
50+
51+
## Acceptance Criteria
52+
1. `deleteFromBaseTable()` refactored:
53+
- Delete from base table using PK WHERE (from TASK-150)
54+
- Insert clock sentinel: `INSERT INTO clock (key, col_name, col_version, ...) VALUES (?, '-1', ?, ...)`
55+
- Do NOT update pks table (it remains unchanged)
56+
57+
2. `rowExistsInBaseTable()` checks two sources:
58+
- Base table query: `SELECT 1 FROM table WHERE pk1=? AND pk2=?`
59+
- If no row, check clock sentinel:
60+
- `SELECT col_version FROM clock WHERE key=? AND col_name='-1'`
61+
- Even col_version → tombstoned (return false)
62+
- Odd col_version or no sentinel → exists or never created
63+
64+
3. Remove all `base_rowid = NULL` and `base_rowid IS NULL` references
65+
66+
4. Test tombstone sync:
67+
```sql
68+
CREATE TABLE foo(a INT NOT NULL PRIMARY KEY, b TEXT);
69+
SELECT crsql_as_crr('foo');
70+
-- Insert via sync
71+
INSERT INTO crsql_changes VALUES ('foo', X'0901', 'b', 'data', 1, 1, X'01...', 1, 0);
72+
SELECT * FROM foo; -- row exists
73+
-- Delete via sync (negative cl)
74+
INSERT INTO crsql_changes VALUES ('foo', X'0901', '-1', NULL, 2, 2, X'01...', -2, 0);
75+
SELECT * FROM foo; -- empty
76+
SELECT col_version FROM foo__crsql_clock WHERE col_name='-1'; -- 2 (even = tombstone)
77+
SELECT * FROM foo__crsql_pks; -- still has __crsql_key=1, a=1
78+
```
79+
80+
## Parent Docs / Cross-links
81+
- Parent: TASK-147 (Rust/C pks schema migration)
82+
- Related: Explore agent findings on tombstone handling (session above)
83+
- Related: `zig/src/as_crr.zig:472-483` (tombstone design comments)
84+
- Depends on: TASK-150 (base table ops must work first)
85+
- Upstream: `research/zig-cr/92-gap-backlog.md`
86+
87+
## Progress Log
88+
- 2025-12-21: Created from TASK-147 work. Rust/C testing revealed clock sentinel mechanism (col_name='-1', even/odd col_version).
89+
90+
## Completion Notes
91+
(Empty until done.)
Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# TASK-153 — Sweep codebase for remaining base_rowid / pks blob references
2+
3+
## Goal
4+
Systematically search and eliminate all remaining references to the old pks schema (`base_rowid` column, `pks BLOB` column) throughout the Zig codebase.
5+
6+
## Status
7+
- State: triage
8+
- Priority: medium (cleanup after main refactoring)
9+
10+
## Context
11+
After completing TASK-149, TASK-150, and TASK-152, there may be residual references to the old schema scattered across:
12+
- Comments and documentation
13+
- Error messages and debug logging
14+
- Dead code paths
15+
- Test fixtures and helper functions
16+
- SQL string literals in other modules
17+
18+
Known locations already refactored:
19+
- `zig/src/merge_insert.zig` - refactored in TASK-149, 150, 152
20+
- `zig/src/local_writes/after_write.zig` - uses new schema (getOrCreatePkKey)
21+
- `zig/src/as_crr.zig` - creates new schema tables
22+
23+
Potential places with residual references:
24+
- `zig/src/changes_vtab.zig` - may have comments or old code
25+
- `zig/test/` - test helpers might reference old schema
26+
- `zig/harness/` - test scripts might have stale SQL
27+
- Error messages in merge_insert.zig mentioning "base_rowid" or "pks"
28+
29+
## Files to Modify
30+
(To be determined by grep sweep; keep tight scope)
31+
- Candidates: Any `.zig`, `.sh`, `.md` files with schema references
32+
- Likely: `zig/src/changes_vtab.zig`, test files, harness scripts
33+
34+
## Acceptance Criteria
35+
1. Code sweep (mandatory):
36+
```bash
37+
cd zig
38+
grep -r "base_rowid" src/ test/ --include="*.zig" | grep -v ".swp"
39+
grep -r "pks BLOB" src/ test/ --include="*.zig" | grep -v ".swp"
40+
grep -r 'pks = ?' src/ test/ --include="*.zig" | grep -v ".swp"
41+
# All hits must be:
42+
# - False positives (e.g., "purpose_rowid" not "base_rowid")
43+
# - Historical comments explicitly marked OLD SCHEMA
44+
# - Or removed/refactored
45+
```
46+
47+
2. Comment audit:
48+
- Any comment referencing old schema must be labeled `OLD SCHEMA (pre-TASK-147):`
49+
- Or replaced with accurate new schema description
50+
51+
3. Test fixture update:
52+
- If any test helper creates manual pks entries, update to new schema
53+
- Test data generators use new column structure
54+
55+
4. Error message clarity:
56+
- Any error mentioning "pks" should reference "__crsql_pks table" or "PK columns"
57+
- No references to "base_rowid lookup" in production error paths
58+
59+
5. Documentation sweep:
60+
- Check `zig/src/merge_insert.zig` top-level comments
61+
- Check `research/zig-cr/*.md` files for stale schema diagrams
62+
- Update or mark obsolete
63+
64+
## Parent Docs / Cross-links
65+
- Parent: TASK-147 (Rust/C pks schema migration)
66+
- Depends on: TASK-149, TASK-150, TASK-152 (main refactoring complete)
67+
- Related: `.tasks/done/TASK-119` (had similar base_rowid confusion)
68+
- Upstream: `research/zig-cr/04-schema-and-metadata.md` (may need updates)
69+
70+
## Progress Log
71+
- 2025-12-21: Created from TASK-147 work. Cleanup task to ensure no stale references remain after major refactoring.
72+
73+
## Completion Notes
74+
(Empty until done.)

0 commit comments

Comments
 (0)