triage: add 8 hypothesis invalidation tasks

subtleGradient · subtleGradient · commit 14bf47a9f6ce · 2025-12-23T00:52:50.000-05:00
Ways to experimentally invalidate 'Zig parity is complete':

- TASK-190: Extended fuzz testing with stress patterns
- TASK-191: Port Python hypothesis property tests
- TASK-192: Test against prior Rust/C database files
- TASK-193: Audit Rust integration tests for gaps
- TASK-194: Real-world app simulation (todo, chat, inventory)
- TASK-195: Adversarial/malformed input fuzzing
- TASK-196: Deep clock table internals inspection
- TASK-197: Performance regression analysis

Each task targets a different attack surface for finding divergences.
diff --git a/.tasks/triage/TASK-190-fuzz-invalidation-round2.md b/.tasks/triage/TASK-190-fuzz-invalidation-round2.md
@@ -0,0 +1,51 @@
+# TASK-190 — Fuzz Invalidation Round 2: Stress the sync protocol
+
+## Goal
+Invalidate the "Zig parity is complete" hypothesis via extended fuzzing focused on sync edge cases.
+
+## Status
+- State: triage
+- Priority: HIGH (hypothesis validation)
+- Discovered: 2025-12-23 (Round 69 follow-up)
+
+## Hypothesis to Invalidate
+"Zig CR-SQLite is functionally identical to Rust/C CR-SQLite for all sync scenarios."
+
+## Test Approach
+Extend `test-fuzz-parity.sh` with:
+
+1. **Higher iteration count** (1000+ instead of 100)
+2. **More aggressive schema generation**:
+   - Tables with 10+ columns
+   - Deep compound PKs (3-4 columns)
+   - Mixed type PKs (int + text + blob)
+3. **Chaotic operation sequences**:
+   - Rapid insert/delete/resurrect cycles
+   - Concurrent column updates on same row
+   - Interleaved multi-table operations
+4. **Sync stress patterns**:
+   - 3+ node sync topologies
+   - Out-of-order change application
+   - Partial sync followed by full sync
+5. **Value edge cases**:
+   - Very long strings (>64KB)
+   - Binary data with all byte values
+   - Unicode normalization forms
+
+## Files to Modify
+- `zig/harness/test-fuzz-parity.sh` (extend)
+- Or create new `zig/harness/test-fuzz-stress.sh`
+
+## Acceptance Criteria
+1. Either find a divergence (invalidate hypothesis) OR
+2. Complete 10,000 operations without divergence (increase confidence)
+
+## Parent Docs / Cross-links
+- Prior fuzz work: `.tasks/done/TASK-127-experimental-parity-invalidation.md`
+- Gap backlog: `research/zig-cr/92-gap-backlog.md`
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
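
A minimal sketch of the "chaotic operation sequences" and "value edge cases" items in TASK-190, written in Python rather than the bash of `test-fuzz-parity.sh`. The operation tuple layout, the table names, and the function names are hypothetical. A seeded generator is used so that any divergence found can be replayed against both builds.

```python
import random

def edge_values():
    """Value edge cases named in the task."""
    return [
        "x" * (64 * 1024 + 1),  # string longer than 64KB
        bytes(range(256)),      # blob containing every byte value
        "\u00e9",               # precomposed e-acute (NFC form)
        "e\u0301",              # decomposed e-acute (NFD form)
    ]

def generate_ops(seed, count, tables=("t1", "t2"), max_pk=5):
    """Return a reproducible list of (kind, table, pk, value) operations.

    A small pk range forces rapid insert/delete/resurrect cycles on the
    same rows, and choosing the table at random interleaves the tables.
    """
    rng = random.Random(seed)
    values = edge_values()
    live, deleted = set(), set()
    ops = []
    for _ in range(count):
        key = (rng.choice(tables), rng.randint(1, max_pk))
        if key in live:
            kind = rng.choice(("update", "delete"))
        elif key in deleted:
            kind = "resurrect"  # re-insert of a previously deleted row
        else:
            kind = "insert"
        if kind == "delete":
            live.discard(key)
            deleted.add(key)
            value = None
        else:
            live.add(key)
            deleted.discard(key)
            value = rng.choice(values)
        ops.append((kind, key[0], key[1], value))
    return ops
```

Translating each tuple into SQL and running it against the Zig and Rust/C builds is left to the harness. Logging the seed alongside any failure would let the same sequence be regenerated later.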
diff --git a/.tasks/triage/TASK-191-python-hypothesis-suite.md b/.tasks/triage/TASK-191-python-hypothesis-suite.md
@@ -0,0 +1,45 @@
+# TASK-191 — Port Python Hypothesis Tests to Zig Parity Suite
+
+## Goal
+Port the Python property-based tests (`py/correctness/`) to the bash parity harness to invalidate "Zig parity is complete".
+
+## Status
+- State: triage
+- Priority: HIGH (these tests were designed to find edge cases)
+- Discovered: 2025-12-23 (hypothesis invalidation request)
+
+## Hypothesis to Invalidate
+The Python tests use the `hypothesis` library for property-based testing. They may cover scenarios our bash tests miss.
+
+## Existing Python Tests
+Located in `py/correctness/tests/`:
+- `test_cl_merging.py` — Causal length merge logic (~1000 lines)
+- `test_sentinel_omission.py` — Sentinel emission rules
+- `test_sync.py` — Sync protocol edge cases
+
+## Test Approach
+1. **Analyze Python tests** for scenarios not covered by bash harness
+2. **Identify key properties** being tested:
+   - CL merge resolution rules
+   - Sentinel creation/omission conditions
+   - Multi-peer sync convergence
+3. **Translate to bash tests** that compare Zig vs Rust/C oracle
+
+## Files to Create/Modify
+- `zig/harness/test-cl-merge-properties.sh` (new)
+- `zig/harness/test-sentinel-properties.sh` (new)
+
+## Acceptance Criteria
+1. Port at least 3 key property tests from each Python file
+2. Run against both Zig and Rust/C oracle
+3. Either find divergence OR increase confidence
+
+## Parent Docs / Cross-links
+- Python tests: `py/correctness/tests/`
+- Gap backlog: `research/zig-cr/92-gap-backlog.md`
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
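
One way to picture the "multi-peer sync convergence" property from TASK-191 is as an order-independence check: applying the same set of changes in any order should produce the same final state. The sketch below uses only the standard library instead of `hypothesis`, and its `merge` function is a toy rule chosen for illustration. It is not a description of how CR-SQLite resolves conflicts, and all names are hypothetical.

```python
import itertools
import random

def merge(state, change):
    """Toy merge: keep the change with the greatest
    (col_version, value, site_id) tuple for each (pk, cid) cell."""
    key = (change["pk"], change["cid"])
    rank = (change["col_version"], change["value"], change["site_id"])
    if key not in state or rank > state[key]:
        state[key] = rank

def arrival_order_merge(state, change):
    """Deliberately broken merge: the last change to arrive always wins."""
    key = (change["pk"], change["cid"])
    state[key] = (change["col_version"], change["value"], change["site_id"])

def converges(changes, merge_fn=merge):
    """True if every application order yields the same final state."""
    finals = []
    for order in itertools.permutations(changes):
        state = {}
        for change in order:
            merge_fn(state, change)
        finals.append(state)
    return all(final == finals[0] for final in finals)

def random_changes(rng, count):
    """Generate a small change set that collides on a few cells."""
    return [
        {
            "pk": rng.randint(1, 2),
            "cid": rng.choice(("a", "b")),
            "col_version": rng.randint(1, 3),
            "value": rng.randint(0, 9),
            "site_id": rng.randint(1, 3),
        }
        for _ in range(count)
    ]

def check_property(trials=50, seed=0):
    """Run the convergence property over many random change sets."""
    rng = random.Random(seed)
    return all(converges(random_changes(rng, 5)) for _ in range(trials))
```

In the ported bash tests, the role of `merge` would be played by applying the changes to the Zig build and to the Rust/C oracle, and the property would be checked on both.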
diff --git a/.tasks/triage/TASK-192-prior-db-oracle-parity.md b/.tasks/triage/TASK-192-prior-db-oracle-parity.md
@@ -0,0 +1,48 @@
+# TASK-192 — Test Against Prior Database Files (Golden Snapshots)
+
+## Goal
+Test the Zig extension against real database files created by prior Rust/C versions to invalidate "Zig parity is complete".
+
+## Status
+- State: triage
+- Priority: HIGH (tests real-world compatibility)
+- Discovered: 2025-12-23 (hypothesis invalidation request)
+
+## Hypothesis to Invalidate
+"Zig can correctly read/write databases created by Rust/C CR-SQLite."
+
+The prior DB files exist at `py/correctness/prior-dbs/`.
+
+## Test Approach
+1. **Load each prior DB** with the Zig extension
+2. **Verify reads**:
+   - `crsql_db_version()` returns expected value
+   - `crsql_site_id()` returns stored ID
+   - `SELECT * FROM crsql_changes` returns expected rows
+   - Clock tables have expected structure
+3. **Verify writes**:
+   - INSERT new row → clock entries created
+   - Sync changes to another DB → converges correctly
+4. **Compare against Rust/C** performing the same operations
+
+## Prior DB Files
+- `py/correctness/prior-dbs/` — examine for available versions
+
+## Files to Create
+- `zig/harness/test-prior-db-compat.sh` (new)
+
+## Acceptance Criteria
+1. Load all prior DB files without error
+2. Read operations produce identical results to Rust/C
+3. Write operations produce compatible changes
+4. Either find divergence OR confirm backward compat
+
+## Parent Docs / Cross-links
+- Prior DBs: `py/correctness/prior-dbs/`
+- Gap backlog: `research/zig-cr/92-gap-backlog.md`
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
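
The read-side comparison in TASK-192 amounts to running the same queries through both implementations and reporting the first difference. A minimal sketch using Python's standard `sqlite3` module follows. Loading the Zig or Rust/C extension is omitted, so the two connections here are plain SQLite databases standing in for the two builds, and `first_divergence` and `make_db` are hypothetical names.

```python
import sqlite3

def first_divergence(conn_a, conn_b, queries):
    """Run each query on both connections; return the first mismatch.

    Returns None when every query gives the same rows (or the same class
    of error) on both sides, else (sql, result_a, result_b).
    """
    for sql in queries:
        results = []
        for conn in (conn_a, conn_b):
            try:
                results.append(("ok", conn.execute(sql).fetchall()))
            except sqlite3.Error as exc:
                # Compare error classes only; message text may differ.
                results.append(("error", type(exc).__name__))
        if results[0] != results[1]:
            return sql, results[0], results[1]
    return None

def make_db(rows):
    """Build an in-memory stand-in database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE todo (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO todo VALUES (?, ?)", rows)
    return conn
```

With the extensions loaded, the query list would hold the checks named in the test approach, such as `SELECT crsql_db_version()` and `SELECT * FROM crsql_changes`.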
diff --git a/.tasks/triage/TASK-193-rust-integration-check-port.md b/.tasks/triage/TASK-193-rust-integration-check-port.md
@@ -0,0 +1,49 @@
+# TASK-193 — Port Rust Integration Check Tests
+
+## Goal
+Port tests from `core/rs/integration_check/` to the bash parity harness to invalidate "Zig parity is complete".
+
+## Status
+- State: triage
+- Priority: MEDIUM (many already covered, but check for gaps)
+- Discovered: 2025-12-23 (hypothesis invalidation request)
+
+## Hypothesis to Invalidate
+"All Rust integration tests have equivalent coverage in the Zig harness."
+
+## Rust Test Files
+Located in `core/rs/integration_check/src/t/`:
+- `automigrate.rs` — Covered by `test-automigrate.sh`
+- `backfill.rs` — Covered by `test-backfill.sh`
+- `fract.rs` — Covered by `test-fract*.sh`
+- `pack_columns.rs` — Covered by `test-unpack-columns-vtab.sh`
+- `pk_only_tables.rs` — Partially covered
+- `pk_update.rs` — Covered by `test-pk-update.sh`
+- `sync_bit_honored.rs` — Covered by `test-sync-bit-isolation.sh`
+- `tableinfo.rs` — Covered by `test-extdata.sh`
+- `teardown.rs` — Covered by `test-is-crr.sh`
+- `test_cl_set_vtab.rs` — Covered by `test-clset-vtab.sh`
+- `test_db_version.rs` — Covered by `test-db-version-parity.sh`
+
+## Test Approach
+1. **Audit each Rust test file** for specific assertions
+2. **Compare against bash test** to identify gaps
+3. **Port missing scenarios** to bash harness
+
+## Files to Create/Modify
+- Compare `core/rs/integration_check/src/t/*.rs` vs `zig/harness/test-*.sh`
+
+## Acceptance Criteria
+1. Document which Rust tests have bash equivalents
+2. Port any missing scenarios
+3. Either find divergence OR confirm coverage
+
+## Parent Docs / Cross-links
+- Rust tests: `core/rs/integration_check/src/t/`
+- Coverage map: `research/zig-cr/92-gap-backlog.md` (Coverage Map Summary section)
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
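
The coverage list in TASK-193 can be kept as data so that the audit is repeatable. The sketch below copies the mapping from the task and reports files with no named script, plus files whose named script is absent from a directory listing. The function names are hypothetical, and `None` marks the one entry the task lists only as "Partially covered".

```python
import fnmatch

# Rust test file -> bash harness script (glob patterns allowed).
COVERAGE = {
    "automigrate.rs": "test-automigrate.sh",
    "backfill.rs": "test-backfill.sh",
    "fract.rs": "test-fract*.sh",
    "pack_columns.rs": "test-unpack-columns-vtab.sh",
    "pk_only_tables.rs": None,  # "Partially covered", no script named
    "pk_update.rs": "test-pk-update.sh",
    "sync_bit_honored.rs": "test-sync-bit-isolation.sh",
    "tableinfo.rs": "test-extdata.sh",
    "teardown.rs": "test-is-crr.sh",
    "test_cl_set_vtab.rs": "test-clset-vtab.sh",
    "test_db_version.rs": "test-db-version-parity.sh",
}

def unmapped(coverage):
    """Rust test files that have no bash script recorded."""
    return sorted(name for name, script in coverage.items() if script is None)

def missing_scripts(coverage, available):
    """Rust test files whose recorded script matches nothing in `available`."""
    return sorted(
        name
        for name, pattern in coverage.items()
        if pattern is not None and not fnmatch.filter(available, pattern)
    )
```

Passing the output of `os.listdir("zig/harness")` as `available` would flag mapping entries that have gone stale. Whether a mapped script actually asserts everything its Rust counterpart does still needs the manual audit described in the test approach.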
diff --git a/.tasks/triage/TASK-194-real-world-app-simulation.md b/.tasks/triage/TASK-194-real-world-app-simulation.md
@@ -0,0 +1,67 @@
+# TASK-194 — Real-World Application Simulation Tests
+
+## Goal
+Simulate realistic application patterns to invalidate "Zig parity is complete".
+
+## Status
+- State: triage
+- Priority: HIGH (tests real usage, not contrived scenarios)
+- Discovered: 2025-12-23 (hypothesis invalidation request)
+
+## Hypothesis to Invalidate
+"Zig behaves correctly under realistic application workloads."
+
+## Test Scenarios
+
+### 1. Todo App Sync
+- Create tasks with nested subtasks
+- Mark complete/incomplete in a different order on two devices
+- Sync and verify convergence
+
+### 2. Chat/Notes App
+- Long-running conversation with edits
+- Offline edits on multiple devices
+- Reconnect and merge
+
+### 3. Shopping Cart
+- Add/remove items rapidly
+- Update quantities concurrently
+- Apply discount codes (triggers)
+
+### 4. Collaborative Document
+- Multiple users editing same "document" (row with text blob)
+- Concurrent field updates
+- History/versioning queries
+
+### 5. Inventory Management
+- Stock count adjustments
+- Transfer between locations
+- Audit trail preservation
+
+## Test Approach
+1. **Script realistic operation sequences**
+2. **Simulate multiple devices with separate DBs**
+3. **Sync via crsql_changes protocol**
+4. **Verify final state matches on all "devices"**
+5. **Compare Zig vs Rust/C behavior**
+
+## Files to Create
+- `zig/harness/test-app-todo.sh` (new)
+- `zig/harness/test-app-chat.sh` (new)
+- `zig/harness/test-app-inventory.sh` (new)
+
+## Acceptance Criteria
+1. Each app simulation runs without error
+2. All "devices" converge to same state
+3. Zig and Rust/C produce identical final state
+4. Either find divergence OR confirm real-world readiness
+
+## Parent Docs / Cross-links
+- Existing realistic tests: `test-realistic-sync.sh`, `test-realistic-offline.sh`, `test-realistic-collab.sh`
+- Gap backlog: `research/zig-cr/92-gap-backlog.md`
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
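
Step 4 of the TASK-194 test approach, verifying that the final state matches on all "devices", can be reduced to comparing one fingerprint per database. The sketch below hashes every table's rows in a row-order-independent way using the standard `sqlite3` and `hashlib` modules. The name `fingerprint` is hypothetical, and including every non-internal table is an assumption: a real harness might need to exclude tables that legitimately differ per device.

```python
import hashlib
import sqlite3

def fingerprint(conn):
    """Return a hex digest of all table contents, ignoring row order."""
    digest = hashlib.sha256()
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not row[0].startswith("sqlite_")  # skip SQLite's own tables
    ]
    for name in names:
        digest.update(name.encode("utf-8"))
        rows = conn.execute(f'SELECT * FROM "{name}"').fetchall()
        # Sort by repr so rows with mixed value types still order stably.
        for row in sorted(rows, key=repr):
            digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()

def all_converged(connections):
    """True if every device database has the same fingerprint."""
    return len({fingerprint(conn) for conn in connections}) == 1
```

Running `all_converged` once over the Zig devices and once over the Rust/C devices, then comparing a fingerprint across the two groups, would cover acceptance criteria 2 and 3.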
diff --git a/.tasks/triage/TASK-195-adversarial-input-fuzzing.md b/.tasks/triage/TASK-195-adversarial-input-fuzzing.md
@@ -0,0 +1,71 @@
+# TASK-195 — Adversarial Input Fuzzing (Malformed crsql_changes)
+
+## Goal
+Feed malformed/adversarial inputs to crsql_changes to find divergent error handling.
+
+## Status
+- State: triage
+- Priority: HIGH (security + robustness)
+- Discovered: 2025-12-23 (hypothesis invalidation request)
+
+## Hypothesis to Invalidate
+"Zig and Rust/C handle all malformed inputs identically."
+
+## Test Approach
+
+### Malformed Inputs to Generate
+1. **Invalid pk blobs**:
+   - Truncated encoding
+   - Wrong column count prefix
+   - Invalid type tags
+   - Zero-length
+   - Extremely long
+
+2. **Invalid column values**:
+   - Wrong type for column
+   - Oversized blobs
+   - Invalid UTF-8 in text
+   - NaN/Inf floats
+
+3. **Invalid metadata**:
+   - Negative col_version
+   - Negative db_version
+   - Negative cl (causal length)
+   - Invalid site_id (wrong length)
+   - site_id = all zeros
+   - site_id = all 0xFF
+
+4. **Invalid cid (column identifier)**:
+   - Non-existent column name
+   - Empty string
+   - Very long column name
+   - Column name with special chars
+
+5. **Invalid table names**:
+   - Non-existent table
+   - System table name
+   - SQL injection attempts
+
+6. **Sequence attacks**:
+   - Same pk, different site_id, same col_version
+   - Duplicate inserts
+   - Out-of-sequence db_version
+
+## Files to Create
+- `zig/harness/test-adversarial-input.sh` (new)
+
+## Acceptance Criteria
+1. Both implementations handle malformed input gracefully (error, not crash)
+2. Error messages/codes match OR divergence is documented
+3. No data corruption from malformed input
+4. Either find handling divergence OR confirm robustness parity
+
+## Parent Docs / Cross-links
+- Existing error handling: `test-error-handling.sh`
+- Gap backlog: `research/zig-cr/92-gap-backlog.md`
+
+## Progress Log
+- 2025-12-23: Created from hypothesis invalidation request.
+
+## Completion Notes
+(Empty until done.)
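
The "invalid metadata" and "invalid column values" lists in TASK-195 lend themselves to a table of labelled cases, each overriding one field of an otherwise valid change row. The sketch below is illustrative only: the field names, the baseline row, and the 16-byte site ID length are assumptions to be checked against the real `crsql_changes` schema, and pk blob encodings are left out because their format is not described here.

```python
import math

SITE_ID_LEN = 16  # assumption: site IDs are 16-byte blobs

# Hypothetical baseline row; every case below mutates exactly one field.
BASELINE = {
    "table": "todo",
    "cid": "title",
    "val": "x",
    "col_version": 1,
    "db_version": 1,
    "site_id": b"\x01" * SITE_ID_LEN,
    "cl": 1,
}

def malformed_cases():
    """Return (label, field, bad_value) tuples drawn from the task's lists."""
    return [
        ("negative col_version", "col_version", -1),
        ("negative db_version", "db_version", -1),
        ("negative cl", "cl", -1),
        ("site_id too short", "site_id", b"\x01" * (SITE_ID_LEN - 1)),
        ("site_id too long", "site_id", b"\x01" * (SITE_ID_LEN + 1)),
        ("site_id all zeros", "site_id", bytes(SITE_ID_LEN)),
        ("site_id all 0xFF", "site_id", b"\xff" * SITE_ID_LEN),
        ("NaN value", "val", math.nan),
        ("Inf value", "val", math.inf),
        # Bytes that are not valid UTF-8; binding them as TEXT is up to the harness.
        ("invalid UTF-8", "val", b"\xff\xfe"),
        ("empty cid", "cid", ""),
        ("non-existent table", "table", "no_such_table"),
    ]

def apply_case(case, baseline=BASELINE):
    """Return a copy of the baseline row with the case's field replaced."""
    _label, field, bad_value = case
    row = dict(baseline)
    row[field] = bad_value
    return row
```

Each resulting row would be inserted into `crsql_changes` on both builds, and the outcome (error code, or rows affected) recorded for comparison.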
diff --git a/.tasks/triage/TASK-196-clock-table-direct-inspection.md b/.tasks/triage/TASK-196-clock-table-direct-inspection.md
diff --git a/.tasks/triage/TASK-197-performance-regression-parity.md b/.tasks/triage/TASK-197-performance-regression-parity.md