🤖 Add integration tests for stream error recovery (no amnesia) (#333)
## Summary
Adds an integration test verifying that stream error recovery preserves
context (guards against the amnesia bug).
## Changes
- **Debug IPC for testing**: Added `DEBUG_TRIGGER_STREAM_ERROR` IPC
channel
- **StreamManager debug method**: `debugTriggerStreamError()` triggers
artificial stream errors that follow the same code path as real errors
- **Integration test**: Single error + resume scenario verifies context
preservation via **structured markers**
## Test Design
**Structured-marker approach** for precise validation:
**Test Flow:**
1. Generate unique nonce for test run (random 10-char identifier)
2. Model counts 1-100 using structured format: `${nonce}-<n>: <word>`
(e.g. `ai7qcnc20g-1: one`)
3. Collect stream deltas until ≥10 complete markers detected
4. Trigger artificial network error mid-stream
5. Resume stream and wait for completion
6. Verify final message has **both** properties:
- **(a) Prefix preservation**: Starts with exact pre-error streamed text
- **(b) Exact continuation**: Contains the next sequential marker
(`${nonce}-11` when exactly 10 markers were captured) shortly after the prefix
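Steps 1-3 could be sketched roughly as below. This is a minimal illustration, not the test code from this PR: `makeNonce` and `completeMarkers` are hypothetical helper names, and the sketch assumes each marker sits on its own newline-terminated line.

```typescript
// Hypothetical helpers; names are illustrative and not taken from the PR.

/** Generate a random lowercase alphanumeric nonce (10 chars by default). */
function makeNonce(length = 10): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return out;
}

/**
 * Return the numbers of all complete markers found in `text`.
 * A marker counts as complete only when its line ends with a newline,
 * so a partially streamed trailing line is ignored.
 * The nonce is alphanumeric, so it needs no regex escaping.
 */
function completeMarkers(text: string, nonce: string): number[] {
  const pattern = new RegExp(`^${nonce}-(\\d+): .+\\n`, "gm");
  const found: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push(Number(match[1]));
  }
  return found;
}
```

A test loop would append each stream delta to a buffer and trigger the error once `completeMarkers(buffer, nonce).length >= 10`.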
**Validation:**
- Pre-error content captured from stream-delta events (user-visible data
path)
- Stable prefix truncated to last complete marker line (no partial
markers)
- Assertions directly prove both amnesia-prevention properties
- No coupling to internal storage formats or metadata
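The two assertions might look something like the following sketch. `stablePrefix` and `checkRecovery` are hypothetical names, the 200-character `window` is an assumed bound for "shortly after", and the sketch assumes every streamed line is a marker line.

```typescript
// Hypothetical helpers; names and the default window size are assumptions.

/** Truncate streamed text to the end of its last newline-terminated line. */
function stablePrefix(streamed: string): string {
  const lastNewline = streamed.lastIndexOf("\n");
  return lastNewline === -1 ? "" : streamed.slice(0, lastNewline + 1);
}

/**
 * Check both amnesia-prevention properties on the final message text.
 * `lastMarker` is the number of the last complete marker in `prefix`.
 */
function checkRecovery(
  finalText: string,
  prefix: string,
  nonce: string,
  lastMarker: number,
  window = 200,
): { prefixPreserved: boolean; continued: boolean } {
  // (a) The final message must start with exactly what was streamed pre-error.
  const prefixPreserved = finalText.startsWith(prefix);
  // (b) The next sequential marker must appear shortly after the prefix.
  const tail = finalText.slice(prefix.length, prefix.length + window);
  const continued = tail.includes(`${nonce}-${lastMarker + 1}:`);
  return { prefixPreserved, continued };
}
```

A model that restarted from `${nonce}-1` after the error would fail property (b), because the tail would not contain the expected next marker.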
**Why this approach:**
- **Precise**: Detects exact continuation (not just "some work done")
- **Unambiguous**: Random nonce makes false positives virtually
impossible
- **Robust**: Structured format less likely to confuse model than
natural language
- **Fast**: Haiku 4.5 completes in ~18-21 seconds
## Bug Fix
Also fixed an event collection bug in `collectStreamUntil`: consumed
deltas are now tracked so the same event is not returned more than once.
The previous logic returned the first matching event on every poll,
causing duplicate processing.
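One way to avoid that class of bug is to keep a cursor over the collected events. The sketch below is illustrative only and does not reproduce the actual `collectStreamUntil` code: `EventCursor` and `StreamEvent` are hypothetical names, and as a simplification, non-matching events that get skipped are also marked consumed.

```typescript
// Hypothetical types; the real event shape in the codebase may differ.
interface StreamEvent {
  type: string;
  delta?: string;
}

/** Reads events from a shared array, returning each event at most once. */
class EventCursor {
  private consumed = 0;
  private readonly events: StreamEvent[];

  constructor(events: StreamEvent[]) {
    this.events = events;
  }

  /** Return the next unconsumed event matching `predicate`, or undefined. */
  next(predicate: (e: StreamEvent) => boolean): StreamEvent | undefined {
    while (this.consumed < this.events.length) {
      const event = this.events[this.consumed++];
      if (predicate(event)) return event;
    }
    return undefined;
  }
}
```

Because the cursor only moves forward, repeated polls return new matches rather than the first match again.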
## Related
Follow-up to #331 which fixed the amnesia bug by preserving accumulated
parts on error.
## Test Results
✅ Test passes reliably in ~18-21 seconds
✅ Validates **exact** prefix preservation and continuation
✅ No flaky failures from timing issues
✅ Integration tests pass: 1 passed, 1 total
_Generated with `cmux`_