
fix(memory): configurable SA timeout, tier promotion retry, orphan cleanup #2519

Merged
bug-ops merged 1 commit into main from 2514-memory-sqlite-bugs on Mar 31, 2026

Owner

@bug-ops bug-ops commented Mar 31, 2026

Summary

Fixes three SQLite/memory bugs found during the CI-350 live testing session.

Changes

#2514 — configurable spreading activation recall timeout

  • Added recall_timeout_ms: u64 to SpreadingActivationConfig (default 1000, up from hardcoded 500)
  • Zero-value guard clamps to 100ms with a warn log (prevents silent recall breakage)
  • Updated fetch_graph_facts to use config value; log message now includes actual timeout
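
The zero-value guard described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the struct is reduced to a single field, the constants are hypothetical names, `eprintln!` stands in for whatever logging macro the crate actually uses, and the exact signature of `effective_recall_timeout_ms` is an assumption based on the name mentioned in the test plan.

```rust
/// Simplified stand-in for the real config struct (assumption: the real one
/// has more fields and is likely deserialized from a config file).
#[derive(Debug, Clone, PartialEq)]
pub struct SpreadingActivationConfig {
    /// Recall timeout in milliseconds.
    pub recall_timeout_ms: u64,
}

/// Hypothetical constant names; the values come from the PR description.
pub const DEFAULT_RECALL_TIMEOUT_MS: u64 = 1000;
pub const MIN_RECALL_TIMEOUT_MS: u64 = 100;

impl Default for SpreadingActivationConfig {
    fn default() -> Self {
        Self {
            recall_timeout_ms: DEFAULT_RECALL_TIMEOUT_MS,
        }
    }
}

impl SpreadingActivationConfig {
    /// Returns the timeout to use for recall. A configured value of zero
    /// would make every recall time out immediately, so it is replaced by
    /// the minimum and a warning is emitted.
    pub fn effective_recall_timeout_ms(&self) -> u64 {
        if self.recall_timeout_ms == 0 {
            // Stand-in for a warn-level log call.
            eprintln!(
                "recall_timeout_ms is 0; clamping to {}ms",
                MIN_RECALL_TIMEOUT_MS
            );
            MIN_RECALL_TIMEOUT_MS
        } else {
            self.recall_timeout_ms
        }
    }
}
```

In this sketch only an exact zero is clamped; small non-zero values pass through unchanged, which matches the "zero-value guard" wording but may differ from the merged code.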

#2511 — tier promotion SQLite lock contention

  • promote_to_semantic now uses begin_write (BEGIN IMMEDIATE) instead of begin (DEFERRED), eliminating the read→write lock-upgrade race
  • merge_cluster_and_promote retries the DB write up to 3 times with exponential backoff (50/100/200ms) on SQLITE_BUSY
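
A retry loop with the 50/100/200ms schedule might look like the sketch below. It is an assumption-laden illustration rather than the PR's code: `DbError`, `BACKOFF_MS`, and `retry_on_busy` are hypothetical names, the real code presumably inspects the database driver's error type for SQLITE_BUSY, and the sleep function is injected here only so the schedule can be observed without waiting. The sketch reads "retries up to 3 times" as one initial attempt plus up to three retries; the merged code may count attempts differently.

```rust
use std::time::Duration;

/// Hypothetical error type; `Busy` models a transient SQLITE_BUSY result.
#[derive(Debug, PartialEq)]
pub enum DbError {
    Busy,
    Other(String),
}

/// Backoff schedule from the PR description, in milliseconds.
pub const BACKOFF_MS: [u64; 3] = [50, 100, 200];

/// Runs `op`, retrying only when it reports `DbError::Busy`.
/// `sleep` is called with the delay before each retry.
pub fn retry_on_busy<T>(
    mut op: impl FnMut() -> Result<T, DbError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, DbError> {
    for delay_ms in BACKOFF_MS {
        match op() {
            // Transient lock contention: wait, then try again.
            Err(DbError::Busy) => sleep(Duration::from_millis(delay_ms)),
            // Success or a non-retryable error: return immediately.
            other => return other,
        }
    }
    // Last attempt after the final backoff; its result is returned as is.
    op()
}
```

In production code the second argument would typically be `std::thread::sleep` or an async timer. Retrying only on the busy error keeps genuine failures, such as constraint violations, from being masked by the loop.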

#2507 — orphaned tool-pair messages not deleted from DB

  • sanitize_tool_pairs and strip_mid_history_orphans now return (count, Vec<i64>) with db_ids of fully-removed messages
  • load_history soft-deletes those rows; a failed soft-delete is non-fatal, and startup warnings no longer repeat across sessions
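
The `(count, Vec<i64>)` return shape can be illustrated with a simplified orphan filter. Everything in this sketch is hypothetical: `Message`, `Part`, and `strip_orphans` are invented for illustration and are not the types or functions from the PR, and the pairing rule (a tool call and its result share a `call_id`) is an assumption about how tool pairs are matched.

```rust
use std::collections::HashSet;

/// Hypothetical message content, reduced to what the example needs.
#[derive(Debug, Clone, PartialEq)]
pub enum Part {
    Text(String),
    ToolUse { call_id: String },
    ToolResult { call_id: String },
}

/// Hypothetical history entry; `db_id` is `None` for rows not yet persisted.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub db_id: Option<i64>,
    pub part: Part,
}

/// Removes tool calls without a result and tool results without a call.
/// Returns the number of removed messages and the db ids of those that
/// exist in the database, so the caller can soft-delete the rows.
pub fn strip_orphans(history: &mut Vec<Message>) -> (usize, Vec<i64>) {
    let mut uses: HashSet<String> = HashSet::new();
    let mut results: HashSet<String> = HashSet::new();
    for m in history.iter() {
        match &m.part {
            Part::ToolUse { call_id } => {
                uses.insert(call_id.clone());
            }
            Part::ToolResult { call_id } => {
                results.insert(call_id.clone());
            }
            Part::Text(_) => {}
        }
    }

    let before = history.len();
    let mut removed_ids = Vec::new();
    history.retain(|m| {
        let orphaned = match &m.part {
            Part::ToolUse { call_id } => !results.contains(call_id),
            Part::ToolResult { call_id } => !uses.contains(call_id),
            Part::Text(_) => false,
        };
        if orphaned {
            // Only persisted messages have an id to clean up.
            if let Some(id) = m.db_id {
                removed_ids.push(id);
            }
        }
        !orphaned
    });
    (before - history.len(), removed_ids)
}
```

Returning the ids alongside the count lets the loader clean up the database once, so the same orphans are not rediscovered and warned about on every startup.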

Test plan

  • 6976/6976 existing tests pass (cargo nextest run --workspace --exclude zeph-candle --lib --bins)
  • 7 new tests added: 3 for SpreadingActivationConfig defaults/round-trip, 4 for effective_recall_timeout_ms (including zero-clamp)
  • cargo clippy --workspace --features full -- -D warnings clean
  • cargo +nightly fmt --check clean

@github-actions github-actions bot added labels on Mar 31, 2026:
  • documentation: Improvements or additions to documentation
  • memory: zeph-memory crate (SQLite)
  • rust: Rust code changes
  • core: zeph-core crate
  • config: Configuration file changes
  • bug: Something isn't working
  • size/L: Large PR (201-500 lines)
…eanup

- #2514: add recall_timeout_ms to SpreadingActivationConfig (default 1000ms,
  up from the hardcoded 500ms); zero-value guard clamps to 100ms with warn
- #2511: switch promote_to_semantic to BEGIN IMMEDIATE (begin_write) to
  eliminate DEFERRED lock-upgrade race; add 3-attempt exponential backoff
  (50/100/200ms) for transient SQLITE_BUSY in merge_cluster_and_promote
- #2507: sanitize_tool_pairs and strip_mid_history_orphans now return
  (count, Vec<i64>) of removed message db_ids; load_history soft-deletes
  orphaned rows so startup warnings do not repeat across sessions
@bug-ops bug-ops force-pushed the 2514-memory-sqlite-bugs branch from 112314c to 8d1e687 March 31, 2026 11:45
@bug-ops bug-ops enabled auto-merge (squash) March 31, 2026 11:45
@bug-ops bug-ops merged commit 71f2eb8 into main Mar 31, 2026
27 checks passed
@bug-ops bug-ops deleted the 2514-memory-sqlite-bugs branch March 31, 2026 11:51
