Ingestion: memory_persistence_complete false negatives cause duplicate memories on retry #538

@zachg-devops

Description

Problem

The ingestion pipeline (src/agent/ingestion.rs) has three compounding issues that cause duplicate memories when using non-Anthropic models:

Issue 1: memory_persistence_complete check is too strict

When the model calls memory_persistence_complete successfully but then generates text afterward (common with non-Claude models), the chunk is marked as failed even though:

  • memory_save was called and data was committed to the DB
  • memory_persistence_complete was called and returned success
  • The model simply added a text summary after the tool call

The prompt_once() approach in ingestion (line ~529) lacks the retry mechanism that branch.rs has (MAX_MEMORY_CONTRACT_RETRIES = 2). The branch code injects a retry prompt "You must finish this memory-persistence run by calling memory_persistence_complete..." — ingestion has no equivalent.
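The control flow that ingestion is missing can be sketched as below. This is a simulation, not the real pipeline: `run_chunk`, the closure standing in for `prompt_once()`, and the retry prompt text are illustrative; only `MAX_MEMORY_CONTRACT_RETRIES = 2` and the prompt wording come from branch.rs as described above.

```rust
// Sketch of the retry loop ingestion lacks, mirroring branch.rs.
// The closure stands in for prompt_once() + has_terminal_outcome();
// only the control flow is the point here.

const MAX_MEMORY_CONTRACT_RETRIES: usize = 2; // matches branch.rs

const RETRY_PROMPT: &str = "You must finish this memory-persistence run \
by calling memory_persistence_complete...";

/// Returns true once a terminal outcome is reached; the chunk is only
/// marked failed after the initial attempt plus all retries miss.
fn run_chunk(mut prompt_once: impl FnMut(Option<&str>) -> bool) -> bool {
    // First attempt with the normal ingestion prompt.
    if prompt_once(None) {
        return true;
    }
    // Re-prompt with the retry message, as the branch code path does.
    for _ in 0..MAX_MEMORY_CONTRACT_RETRIES {
        if prompt_once(Some(RETRY_PROMPT)) {
            return true;
        }
    }
    false // only now mark the chunk as failed
}

fn main() {
    // Model that only complies on the second (retry) turn.
    let mut turns = 0;
    let ok = run_chunk(|_retry| {
        turns += 1;
        turns >= 2
    });
    println!("completed={ok} turns={turns}"); // completed=true turns=2
}
```

With this shape, the Step Flash false negative above would succeed on the first retry instead of failing the chunk outright.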

Issue 2: Retries create duplicate memories with no deduplication

When a chunk "fails" (even falsely):

  1. Memories already committed by memory_save remain in the DB (no rollback)
  2. The chunk is NOT recorded as completed in ingestion_progress
  3. On retry, a fresh agent processes the same chunk with zero context about previously saved memories
  4. memory_save does a plain INSERT with Uuid::new_v4() — no content-hash dedup
  5. Result: duplicate memories from the same chunk content
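Option (a) from the fixes below, content-hash dedup, can be sketched like this. The `content_key` and `memory_save` names and the in-memory `HashSet` "DB" are hypothetical; a real fix would use a proper content hash (e.g. SHA-256) plus a UNIQUE constraint in the database rather than `DefaultHasher`, which is not stable across processes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Derive a stable dedup key from the memory content itself instead of
/// Uuid::new_v4(), so re-saving the same text from the same chunk collides.
/// DefaultHasher is illustrative only; use a real content hash in practice.
fn content_key(chunk_id: &str, memory_text: &str) -> u64 {
    let mut h = DefaultHasher::new();
    chunk_id.hash(&mut h);
    memory_text.hash(&mut h);
    h.finish()
}

/// Simulated memory_save: insert only if the key is new.
/// Returns false when the memory is a duplicate and was skipped.
fn memory_save(db: &mut HashSet<u64>, chunk_id: &str, text: &str) -> bool {
    db.insert(content_key(chunk_id, text))
}

fn main() {
    let mut db = HashSet::new();
    assert!(memory_save(&mut db, "chunk-1", "user prefers dark mode"));
    // Retry of the same chunk re-saves the same memory: deduped.
    assert!(!memory_save(&mut db, "chunk-1", "user prefers dark mode"));
    println!("memories stored: {}", db.len()); // 1
}
```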

Issue 3: ToolUseEnforcement::Auto doesn't cover non-GPT/non-Codex models

The default Auto mode only injects the enforcement prompt for models with "gpt" or "codex" in the name. Models like MiniMax, Step Flash, Gemma, DeepSeek, etc. get no enforcement prompt, making them more likely to generate text instead of calling required tools.

Combined with the ingestion prompt (prompts/en/ingestion.md.j2) telling the model to "Return a brief summary of what you extracted," the model is incentivized to generate text rather than call memory_persistence_complete.
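One way to widen the heuristic is to invert it: instead of listing model names that need enforcement, list the families known to comply natively and enforce for everything else. The function name and allow-list below are assumptions for illustration, not the actual `ToolUseEnforcement::Auto` implementation.

```rust
/// Hypothetical widened Auto heuristic: enforce unless the model family
/// is known to reliably call required tools. This covers MiniMax, Step
/// Flash, Gemma, DeepSeek, and future families by default, while keeping
/// the existing behavior for GPT/Codex (they still get the prompt).
fn needs_enforcement_prompt(model: &str) -> bool {
    let m = model.to_ascii_lowercase();
    // Allow-list of families with reliable native tool use (assumed).
    let compliant = ["claude"];
    !compliant.iter().any(|&name| m.contains(name))
}

fn main() {
    assert!(needs_enforcement_prompt("openrouter/stepfun/step-3.5-flash"));
    assert!(needs_enforcement_prompt("openrouter/minimax/minimax-m2.7"));
    assert!(!needs_enforcement_prompt("anthropic/claude-sonnet-4"));
    println!("ok");
}
```

An allow-list degrades more safely than the current deny-list: an unknown new model gets the enforcement prompt, which is harmless, rather than silently missing it.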

Observed Behavior

Using openrouter/stepfun/step-3.5-flash for branch routing:

  • model calls memory_save ✅ (memories saved)
  • model calls memory_persistence_complete ✅ (success returned)
  • model generates post-tool summary text
  • chunk marked as FAILED ❌ (false negative)
  • file kept for retry → creates duplicates

With openrouter/minimax/minimax-m2.7:

  • Similar pattern but lower tool compliance (~0% success vs ~44% with Step Flash)

Suggested Fixes

  1. Add a retry loop to ingestion (parity with branch.rs): when has_terminal_outcome() is false after prompt_once(), inject the retry prompt and allow up to two more attempts, matching MAX_MEMORY_CONTRACT_RETRIES.

  2. Check if memory_persistence_complete was called at any point, not just as the terminal action. If the tool was called and returned success, the chunk should be marked complete regardless of post-tool text.

  3. Deduplication on retry: Either (a) add content-hash dedup to memory_save, (b) record saved memory IDs per chunk in ingestion_progress and delete them before retry, or (c) use a transaction that rolls back memories if the chunk fails to complete.

  4. Widen ToolUseEnforcement::Auto to cover more model families, or document that non-Claude/non-GPT models should set tool_use_enforcement = "always" in config.
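Fix 2 amounts to scanning the whole turn's tool calls rather than only the final event. A minimal sketch, with an illustrative `ToolCall` type standing in for whatever transcript type the agent crate actually uses:

```rust
/// Minimal stand-in for a transcript event; real types live in the
/// agent crate, so these names are illustrative.
struct ToolCall {
    name: &'static str,
    succeeded: bool,
}

/// Treat the chunk as complete if memory_persistence_complete succeeded
/// anywhere in the turn, even when the model appended summary text after
/// the tool call.
fn chunk_completed(calls: &[ToolCall], trailing_text: bool) -> bool {
    let _ = trailing_text; // post-tool text no longer affects the outcome
    calls
        .iter()
        .any(|c| c.name == "memory_persistence_complete" && c.succeeded)
}

fn main() {
    // The Step Flash transcript from above: both tools succeeded,
    // then the model emitted a text summary.
    let calls = [
        ToolCall { name: "memory_save", succeeded: true },
        ToolCall { name: "memory_persistence_complete", succeeded: true },
    ];
    assert!(chunk_completed(&calls, true));
    println!("ok");
}
```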

Environment

  • Spacebot deployed via Coolify (Docker)
  • Models via OpenRouter: MiniMax M2.7, Step 3.5 Flash
  • tool_use_enforcement = "always" set as workaround
  • Config: [agents.convey.routing] branch = "openrouter/stepfun/step-3.5-flash"

Workaround

Set tool_use_enforcement = "always" in config. This improves tool compliance but doesn't fully resolve the issue, since the ingestion code path still lacks the retry mechanism.
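For reference, the workaround config looks roughly like this. The table layout is inferred from the routing key quoted in the Environment section; the exact placement of tool_use_enforcement in the config tree is an assumption.

```toml
# Workaround: force the enforcement prompt for every model (placement assumed).
tool_use_enforcement = "always"

[agents.convey.routing]
branch = "openrouter/stepfun/step-3.5-flash"
```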
