
fix: use cached raw user message as auto-recall query #579

Merged
rwmjhb merged 1 commit into CortexReach:master from raw34:fix/auto-recall-query-noise
Apr 11, 2026

Conversation

@raw34
Contributor

@raw34 raw34 commented Apr 11, 2026

Summary

  • Auto-recall's recallQuery uses event.prompt directly, which on Slack includes Conversation info JSON metadata (message_id, sender_id, conversation_label, etc.) prepended by the platform adapter
  • This metadata pollutes the embedding vector, causing irrelevant memories to score higher than actually relevant ones
  • The fix uses lastRawUserMessage.get(cacheKey) || event.prompt — the same pattern already applied to gatingText (line 2245) — so the retrieval query uses clean user text instead of the noisy assembled prompt
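
A minimal sketch of the selection logic described above. The `PromptBuildEvent` shape, the `selectRecallQuery` helper, the cache keys, and the metadata values are hypothetical and exist only for illustration; only the `lastRawUserMessage.get(cacheKey) || event.prompt` expression comes from the PR.

```typescript
// Hypothetical event shape for illustration; the real plugin types may differ.
interface PromptBuildEvent {
  prompt: string; // assembled prompt, which may carry adapter metadata
}

// Cache of clean user text, keyed by a channel-derived cache key.
const lastRawUserMessage = new Map<string, string>();

// Prefer the cached raw user message and fall back to the assembled prompt.
// Because `||` is used rather than `??`, an empty cached string also falls back.
function selectRecallQuery(event: PromptBuildEvent, cacheKey: string): string {
  return lastRawUserMessage.get(cacheKey) || event.prompt;
}

// Slack-like case: the prompt carries a metadata prefix, the cache holds clean text.
const slackEvent: PromptBuildEvent = {
  prompt:
    'Conversation info: {"message_id":"m1","sender_id":"u1"}\n我们团队谁负责项目管理',
};
lastRawUserMessage.set("slack:C123", "我们团队谁负责项目管理");
const slackQuery = selectRecallQuery(slackEvent, "slack:C123");

// Non-channel case: nothing is cached, so the prompt is used unchanged.
const apiQuery = selectRecallQuery(
  { prompt: "who owns project management?" },
  "api:none",
);

console.log(slackQuery);
console.log(apiQuery);
```

Using `||` instead of `??` is a deliberate choice in this sketch: a cached empty string would be a useless retrieval query, so it is treated the same as a cache miss.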

Context

The gatingText variable (used for short-message skip and greeting detection) was already correctly patched to prefer lastRawUserMessage. However, the actual recallQuery passed to retrieveWithRetry() was missed, so retrieval itself was still affected.

The lastRawUserMessage Map is populated in the message_received hook (lines 2211–2219), which strips bot mentions and caches clean user text keyed by channelId. The before_prompt_build hook derives the same cacheKey, so the cached value is available whenever message_received has already run for that channel; otherwise the query falls back to event.prompt.
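
The caching side can be sketched roughly as follows. This is not the code at the referenced lines: the `MessageReceivedEvent` fields, the `deriveCacheKey` helper, and the Slack-style `<@U…>` mention pattern are all assumptions made for the example.

```typescript
// Hypothetical event shape; field names are assumptions, not the plugin's API.
interface MessageReceivedEvent {
  channelId: string;
  text: string;
}

const lastRawUserMessage = new Map<string, string>();

// Assumed Slack-style mention syntax such as <@U0BOT>; other adapters differ.
const BOT_MENTION = /<@[A-Z0-9]+>/g;

// Both hooks must derive the key the same way, otherwise lookups will miss.
function deriveCacheKey(channelId: string): string {
  return channelId;
}

// Strip mentions, normalize whitespace, and cache only non-empty text.
function onMessageReceived(event: MessageReceivedEvent): void {
  const clean = event.text
    .replace(BOT_MENTION, "")
    .replace(/\s+/g, " ")
    .trim();
  if (clean.length > 0) {
    lastRawUserMessage.set(deriveCacheKey(event.channelId), clean);
  }
}

onMessageReceived({ channelId: "C123", text: "<@U0BOT> 我们团队谁负责项目管理" });
onMessageReceived({ channelId: "C999", text: "<@U0BOT>" }); // mention only, not cached

const cached = lastRawUserMessage.get(deriveCacheKey("C123"));
console.log(cached);
```

Keeping the key derivation in one shared helper is what makes the lookup in before_prompt_build reliable; if the two hooks computed keys differently, the fallback to event.prompt would silently take over.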

Reproduction

  1. Deploy with Slack channel adapter
  2. Send a short CJK message (e.g. "我们团队谁负责项目管理")
  3. Observe that event.prompt in before_prompt_build contains a Conversation info: {"message_id":"...","sender_id":"...","conversation_label":"..."} prefix
  4. Auto-recall returns irrelevant memories because the metadata dominates the embedding
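
To get a feel for why the metadata dominates, the sketch below measures how much of the assembled prompt is not user text. The prefix layout follows the example in step 3, but the metadata values are made up, and character share is only a crude proxy for influence on an embedding.

```typescript
// Made-up metadata values for illustration; real adapter output may differ.
const userText = "我们团队谁负责项目管理";
const metadata = JSON.stringify({
  message_id: "1712800000.000100",
  sender_id: "U0123456789",
  conversation_label: "#team-general",
});

// Assumed layout: metadata prefix, newline, then the user's message.
const assembledPrompt = `Conversation info: ${metadata}\n${userText}`;

// Fraction of the embedding input (by UTF-16 length) that is not user text.
function metadataShare(prompt: string, text: string): number {
  return 1 - text.length / prompt.length;
}

const share = metadataShare(assembledPrompt, userText);
console.log(`${(share * 100).toFixed(0)}% of the query is metadata`);
```

Short messages are hit hardest, since the prefix length stays roughly constant while the user text shrinks.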

Test plan

  • Verify Slack messages trigger auto-recall with clean query text (no Conversation info JSON in embedding input)
  • Verify non-Slack channels (Telegram, API) still work correctly (falling back to event.prompt)
  • Verify gatingText and recallQuery now use the same source for consistency

🤖 Generated with Claude Code

…l metadata noise

On Slack, `event.prompt` in the `before_prompt_build` hook includes
Conversation info JSON metadata (message_id, sender_id, conversation_label,
etc.) prepended by the platform adapter. This metadata pollutes the embedding
vector, causing irrelevant memories to score higher during auto-recall.

The `gatingText` variable (used for skip/greeting detection) was already
correctly using `lastRawUserMessage.get(cacheKey)` — the clean user text
cached during `message_received`. However, the actual `recallQuery` passed
to `retrieveWithRetry()` still used `event.prompt` directly, so the
retrieval itself was still affected by the noise.

This fix applies the same pattern to `recallQuery`: prefer the cached raw
user message, falling back to `event.prompt` for non-channel triggers or
when no cached message is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rwmjhb rwmjhb merged commit 6bd4b57 into CortexReach:master Apr 11, 2026
9 checks passed
