fix: use cached raw user message as auto-recall query by raw34 · Pull Request #579 · CortexReach/memory-lancedb-pro

raw34 · 2026-04-11T10:00:39Z

Summary

Auto-recall's recallQuery uses event.prompt directly. On Slack, event.prompt includes Conversation info JSON metadata (message_id, sender_id, conversation_label, etc.) prepended by the platform adapter.
This metadata pollutes the embedding vector, so irrelevant memories can score higher than relevant ones.
The fix uses lastRawUserMessage.get(cacheKey) || event.prompt, the same pattern already applied to gatingText (line 2245), so the retrieval query uses clean user text instead of the noisy assembled prompt.
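
The fallback pattern named above can be sketched in isolation. This is a minimal illustration, not the plugin's actual code: the `PromptBuildEvent` interface, the `resolveRecallQuery` helper, and the sample cache key are assumptions introduced here; only `lastRawUserMessage`, `cacheKey`, and `event.prompt` come from the PR text.

```typescript
// Hypothetical event shape; only the `prompt` field is taken from the PR description.
interface PromptBuildEvent {
  prompt: string;
}

// Prefer the cached raw user message; fall back to the assembled prompt.
function resolveRecallQuery(
  lastRawUserMessage: Map<string, string>,
  cacheKey: string,
  event: PromptBuildEvent,
): string {
  // `||` (rather than `??`) also falls back when the cached value is an empty string.
  return lastRawUserMessage.get(cacheKey) || event.prompt;
}

// Illustrative data: the cache key "C123" and the prompt layout are made up.
const cache = new Map<string, string>([["C123", "who owns project management?"]]);
const event: PromptBuildEvent = {
  prompt: 'Conversation info: {"message_id":"..."}\n\nwho owns project management?',
};

console.log(resolveRecallQuery(cache, "C123", event)); // cached clean text
console.log(resolveRecallQuery(cache, "unknown", event)); // no cache entry, so event.prompt
```

Because `||` treats an empty string as falsy, a cached message that was reduced to nothing (for example, a message consisting only of a bot mention) would also fall through to `event.prompt`.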

Context

The gatingText variable (used for short-message skip and greeting detection) was already correctly patched to prefer lastRawUserMessage. However, the actual recallQuery passed to retrieveWithRetry() was missed, so retrieval itself was still affected.

The lastRawUserMessage Map is populated in the message_received hook (lines 2211–2219), which strips bot mentions and caches clean user text keyed by channelId. The before_prompt_build hook derives the same cacheKey, so the cached value is available whenever a channel message preceded the prompt build.
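
As a rough sketch of the caching step described above: the helper name `cacheRawUserMessage`, the mention regex, and the choice to skip empty results are all assumptions made for illustration; the real hook at the cited lines may strip mentions differently. The regex covers only the plain `<@U...>` form of a Slack user mention.

```typescript
// Simplified matcher for Slack user mentions such as <@U0BOT1234>.
// Assumption: the plugin's actual stripping rule may be broader or narrower.
const SLACK_MENTION = /<@[A-Z0-9]+>/g;

// Strip mentions, normalize whitespace, and cache the result keyed by channelId.
function cacheRawUserMessage(
  cache: Map<string, string>,
  channelId: string,
  text: string,
): string {
  const clean = text.replace(SLACK_MENTION, " ").replace(/\s+/g, " ").trim();
  // Assumption: nothing is cached when only mentions were present.
  if (clean.length > 0) {
    cache.set(channelId, clean);
  }
  return clean;
}

const lastRawUserMessage = new Map<string, string>();
cacheRawUserMessage(lastRawUserMessage, "C123", "<@U0BOT1234> 我们团队谁负责项目管理");
console.log(lastRawUserMessage.get("C123"));
```

A later hook that derives the same key ("C123" here) can then read the clean text back out of the Map.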

Reproduction

Deploy with the Slack channel adapter.
Send a short CJK message (e.g. "我们团队谁负责项目管理", i.e. "who on our team is responsible for project management").
Observe that event.prompt in before_prompt_build begins with a Conversation info: {"message_id":"...","sender_id":"...","conversation_label":"..."} prefix.
Auto-recall returns irrelevant memories because the metadata dominates the embedding.
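
To get a feel for why a short message is swamped, the following sketch builds a prompt in the shape described in the steps above and measures how much of it is the user's own text. The metadata values and the exact prompt layout are invented for illustration (only the field names come from the PR), and character share is only a crude proxy for influence on an embedding.

```typescript
// Hypothetical metadata values; only the field names appear in the PR description.
const metadata = {
  message_id: "1712829639.000100",
  sender_id: "U0123456789",
  conversation_label: "#team-general",
};
const userText = "我们团队谁负责项目管理";

// Assumed layout: metadata prefix, blank line, then the user's text.
const assembledPrompt = `Conversation info: ${JSON.stringify(metadata)}\n\n${userText}`;

// Fraction of the prompt's UTF-16 code units that belong to the user's text.
function userTextShare(prompt: string, text: string): number {
  return text.length / prompt.length;
}

console.log(userTextShare(assembledPrompt, userText).toFixed(2));
```

With an 11-character message and a metadata prefix of this size, the user's text is a small fraction of the string handed to the embedder, which is consistent with the symptom reported in the last step.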

Test plan

Verify Slack messages trigger auto-recall with clean query text (no Conversation info JSON in the embedding input).
Verify non-Slack channels (Telegram, API) still work correctly (the query falls back to event.prompt).
Verify gatingText and recallQuery now use the same source for consistency.
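
The three checks above could be expressed as a small script along these lines. It is a sketch under assumptions: `selectTexts`, `check`, the cache-key format, and the sample prompts are hypothetical stand-ins, not the plugin's test suite or its real hook code.

```typescript
interface Texts {
  gatingText: string;
  recallQuery: string;
}

// Hypothetical helper applying the PR's pattern to both variables.
function selectTexts(cache: Map<string, string>, cacheKey: string, prompt: string): Texts {
  const clean = cache.get(cacheKey) || prompt;
  return { gatingText: clean, recallQuery: clean };
}

// Minimal assertion helper so the sketch needs no test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// Made-up cache key format and prompts, for illustration only.
const rawCache = new Map<string, string>([["slack:C123", "我们团队谁负责项目管理"]]);
const slackPrompt = 'Conversation info: {"message_id":"..."}\n\n我们团队谁负责项目管理';

// 1. Slack: the query must not carry the metadata prefix.
const slack = selectTexts(rawCache, "slack:C123", slackPrompt);
check(!slack.recallQuery.includes("Conversation info"), "Slack query contains metadata");

// 2. No cached message: the query falls back to the prompt unchanged.
const api = selectTexts(rawCache, "api:none", "plain API prompt");
check(api.recallQuery === "plain API prompt", "fallback did not return the prompt");

// 3. gatingText and recallQuery come from the same source.
check(slack.gatingText === slack.recallQuery, "gatingText and recallQuery differ");

console.log("all checks passed");
```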

🤖 Generated with Claude Code

…l metadata noise

On Slack, `event.prompt` in the `before_prompt_build` hook includes Conversation info JSON metadata (message_id, sender_id, conversation_label, etc.) prepended by the platform adapter. This metadata pollutes the embedding vector, causing irrelevant memories to score higher during auto-recall.

The `gatingText` variable (used for skip/greeting detection) was already correctly using `lastRawUserMessage.get(cacheKey)`, the clean user text cached during `message_received`. However, the actual `recallQuery` passed to `retrieveWithRetry()` still used `event.prompt` directly, so the retrieval itself was still affected by the noise.

This fix applies the same pattern to `recallQuery`: prefer the cached raw user message, falling back to `event.prompt` for non-channel triggers or when no cached message is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

rwmjhb merged commit 6bd4b57 into CortexReach:master Apr 11, 2026
9 checks passed

rwmjhb merged 1 commit into CortexReach:master from raw34:fix/auto-recall-query-noise
