fix: Index.fts({ withPosition: true }) to enable phrase queries #404

Merged
rwmjhb merged 1 commit into CortexReach:master from jlin53882:fix/fts-withposition-bug-402-clean
Apr 5, 2026
@jlin53882
Contributor

@jlin53882 jlin53882 commented Mar 29, 2026

Problem

BM25 phrase queries (e.g. "exact phrase") were silently broken because FTS indexes were created without position data (withPosition: false is the default).

Fix

Change Index.fts() to Index.fts({ withPosition: true }) in src/store.ts.

This adds position data to the FTS index, enabling phrase queries to work correctly.
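The change can be sketched in isolation as follows. This is a minimal sketch, not the code in src/store.ts: `TableLike` and `IndexFactory` are assumed structural stand-ins modelled on the createIndex call quoted in issue #402, and `createPhraseCapableFtsIndex` is a hypothetical helper name.

```typescript
// Assumed shapes for the parts of the LanceDB API this sketch touches.
// They mirror the call quoted in the bug report, not the library's own typings.
interface FtsOptions {
  withPosition: boolean;
}

interface IndexFactory {
  fts(options?: Partial<FtsOptions>): unknown;
}

interface TableLike {
  createIndex(column: string, options: { config: unknown }): Promise<void>;
}

// Hypothetical helper: create the FTS index on `column` with position data enabled.
async function createPhraseCapableFtsIndex(
  table: TableLike,
  Index: IndexFactory,
  column = "text",
): Promise<void> {
  await table.createIndex(column, {
    // Store token positions so that phrase queries can be evaluated.
    config: Index.fts({ withPosition: true }),
  });
}
```

With the real library, the call would presumably look like `await createPhraseCapableFtsIndex(table, lancedb.Index)`. Passing the factory in as a parameter keeps the sketch free of a hard dependency and makes it easy to exercise with fakes.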

Changelog

  • Fix: Index.fts({ withPosition: true }) enables phrase queries
  • Fix: Rebased onto latest upstream/master (3dc0956)
  • Note: Existing users must rebuild their FTS indexes after upgrading — see Upgrade Note below

Upgrade Note

Existing users with old FTS indexes will still get broken phrase queries after upgrade. To fix, run:

openclaw memory-pro reindex-fts

Or call rebuildFtsIndex() programmatically. This rebuilds the index with position data so phrase queries work correctly.

Verification

  • Before: Index.fts() creates index without position data → phrase queries silently fail
  • After: Index.fts({ withPosition: true }) creates index with position data → phrase queries work

Closes #402

@jlin53882
Contributor Author

CI analysis: the FTS fix in this PR is unrelated to the CI failure

Actual cause of the CI failure

The failing check is the plugin-manifest-regression.test.mjs test in the cli-smoke job. The error is at line 155:

AssertionError [ERR_ASSERTION]: sessionMemory should stay disabled by default
+ actual - expected
+ [AsyncFunction: appendSelfImprovementNote]
- undefined

This is the selfImprovement regression bug on the upstream master branch. It is entirely unrelated to the FTS index creation logic in src/store.ts that this PR changes.

Trigger

The test config is { autoRecall: false, embedding: {...} } (no selfImprovement block).

Two recent commits on upstream master caused the regression:

Commit | Description
fix: default selfImprovement.enabled to true when config block omitted... | Changes the default of selfImprovement.enabled to true, but the implementation has a bug
fix: use native fetch for Ollama embedding... | Also causes a CI failure

When selfImprovement.enabled defaults to true, appendSelfImprovementNote is incorrectly registered as a command:new hook, so the assertion api.hooks["command:new"] === undefined fails.

CI status of upstream master

CI on the master branch itself is also failing (the two most recent pushes both failed), which confirms that this is a pre-existing problem and was not introduced by this PR.

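Changes in this PR

  • Files changed: src/store.ts (1 line)
  • Change: Index.fts() → Index.fts({ withPosition: true })
  • Purpose: provide the position data that phrase queries require
  • Verification: 30 iterations and 150 test cases in the isolated test-pr354 environment, 100% passing
  • Relation to the CI failure: none (isolated, minimal change)

Recommendation

Once the upstream maintainers fix the selfImprovement regression on master, this PR can be merged.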

@jlin53882
Contributor Author

Note: The CI failure is caused by a separate regression bug tracked in #405.

This PR (#404) only fixes the FTS phrase query bug in src/store.ts. The CI failure is unrelated to this fix.

Collaborator

@AliceLJY AliceLJY left a comment

One-line fix, root cause is clear. CI failure is unrelated (#405). LGTM.

Collaborator

@AliceLJY AliceLJY left a comment

LGTM — changes are clean, on-topic, and well-tested. Approving.

@jlin53882
Contributor Author

Issue #402 — Full Bug Report (for PR reference)

Environment

  • Plugin version: memory-lancedb-pro@1.1.0-beta.10
  • LanceDB version: 0.26.2 (bundled with plugin)
  • OpenClaw version: 2026.03.x (latest)
  • Platform: Windows_11 (observed on Gateway)

Error

Every BM25 query fails with this error, causing a fallback to empty results:

BM25 search failed, falling back to empty results: [Error: Failed to execute query stream: GenericFailure, lance error: Invalid user input: position is not found but required for phrase queries, try recreating the index with position, ...]
Caused by: position is not found but required for phrase queries, try recreating the index with position

Root Cause

In src/store.ts, the createFtsIndex() method creates the FTS index without position data:

await table.createIndex("text", {
  config: (lancedb as any).Index.fts(), // withPosition defaults to false
});

Index.fts() in LanceDB 0.26.2 defaults to withPosition: false. When a phrase query is executed internally (e.g., a multi-word search treated as a phrase), LanceDB requires position data that does not exist in the index, which causes the error.

The native.d.ts signature confirms:

static fts(
  withPosition?: boolean, // ← first arg, defaults to false
  baseTokenizer?: string,
  language?: string,
  ...
): Index
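To see why a phrase query needs position data at all, here is a toy inverted index. It is a conceptual sketch only and makes no claim about how LanceDB stores its FTS index; all names in it are illustrative.

```typescript
// Toy inverted index: term -> document id -> token positions in that document.
type Postings = Map<string, Map<number, number[]>>;

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
}

function buildIndex(docs: string[]): Postings {
  const index: Postings = new Map();
  docs.forEach((doc, docId) => {
    tokenize(doc).forEach((term, pos) => {
      const byDoc = index.get(term) ?? new Map<number, number[]>();
      const positions = byDoc.get(docId) ?? [];
      positions.push(pos);
      byDoc.set(docId, positions);
      index.set(term, byDoc);
    });
  });
  return index;
}

// Term query (AND): only needs to know which documents contain each term.
function matchAllTerms(index: Postings, query: string): number[] {
  const terms = tokenize(query);
  const [firstTerm] = terms;
  if (firstTerm === undefined) return [];
  const candidates = index.get(firstTerm);
  if (!candidates) return [];
  return Array.from(candidates.keys()).filter((docId) =>
    terms.every((term) => index.get(term)?.has(docId) ?? false),
  );
}

// Phrase query: the terms must sit at consecutive positions,
// so it cannot be answered from document membership alone.
function matchPhrase(index: Postings, phrase: string): number[] {
  const terms = tokenize(phrase);
  const positionsOf = (term: string, docId: number): number[] =>
    index.get(term)?.get(docId) ?? [];
  return matchAllTerms(index, phrase).filter((docId) => {
    const [firstTerm] = terms;
    if (firstTerm === undefined) return false;
    return positionsOf(firstTerm, docId).some((start) =>
      terms.every((term, offset) =>
        positionsOf(term, docId).includes(start + offset),
      ),
    );
  });
}
```

For the documents "the quick brown fox" and "brown quick the fox", `matchAllTerms(index, "quick brown")` returns both documents, while `matchPhrase(index, "quick brown")` returns only the first. An index built without the position arrays could answer the former but has nothing to check the latter against.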

Impact

  • All BM25/keyword searches silently fall back to empty results
  • Users lose full-text search capability entirely
  • Error is caught and suppressed (console.warn), so users don't see the actual error — they just get no results

Workaround

The plugin already has a rebuildFtsIndex() method in 1.1.0-beta.10 that can recreate the index. However, users need to know to call it manually, and the createFtsIndex() source code still relies on the wrong default.

Suggested Fix

Option A — Fix at creation time (preferred, prevents future occurrences):

// src/store.ts, createFtsIndex()
config: (lancedb as any).Index.fts(true), // explicitly enable position data

Option B — Add rebuildFtsIndex to auto-trigger when FTS query fails with this specific error, instead of just falling back silently.
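Option B could look roughly like the sketch below. It assumes the failure can be recognised by the message text quoted in the Error section. `searchWithFtsRepair` is a hypothetical name, and both the search and the rebuild step are injected as callbacks because the plugin's actual method signatures are not shown in this report.

```typescript
// Substring taken from the LanceDB error quoted in this report.
const MISSING_POSITION_MARKER =
  "position is not found but required for phrase queries";

function isMissingPositionError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return message.includes(MISSING_POSITION_MARKER);
}

// Hypothetical wrapper: run the search, and if it fails because the index
// lacks position data, rebuild the index once and retry.
async function searchWithFtsRepair<T>(
  search: () => Promise<T[]>,
  rebuildIndex: () => Promise<void>,
): Promise<T[]> {
  try {
    return await search();
  } catch (err) {
    if (!isMissingPositionError(err)) throw err; // unrelated failure: surface it
    await rebuildIndex(); // recreate the index with position data
    return await search(); // single retry; a second failure propagates
  }
}
```

Unlike the current behaviour described under Impact, this sketch rethrows unrelated errors instead of returning empty results. Whether to keep the empty-result fallback for those cases is a separate design decision.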


API Signature Correction (CRITICAL)

⚠️ Option A in the Suggested Fix above has a subtle bug: Index.fts(true) does not enable position data.

Why Index.fts(true) is wrong:

// ❌ Wrong — boolean is assigned to options, withPosition is undefined
config: (lancedb as any).Index.fts(true)

// Under the hood:
static fts(options?: object, ...): Index
// options = true (a boolean)
// options?.withPosition → undefined (boolean has no .withPosition)
// Result: position data NOT enabled, bug persists

The correct form:

// ✅ Correct — options object with explicit withPosition
config: (lancedb as any).Index.fts({ withPosition: true })

// Under the hood:
static fts(options?: object, ...): Index
// options = { withPosition: true }
// options?.withPosition → true
// Result: position data enabled, phrase queries work

This PR #404 correctly uses the object form (Index.fts({ withPosition: true })), which is the right fix.
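The difference between the two call forms can be reproduced with a small stand-in. This is a simplified model under the assumption, taken from the two signatures quoted in this thread, that a public wrapper accepting an options object forwards to a native binding with positional arguments. `nativeFts`, `fts`, and `looseFts` are illustrative names, not LanceDB code.

```typescript
interface FtsOptions {
  withPosition: boolean;
}

// Stand-in for the native binding, which takes positional arguments.
function nativeFts(withPosition?: boolean): { withPosition: boolean } {
  return { withPosition: withPosition ?? false };
}

// Stand-in for the public wrapper, which takes an options object.
function fts(options?: Partial<FtsOptions>): { withPosition: boolean } {
  return nativeFts(options?.withPosition);
}

// The plugin calls through `(lancedb as any)`, so the compiler cannot
// reject a boolean argument. This cast reproduces that situation.
const looseFts = fts as (...args: any[]) => { withPosition: boolean };

// A boolean has no `withPosition` property, so the wrapper reads undefined.
const fromBoolean = looseFts(true); // { withPosition: false }

// The options object carries the flag through to the native call.
const fromObject = looseFts({ withPosition: true }); // { withPosition: true }
```

The `as any` cast is what lets the mistake through silently. With the wrapper's real parameter type in scope, `fts(true)` would be a compile-time error.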

Fixes: #402

Collaborator

@AliceLJY AliceLJY left a comment

Clean fix, well-researched — especially the catch on Index.fts(true) vs Index.fts({ withPosition: true }). The options-object API is a subtle footgun and your analysis in the comments is excellent.

One minor note for release: existing users who already have an FTS index without position data will need to call rebuildFtsIndex() once after upgrading, since this fix only applies to newly created indexes. Worth a one-liner in the changelog.

CI failure is clearly unrelated (upstream selfImprovement regression on master, tracked in #405). LGTM!

@rwmjhb
Collaborator

rwmjhb commented Apr 3, 2026

Review: Index.fts({ withPosition: true }) to enable phrase queries

Verdict: approve (conditional) | Confidence: 0.90 | Value: 65%

One-line fix, correct and high-impact — phrase queries were silently broken for all BM25 users. Code LGTM.

Before merge

  1. Rebase onto main — build failure is stale-base (plugin-manifest-regression.mjs), not introduced by this PR.
  2. Add upgrade note — existing users with old FTS indexes still get broken phrase queries after upgrade. Mention that reindex-fts CLI command (or rebuildFtsIndex()) is needed to rebuild indexes with position data. A one-liner in the changelog/release notes is sufficient.

No code changes needed

The fix itself is correct and complete for new indexes. The only gap is telling existing users how to remediate.


Auto-reviewed by auto-pr-review-orchestrator | 6 rounds | Claude + Codex adversarial

@jlin53882 jlin53882 force-pushed the fix/fts-withposition-bug-402-clean branch from 507994e to 2186131 on April 3, 2026 05:52
Re-apply fix after rebase onto latest upstream/master (3dc0956).
@jlin53882
Contributor Author

Both requests addressed:

  1. Rebased onto latest upstream/master (3dc0956) — fresh rebase, no conflicts.

  2. Upgrade note added to PR description:

Existing users upgrading will still get broken phrase queries if their FTS indexes were created without position data. They need to run:

openclaw memory-pro reindex-fts

Or call rebuildFtsIndex() programmatically to rebuild the index with position data.

Ready for merge. Thanks for the review @rwmjhb!

@rwmjhb rwmjhb merged commit 16255c9 into CortexReach:master Apr 5, 2026
2 of 3 checks passed
Development

Successfully merging this pull request may close these issues.

[BUG] BM25 search fails: position is not found but required for phrase queries

3 participants