fix: recall exact slug match as search candidate#976
fix: recall exact slug match as search candidate#976oolong-tea-2026 wants to merge 2 commits intoopenclaw:mainfrom
Conversation
|
@oolong-tea-2026 is attempting to deploy a commit to the 0xBuns Team on Vercel. A member of the Team first needs to authorize it. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8380fe4072
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
convex/search.ts
Outdated
| nonSuspiciousOnly: args.nonSuspiciousOnly, | ||
| })) as SkillSearchEntry | null | ||
| if (exactSlugEntry) { | ||
| mergedMatches = [exactSlugEntry, ...mergedMatches] |
There was a problem hiding this comment.
Force exact slug hit to survive final top-N truncation
Prepending exactSlugEntry here does not guarantee recall, because the entry is still re-scored/sorted with the rest and can be dropped by .slice(0, limit) when many other matches score higher (especially since this path returns no embeddingId, so vectorScore becomes 0). In small-limit searches (e.g., limit=1) with several strong token-overlap results, the exact slug can still disappear from the returned page, which defeats the stated "always recalled" behavior.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch on the ranking concern. This fix intentionally does not force a specific rank — it ensures the exact slug match enters the candidate pool, then participates in normal scoring. With SLUG_EXACT_BOOST = 1.4, the recalled skill will very likely rank first, but the fix respects the existing scoring rules rather than overriding them.
Updated the PR title and commit message to clarify this is about recall, not guaranteed ranking.
Greptile SummaryThis PR fixes a search recall bug where a skill whose exact slug matches the query could be missing from the first page of results. The root cause is that when Key points from the review:
Confidence Score: 4/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: convex/search.test.ts
Line: 35-39
Comment:
**`lookupExactSlugHandler` declared but never called directly**
`lookupExactSlugHandler` is extracted and typed at module scope but is never invoked in any test — the new tests exercise `lookupExactSlug` only indirectly through `searchSkillsHandler`. This will likely trigger an `no-unused-vars` / `@typescript-eslint/no-unused-vars` lint warning.
Either call it in a dedicated unit test (e.g., testing `nonSuspiciousOnly`/`highlightedOnly` filtering inside `lookupExactSlug` directly), or remove the declaration and rely solely on the integration coverage provided by the `searchSkillsHandler` tests.
```suggestion
// lookupExactSlugHandler is tested indirectly through searchSkillsHandler
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 8380fe4 |
convex/search.test.ts
Outdated
| const lookupExactSlugHandler = ( | ||
| lookupExactSlug as unknown as { | ||
| _handler: (ctx: unknown, args: unknown) => Promise<SkillSearchEntry | null> | ||
| } | ||
| )._handler |
There was a problem hiding this comment.
lookupExactSlugHandler declared but never called directly
lookupExactSlugHandler is extracted and typed at module scope but is never invoked in any test — the new tests exercise lookupExactSlug only indirectly through searchSkillsHandler. This will likely trigger an no-unused-vars / @typescript-eslint/no-unused-vars lint warning.
Either call it in a dedicated unit test (e.g., testing nonSuspiciousOnly/highlightedOnly filtering inside lookupExactSlug directly), or remove the declaration and rely solely on the integration coverage provided by the searchSkillsHandler tests.
| const lookupExactSlugHandler = ( | |
| lookupExactSlug as unknown as { | |
| _handler: (ctx: unknown, args: unknown) => Promise<SkillSearchEntry | null> | |
| } | |
| )._handler | |
| // lookupExactSlugHandler is tested indirectly through searchSkillsHandler |
Prompt To Fix With AI
This is a comment left during a code review.
Path: convex/search.test.ts
Line: 35-39
Comment:
**`lookupExactSlugHandler` declared but never called directly**
`lookupExactSlugHandler` is extracted and typed at module scope but is never invoked in any test — the new tests exercise `lookupExactSlug` only indirectly through `searchSkillsHandler`. This will likely trigger an `no-unused-vars` / `@typescript-eslint/no-unused-vars` lint warning.
Either call it in a dedicated unit test (e.g., testing `nonSuspiciousOnly`/`highlightedOnly` filtering inside `lookupExactSlug` directly), or remove the declaration and rely solely on the integration coverage provided by the `searchSkillsHandler` tests.
```suggestion
// lookupExactSlugHandler is tested indirectly through searchSkillsHandler
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Thanks — removed the unused lookupExactSlugHandler declaration in the latest force-push. lookupExactSlug is now only tested indirectly through searchSkillsHandler, which covers the actual integration path.
df87b72 to
6e66ca6
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6e66ca6be2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
convex/search.ts
Outdated
| nonSuspiciousOnly: args.nonSuspiciousOnly, | ||
| })) as SkillSearchEntry | null | ||
| if (exactSlugEntry) { | ||
| mergedMatches = [exactSlugEntry, ...mergedMatches] |
There was a problem hiding this comment.
Guarantee exact slug candidate cannot be trimmed from top-N
Prepending exactSlugEntry here does not make recall deterministic because the entry is still re-ranked and truncated by the final .sort(...).slice(0, limit) pass; this path also gives it vectorScore = 0 (no embeddingId), so it can lose to other candidates with stronger vector/popularity scores. In small-limit searches (for example limit=1 with a short slug query like foo, where many other results receive similar lexical boosts), the exact slug can still be dropped from the returned page, so the new "always recalled" guarantee is not actually enforced.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
To clarify the terminology: "recall" here refers to the candidate retrieval stage of the search pipeline (vector search → exactMatches filter → lexicalFallback → slug lookup), not the final user-facing results. The fix ensures the exact slug skill enters the candidate pool before scoring — it does not guarantee it appears in the final top-N.
The final ranking is entirely up to the existing scoreSkillResult logic. This is by design: we want fair scoring, not a forced override.
Also removed an unrelated .gitignore change from the commit.
When searching for a skill by its exact slug (e.g. 'ima-all-ai'), the result could be missing from the candidate pool because: 1. The vector search candidate pool (75-256) might not include the skill 2. matchesExactTokens uses `.some()` (by design, see openclaw#27), so unrelated skills with common tokens like 'ai' inflate exactMatches count 3. When exactMatches >= limit, lexicalFallbackSkills is skipped entirely, along with its exact slug lookup via the by_slug index Fix: after all recall stages (vector + exactMatches + lexicalFallback), reconstruct the candidate slug from query tokens and check if it exists in merged results. If missing, perform a single O(1) index lookup via the new `lookupExactSlug` internalQuery to ensure it enters the candidate pool. It then participates in normal scoring and ranking — SLUG_EXACT_BOOST (1.4) makes it very likely to rank first, but the fix does not force any particular ranking. Co-Authored-By: 戴硕 <daishuo@gmail.com>
6e66ca6 to
c4660ff
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c39848cdc8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Ensure an exact slug match is always recalled, even when the vector | ||
| // candidate pool and lexical fallback both missed it. The candidate | ||
| // slug is reconstructed from query tokens (e.g. "ima all ai" → "ima-all-ai"). | ||
| const candidateSlug = queryTokens.join("-"); |
There was a problem hiding this comment.
Preserve raw slug spelling when recalling exact slugs
Reconstructing the lookup key with queryTokens.join("-") loses valid slug spellings that publishVersionForUser still accepts in convex/lib/skillPublish.ts:85-89, notably repeated or trailing hyphens. For example, searching the exact slug foo--bar produces candidateSlug === "foo-bar", and foo- produces "foo", so this new fallback still misses the intended skill whenever the vector path saturates exactMatches and skips lexicalFallbackSkills. That means the fix does not actually restore exact-slug recall for part of the slug space the app allows today.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Slugs like foo--bar or foo- are technically accepted by the validation regex, but that is arguably a validation bug — not a recall bug. We sampled ~700 real slugs on ClawHub and found zero instances of double hyphens, trailing hyphens, or any pattern where tokenize().join("-") diverges from the stored slug. Not fixing this here.
| const candidateSlug = queryTokens.join("-"); | ||
| if ( | ||
| /^[a-z0-9][a-z0-9-]*$/.test(candidateSlug) && | ||
| !mergedMatches.some((e) => e.skill.slug === candidateSlug) |
There was a problem hiding this comment.
Gate the fallback lookup on the raw query being slug-shaped
This condition is effectively true for every non-empty search, because tokenize() in convex/lib/searchText.ts:1-10 only emits [a-z0-9]+ tokens and queryTokens.join("-") therefore always matches the slug regex. In practice, ordinary queries like weather forecast now incur an extra skills.by_slug lookup unless the merged results already contain weather-forecast, so the hot search path pays an additional Convex read on nearly every request instead of only on actual slug lookups.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
This is intentional. A user searching weather forecast (with a space) should recall the skill with slug weather-forecast — that is the whole point of this fix. The by_slug index lookup is O(1) and only fires when the slug is absent from merged results, so the cost is negligible.
Problem
When searching for a skill by its exact slug (e.g.
ima-all-ai), the skill can be missing from the first page of results.Root Cause
The search pipeline has three recall stages:
matchesExactTokensby_slugindex lookupThe exact slug skill can be missed when:
matchesExactTokensuses.some()(by design per #27), so common tokens like"ai"inflate the exactMatches countexactMatches.length >= limit,lexicalFallbackSkillsis skipped entirely — along with itsby_slugindex lookup that would have found the skillOn "Load more" (which increases
limitand re-searches), the larger candidate pool catches the skill, causing it to appear at the top of the list — confusing the user.Reproduction
ima-all-ai(an exact slug)ima-all-aiima-all-ainow appears at position Add nix-clawdbot plugin pointers to skill metadata #1Fix
After all existing recall stages, reconstruct the candidate slug from query tokens (
queryTokens.join("-")) and check if it exists in merged results. If missing, perform a single O(1) index lookup via the newlookupExactSluginternalQuery.This is a minimal, additive change:
matchesExactTokens, andlexicalFallbackSkillsremain untouchedSLUG_EXACT_BOOST+NAME_EXACT_BOOSTnaturally rank it firstChanges
convex/search.ts: AddlookupExactSluginternalQuery + post-merge slug recall check insearchSkillsconvex/search.test.ts: Add 2 new tests (recall when missed + skip when already present) + update existing test mocksTests
All 18 tests pass (including 2 new ones):
recalls exact slug match even when vector search and lexical fallback both miss itskips exact slug lookup when slug is already in merged results