fix: recall exact slug match as search candidate by oolong-tea-2026 · Pull Request #976 · openclaw/clawhub

oolong-tea-2026 · 2026-03-17T12:03:49Z

Problem

When searching for a skill by its exact slug (e.g. ima-all-ai), the skill can be missing from the first page of results.

Root Cause

The search pipeline has three recall stages:

Vector search — semantic similarity, candidate pool of 75-256
exactMatches — filter vector candidates using matchesExactTokens
lexicalFallbackSkills — includes an exact by_slug index lookup

The exact slug skill can be missed when:

The vector candidate pool is too small to include it (step 1)
matchesExactTokens uses .some() (by design per #27), so common tokens like "ai" inflate the exactMatches count
When exactMatches.length >= limit, lexicalFallbackSkills is skipped entirely — along with its by_slug index lookup that would have found the skill

On "Load more" (which increases limit and re-searches), the larger candidate pool catches the skill, causing it to appear at the top of the list — confusing the user.

Reproduction

Go to https://clawhub.ai/skills
Search for ima-all-ai (an exact slug)
First page of results does not contain ima-all-ai
Scroll down to trigger "Loading more"
ima-all-ai now appears at position Add nix-clawdbot plugin pointers to skill metadata #1

Fix

After all existing recall stages, reconstruct the candidate slug from query tokens (queryTokens.join("-")) and check if it exists in merged results. If missing, perform a single O(1) index lookup via the new lookupExactSlug internalQuery.

This is a minimal, additive change:

No existing logic is modified — vector search, matchesExactTokens, and lexicalFallbackSkills remain untouched
The lookup is only triggered when the slug is missing from results (vast majority of searches skip it)
The recalled skill goes through normal scoring, so SLUG_EXACT_BOOST + NAME_EXACT_BOOST naturally rank it first

Changes

convex/search.ts: Add lookupExactSlug internalQuery + post-merge slug recall check in searchSkills
convex/search.test.ts: Add 2 new tests (recall when missed + skip when already present) + update existing test mocks

Tests

All 18 tests pass (including 2 new ones):

recalls exact slug match even when vector search and lexical fallback both miss it
skips exact slug lookup when slug is already in merged results

vercel · 2026-03-17T12:03:54Z

@oolong-tea-2026 is attempting to deploy a commit to the 0xBuns Team on Vercel.

A member of the Team first needs to authorize it.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8380fe4072

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-17T12:07:37Z

convex/search.ts

+        nonSuspiciousOnly: args.nonSuspiciousOnly,
+      })) as SkillSearchEntry | null
+      if (exactSlugEntry) {
+        mergedMatches = [exactSlugEntry, ...mergedMatches]


Force exact slug hit to survive final top-N truncation

Prepending exactSlugEntry here does not guarantee recall, because the entry is still re-scored/sorted with the rest and can be dropped by .slice(0, limit) when many other matches score higher (especially since this path returns no embeddingId, so vectorScore becomes 0). In small-limit searches (e.g., limit=1) with several strong token-overlap results, the exact slug can still disappear from the returned page, which defeats the stated "always recalled" behavior.

Useful? React with 👍 / 👎.

Good catch on the ranking concern. This fix intentionally does not force a specific rank — it ensures the exact slug match enters the candidate pool, then participates in normal scoring. With SLUG_EXACT_BOOST = 1.4, the recalled skill will very likely rank first, but the fix respects the existing scoring rules rather than overriding them.

Updated the PR title and commit message to clarify this is about recall, not guaranteed ranking.

greptile-apps · 2026-03-17T12:08:06Z

Greptile Summary

This PR fixes a search recall bug where a skill whose exact slug matches the query could be missing from the first page of results. The root cause is that when exactMatches.length >= limit the lexicalFallbackSkills path (which includes a by_slug lookup) is skipped entirely. The fix adds a minimal, additive post-merge check: it reconstructs the candidate slug from queryTokens.join('-'), and if that slug is absent from the merged candidate pool, fires a single O(1) by_slug index lookup via the new lookupExactSlug internalQuery. The recalled skill then flows through the existing scoring pipeline where SLUG_EXACT_BOOST + NAME_EXACT_BOOST naturally rank it first.

Key points from the review:

Logic is correct: the early-out guard (mergedMatches.some(e => e.skill.slug === candidateSlug)) prevents both duplicate entries and unnecessary lookups when the skill is already in results.
Filters are correctly re-applied: lookupExactSlug re-checks softDeletedAt, nonSuspiciousOnly, highlightedOnly, and owner existence — consistent with all other recall paths.
Scoring is sound: entries returned by lookupExactSlug have no embeddingId, so vectorScore defaults to 0; for an exact slug query the lexical boosts alone (~2.5+) comfortably outrank typical vector-only scores (~0.5).
Minor test concern: lookupExactSlugHandler is extracted at module scope in search.test.ts but never directly invoked — it may trigger an unused-variable lint warning.

Confidence Score: 4/5

This PR is safe to merge; the change is additive and correctly scoped to the missing-slug recall edge case.
The core logic is sound: the early-out check prevents duplicate entries and redundant queries, filters are consistently re-applied in lookupExactSlug, and ranking relies on the proven existing scoring pipeline. The two new tests directly cover the fixed scenario and the skip-when-present optimisation. The only finding is a minor unused-variable concern in the test file that does not affect production behaviour.
No files require special attention beyond the minor unused-variable lint issue in convex/search.test.ts.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: convex/search.test.ts
Line: 35-39

Comment:
**`lookupExactSlugHandler` declared but never called directly**

`lookupExactSlugHandler` is extracted and typed at module scope but is never invoked in any test — the new tests exercise `lookupExactSlug` only indirectly through `searchSkillsHandler`. This will likely trigger an `no-unused-vars` / `@typescript-eslint/no-unused-vars` lint warning.

Either call it in a dedicated unit test (e.g., testing `nonSuspiciousOnly`/`highlightedOnly` filtering inside `lookupExactSlug` directly), or remove the declaration and rely solely on the integration coverage provided by the `searchSkillsHandler` tests.

```suggestion
// lookupExactSlugHandler is tested indirectly through searchSkillsHandler
```

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: 8380fe4}

greptile-apps · 2026-03-17T12:08:09Z

convex/search.test.ts

+const lookupExactSlugHandler = (
+  lookupExactSlug as unknown as {
+    _handler: (ctx: unknown, args: unknown) => Promise<SkillSearchEntry | null>
+  }
+)._handler


lookupExactSlugHandler declared but never called directly

lookupExactSlugHandler is extracted and typed at module scope but is never invoked in any test — the new tests exercise lookupExactSlug only indirectly through searchSkillsHandler. This will likely trigger an no-unused-vars / @typescript-eslint/no-unused-vars lint warning.

Either call it in a dedicated unit test (e.g., testing nonSuspiciousOnly/highlightedOnly filtering inside lookupExactSlug directly), or remove the declaration and rely solely on the integration coverage provided by the searchSkillsHandler tests.

Suggested change

const lookupExactSlugHandler = (

lookupExactSlug as unknown as {

_handler: (ctx: unknown, args: unknown) => Promise<SkillSearchEntry | null>

}

)._handler

// lookupExactSlugHandler is tested indirectly through searchSkillsHandler

Prompt To Fix With AI

This is a comment left during a code review. Path: convex/search.test.ts Line: 35-39 Comment: **`lookupExactSlugHandler` declared but never called directly** `lookupExactSlugHandler` is extracted and typed at module scope but is never invoked in any test — the new tests exercise `lookupExactSlug` only indirectly through `searchSkillsHandler`. This will likely trigger an `no-unused-vars` / `@typescript-eslint/no-unused-vars` lint warning. Either call it in a dedicated unit test (e.g., testing `nonSuspiciousOnly`/`highlightedOnly` filtering inside `lookupExactSlug` directly), or remove the declaration and rely solely on the integration coverage provided by the `searchSkillsHandler` tests. ```suggestion // lookupExactSlugHandler is tested indirectly through searchSkillsHandler ``` How can I resolve this? If you propose a fix, please make it concise.

Thanks — removed the unused lookupExactSlugHandler declaration in the latest force-push. lookupExactSlug is now only tested indirectly through searchSkillsHandler, which covers the actual integration path.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e66ca6be2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-17T12:34:41Z

convex/search.ts

+        nonSuspiciousOnly: args.nonSuspiciousOnly,
+      })) as SkillSearchEntry | null
+      if (exactSlugEntry) {
+        mergedMatches = [exactSlugEntry, ...mergedMatches]


Guarantee exact slug candidate cannot be trimmed from top-N

Prepending exactSlugEntry here does not make recall deterministic because the entry is still re-ranked and truncated by the final .sort(...).slice(0, limit) pass; this path also gives it vectorScore = 0 (no embeddingId), so it can lose to other candidates with stronger vector/popularity scores. In small-limit searches (for example limit=1 with a short slug query like foo, where many other results receive similar lexical boosts), the exact slug can still be dropped from the returned page, so the new "always recalled" guarantee is not actually enforced.

Useful? React with 👍 / 👎.

To clarify the terminology: "recall" here refers to the candidate retrieval stage of the search pipeline (vector search → exactMatches filter → lexicalFallback → slug lookup), not the final user-facing results. The fix ensures the exact slug skill enters the candidate pool before scoring — it does not guarantee it appears in the final top-N.

The final ranking is entirely up to the existing scoreSkillResult logic. This is by design: we want fair scoring, not a forced override.

Also removed an unrelated .gitignore change from the commit.

When searching for a skill by its exact slug (e.g. 'ima-all-ai'), the result could be missing from the candidate pool because: 1. The vector search candidate pool (75-256) might not include the skill 2. matchesExactTokens uses `.some()` (by design, see openclaw#27), so unrelated skills with common tokens like 'ai' inflate exactMatches count 3. When exactMatches >= limit, lexicalFallbackSkills is skipped entirely, along with its exact slug lookup via the by_slug index Fix: after all recall stages (vector + exactMatches + lexicalFallback), reconstruct the candidate slug from query tokens and check if it exists in merged results. If missing, perform a single O(1) index lookup via the new `lookupExactSlug` internalQuery to ensure it enters the candidate pool. It then participates in normal scoring and ranking — SLUG_EXACT_BOOST (1.4) makes it very likely to rank first, but the fix does not force any particular ranking. Co-Authored-By: 戴硕 <daishuo@gmail.com>

…lug recall)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c39848cdc8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-20T07:30:14Z

convex/search.ts

+    // Ensure an exact slug match is always recalled, even when the vector
+    // candidate pool and lexical fallback both missed it.  The candidate
+    // slug is reconstructed from query tokens (e.g. "ima all ai" → "ima-all-ai").
+    const candidateSlug = queryTokens.join("-");


Preserve raw slug spelling when recalling exact slugs

Reconstructing the lookup key with queryTokens.join("-") loses valid slug spellings that publishVersionForUser still accepts in convex/lib/skillPublish.ts:85-89, notably repeated or trailing hyphens. For example, searching the exact slug foo--bar produces candidateSlug === "foo-bar", and foo- produces "foo", so this new fallback still misses the intended skill whenever the vector path saturates exactMatches and skips lexicalFallbackSkills. That means the fix does not actually restore exact-slug recall for part of the slug space the app allows today.

Useful? React with 👍 / 👎.

Slugs like foo--bar or foo- are technically accepted by the validation regex, but that is arguably a validation bug — not a recall bug. We sampled ~700 real slugs on ClawHub and found zero instances of double hyphens, trailing hyphens, or any pattern where tokenize().join("-") diverges from the stored slug. Not fixing this here.

chatgpt-codex-connector · 2026-03-20T07:30:14Z

convex/search.ts

+    const candidateSlug = queryTokens.join("-");
+    if (
+      /^[a-z0-9][a-z0-9-]*$/.test(candidateSlug) &&
+      !mergedMatches.some((e) => e.skill.slug === candidateSlug)


Gate the fallback lookup on the raw query being slug-shaped

This condition is effectively true for every non-empty search, because tokenize() in convex/lib/searchText.ts:1-10 only emits [a-z0-9]+ tokens and queryTokens.join("-") therefore always matches the slug regex. In practice, ordinary queries like weather forecast now incur an extra skills.by_slug lookup unless the merged results already contain weather-forecast, so the hot search path pays an additional Convex read on nearly every request instead of only on actual slug lookups.

Useful? React with 👍 / 👎.

This is intentional. A user searching weather forecast (with a space) should recall the skill with slug weather-forecast — that is the whole point of this fix. The by_slug index lookup is O(1) and only fires when the slug is absent from merged results, so the cost is negligible.

chatgpt-codex-connector bot reviewed Mar 17, 2026

View reviewed changes

greptile-apps bot reviewed Mar 17, 2026

View reviewed changes

oolong-tea-2026 force-pushed the fix/exact-slug-recall branch 2 times, most recently from df87b72 to 6e66ca6 Compare March 17, 2026 12:29

oolong-tea-2026 changed the title ~~fix: ensure exact slug match is always recalled in search~~ fix: recall exact slug match as search candidate Mar 17, 2026

chatgpt-codex-connector bot reviewed Mar 17, 2026

View reviewed changes

oolong-tea-2026 force-pushed the fix/exact-slug-recall branch from 6e66ca6 to c4660ff Compare March 17, 2026 12:43

merge: resolve conflicts with upstream main (formatting + our exact s…

c39848c

…lug recall)

chatgpt-codex-connector bot reviewed Mar 20, 2026

View reviewed changes

Uh oh!

Conversation

oolong-tea-2026 commented Mar 17, 2026

Problem

Root Cause

Reproduction

Fix

Changes

Tests

Uh oh!

vercel bot commented Mar 17, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

oolong-tea-2026 Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Mar 17, 2026

Greptile Summary

Confidence Score: 4/5

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

oolong-tea-2026 Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

oolong-tea-2026 Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

oolong-tea-2026 Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

oolong-tea-2026 Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant