search(filter=...) post-filters after pagination — NodeFilter not pushed into run_search #353

Description

@HumanBean17

Summary

search with a NodeFilter (role/module/microservice/capability/role_in/exclude_roles/capability_in) applies the filter only as a post-filter on rows already sliced by offset/limit, instead of pushing the predicates into the LanceDB query.

Evidence (confirmed)

mcp_v2.py:950: the run_search(...) call passes query, uri, table_keys, hybrid, limit, offset, path_substring=path_contains, model_name, device, and model, but none of the NodeFilter structural fields. Those fields are applied afterward at :966-969 via _node_matches_filter on the already-paginated rows. (path_contains IS pushed down as path_substring, so the asymmetry is accidental.) run_search already accepts these params; they are simply not supplied.

Impact

search(query=…, filter={"role":"SERVICE"}, limit=5) fetches the top-5 globally-ranked chunks then drops non-SERVICE ones → may return 0–2 results even when many SERVICE chunks exist deeper in the ranking. Paginating a filtered search yields shrinking/inconsistent pages. The search description says filter "uses the same NodeFilter schema as find," implying real filtering.
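To make the failure mode concrete, here is a minimal toy reproduction that does not touch the real code. The `Row` type, the `RANKED` list, and both helper functions are hypothetical stand-ins; the only thing taken from this issue is the ordering of the two steps (slice by offset/limit, then filter, versus filter, then slice).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    chunk_id: int
    role: str


# Hypothetical global ranking, best match first.
# Only every 4th row is a SERVICE chunk (ids 3, 7, 11, ...).
RANKED = [Row(i, "SERVICE" if i % 4 == 3 else "CONTROLLER") for i in range(40)]


def paginate_then_filter(role: str, limit: int, offset: int = 0) -> list[Row]:
    """Order described in this issue: slice first, filter the slice afterward."""
    page = RANKED[offset:offset + limit]
    return [r for r in page if r.role == role]


def filter_then_paginate(role: str, limit: int, offset: int = 0) -> list[Row]:
    """Order a caller would expect: filter the ranking, then slice."""
    matching = [r for r in RANKED if r.role == role]
    return matching[offset:offset + limit]


post = paginate_then_filter("SERVICE", limit=5)
pushed = filter_then_paginate("SERVICE", limit=5)

print([r.chunk_id for r in post])    # one SERVICE row survives the slice
print([r.chunk_id for r in pushed])  # a full page of five SERVICE rows
```

In this toy ranking the post-filtered page holds a single row while ten SERVICE rows exist, and every later page shrinks in the same way.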

Suggested fix

Forward role/module/microservice/capability/role_in/exclude_roles/capability_in from the NodeFilter into run_search(...), or over-fetch (like find's max(500, …)) before post-filtering.
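A rough sketch of both options, under stated assumptions. The field names come from this issue, but I have not checked that run_search uses exactly these keyword names, so `pushdown_kwargs` is a hypothetical helper. For the over-fetch fallback, `fetch` and `matches` are hypothetical callables standing in for run_search and _node_matches_filter, and the window size `max(floor, offset + limit)` is my guess, not find's actual expression.

```python
from typing import Any, Callable, Mapping, Optional, Sequence

# Structural NodeFilter fields named in this issue. Assumption: run_search
# accepts keyword arguments with these same names.
STRUCTURAL_FIELDS = (
    "role", "module", "microservice", "capability",
    "role_in", "exclude_roles", "capability_in",
)


def pushdown_kwargs(node_filter: Optional[Mapping[str, Any]]) -> dict:
    """Collect the structural fields that are set, ready to forward as kwargs."""
    if not node_filter:
        return {}
    return {
        name: node_filter[name]
        for name in STRUCTURAL_FIELDS
        if node_filter.get(name) is not None
    }


def overfetch_then_filter(
    fetch: Callable[..., Sequence[Any]],
    matches: Callable[[Any], bool],
    limit: int,
    offset: int = 0,
    floor: int = 500,
) -> list:
    """Fallback: fetch a wide window from rank 0, filter it, then paginate."""
    window = max(floor, offset + limit)  # assumed window size
    rows = fetch(limit=window, offset=0)
    kept = [row for row in rows if matches(row)]
    return kept[offset:offset + limit]
```

With the first option the call site would become something like `run_search(..., **pushdown_kwargs(filter))`. The over-fetch variant still truncates results when matching rows sit beyond the fetched window, so pushing the predicates down is the more robust of the two.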

Labels: bug
