@rolandwalker
Contributor

Description

  • Rename parameter orig_text and don't overwrite text.
  • Limit regex fuzzy match to 3-character intervening spans for performance and readability of results.
  • Add underscore-split match. If the underscore-split words in the text are a subset of the underscore-split words in the completion item, we have a match.
  • Add CamelCase-split match. If the CamelCase-split words in the text are a subset of the CamelCase-split words in the completion item, we have a match.
  • Remove unused length and position values from tuples, letting "completions" be just a list of strings.

The words within the underscore and camel-split matches are not themselves fuzzy, and must be exact matches. It might be neat if we accepted leading substrings from all of the words instead.
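
A minimal sketch of the two split matches described above, not the code in this PR. The helper names (`split_underscore`, `split_camel`, `subset_match`) are hypothetical, and lowercasing the words before comparing is an assumption.

```python
import re


def split_underscore(s: str) -> set:
    """Lowercased, non-empty underscore-separated words (case folding is an assumption)."""
    return {w for w in s.lower().split("_") if w}


def split_camel(s: str) -> set:
    """Lowercased CamelCase words, e.g. 'TimeZoneName' -> {'time', 'zone', 'name'}."""
    return {w.lower() for w in re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", s)}


def subset_match(text: str, candidate: str, split) -> bool:
    """True if every word typed is a complete word of the candidate, in any order."""
    typed = split(text)
    return bool(typed) and typed <= split(candidate)


print(subset_match("zone_time", "time_zone_leap_seconds", split_underscore))
print(subset_match("sec_zon", "time_zone_leap_seconds", split_underscore))
print(subset_match("ZoneTime", "TimeZoneName", split_camel))
```

The first and third calls match because order is ignored and only a subset of the candidate's words has to be covered. The second does not, because `sec` and `zon` are not complete words of the candidate.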

Beyond that, if we need anything fancier, we should use a library rather than rolling our own.

This may close some open issues.

Will cause a merge conflict with #1448, to be resolved later.

Like #1448, the inspiration is to improve completions after adding more possible matches in #1447.

Example showing underscore-split match:
[screenshot]

Checklist

  • I've added this contribution to the changelog.md.
  • I've added my name to the AUTHORS file (or it's already there).
  • I ran uv run ruff check && uv run ruff format && uv run mypy --install-types . to lint and format the code.

@rolandwalker rolandwalker self-assigned this Jan 19, 2026
@scottnemes
Contributor

scottnemes commented Jan 19, 2026

RE: the split match functionality, I am comparing it to main and it seems to work the same as far as I can tell. Is there another case that highlights the expected difference?

[two screenshots]

@rolandwalker
Contributor Author

[screenshot]

@scottnemes OK, the above is a better example. What main currently does is check every letter, in order, with any number of intervening characters. Since z.*?o.*?n.*?e matches zone, you get a completion candidate. This is inefficient and produces a large number of spurious matches.
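
To make the difference concrete, here is a rough sketch of both pattern styles. The `fuzzy_pattern` helper and its `max_gap` parameter are hypothetical, and expressing the 3-character limit as `.{0,3}?` is an assumption about how the bounded pattern could be written, not necessarily how this PR does it.

```python
import re
from typing import Optional


def fuzzy_pattern(text: str, max_gap: Optional[int] = None) -> re.Pattern:
    """Match the letters of `text` in order, with at most `max_gap` characters between them.

    max_gap=None gives the unbounded form, e.g. z.*?o.*?n.*?e for "zone".
    """
    gap = ".*?" if max_gap is None else ".{0,%d}?" % max_gap
    return re.compile(gap.join(re.escape(c) for c in text), re.IGNORECASE)


unbounded = fuzzy_pattern("zone")
bounded = fuzzy_pattern("zone", max_gap=3)

for candidate in ("time_zone", "authorization_code"):
    print(candidate, bool(unbounded.search(candidate)), bool(bounded.search(candidate)))
```

With the unbounded pattern, `authorization_code` counts as a match for `zone` because z, o, n, e appear in that order somewhere in the string. With the gap capped at 3, it no longer matches, while `time_zone` still does.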

The underscore part of the proposal here is to split terms on underscores, and match if every complete underscore-separated subword in the typed text is present in the candidate, independent of order. (But not every subword in the candidate needs to be covered, just some subset.)

A further proposal for a followup PR would be to accept leading substrings, rather than requiring complete subwords, so that sec_zon would match the candidate time_zone_leap_seconds.
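
A possible shape for that leading-substring variant, purely as a sketch. The name `prefix_subset_match` is hypothetical and nothing like it exists in this PR. Case folding is assumed.

```python
def prefix_subset_match(text: str, candidate: str) -> bool:
    """True if every typed subword is a leading substring of some candidate subword."""
    typed = [w for w in text.lower().split("_") if w]
    words = [w for w in candidate.lower().split("_") if w]
    return bool(typed) and all(any(w.startswith(t) for w in words) for t in typed)


print(prefix_subset_match("sec_zon", "time_zone_leap_seconds"))
print(prefix_subset_match("one", "time_zone"))
```

The first call matches because `sec` leads `seconds` and `zon` leads `zone`. The second does not, since `one` appears only inside `zone`, not at the start of a subword. Note that this sketch lets two typed subwords match the same candidate subword. Whether that is desirable would be up to the followup.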

Another option for a followup PR would be to adopt an existing library for fuzzy matching, as many smart people have already thought about this problem.

@rolandwalker rolandwalker force-pushed the RW/fuzzy-matching-tuneup branch from 6df8833 to 22c71b1 Compare January 20, 2026 11:01
@rolandwalker rolandwalker force-pushed the RW/fuzzy-matching-tuneup branch from 22c71b1 to 81b30fa Compare January 20, 2026 11:06