Feat/Fix: Fix/Improve search tool #307
Open
Problems
Search was a strict substring match on the entire query string, so word order and exact spacing mattered and any typo broke the match. Examples: "pdf merge" didn't match "Merge PDF", and "base 64" didn't match "Base64 Encoder/Decoder".
Keywords, description and short description were not included in the search, although it seems they were intended to be.
Changes
• Token-based matching: The query is normalised (trimmed, lowercased, split on whitespace), then each token must appear as a substring in the tool's searchable text (name, description, shortDescription, keywords). Word order no longer matters.
• Fuzzy matching with Levenshtein distance: Tokens within edit distance 1 are considered weak matches, so small typos still find tools.
• Relevance scoring: Tools are ranked by a weighted score. Name matches are weighted higher than description/keyword matches, and exact matches are weighted higher than fuzzy ones.
• Token variants: Handles patterns like "base 64" → "base64" by generating merged token variants.
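For reviewers who want a feel for how these pieces fit together, here is a minimal sketch of the approach described above. It is not the code in this PR: the `Tool` shape, the function names (`tokenize`, `tokenVariants`, `matchToken`, `scoreTool`, `searchTools`), the weights, and the minimum token length for fuzzy matching are all assumptions chosen for illustration.

```typescript
// Hypothetical tool shape; the real meta type in the repo may differ.
interface Tool {
  name: string;
  description: string;
  shortDescription: string;
  keywords: string[];
}

// Standard dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Normalise the query: trim, lowercase, split on whitespace.
function tokenize(query: string): string[] {
  return query.trim().toLowerCase().split(/\s+/).filter(Boolean);
}

// The original token list plus one list per merged adjacent pair,
// e.g. ["base", "64"] also yields ["base64"].
function tokenVariants(tokens: string[]): string[][] {
  const variants: string[][] = [tokens];
  for (let i = 0; i + 1 < tokens.length; i++) {
    variants.push([...tokens.slice(0, i), tokens[i] + tokens[i + 1], ...tokens.slice(i + 2)]);
  }
  return variants;
}

// 2 = exact substring match, 1 = fuzzy match (edit distance <= 1 to a word), 0 = none.
function matchToken(token: string, text: string): number {
  const lower = text.toLowerCase();
  if (lower.includes(token)) return 2;
  // Assumption: skip fuzzy matching for very short tokens to limit false positives.
  if (token.length < 3) return 0;
  const words = lower.split(/[^a-z0-9]+/).filter(Boolean);
  return words.some((w) => levenshtein(token, w) <= 1) ? 1 : 0;
}

// Illustrative weights only: name matches count more than other fields.
const NAME_WEIGHT = 3;
const OTHER_WEIGHT = 1;

// Sum of per-token scores; 0 if any token fails to match anywhere.
function scoreTool(tool: Tool, tokens: string[]): number {
  const other = [tool.description, tool.shortDescription, ...tool.keywords].join(" ");
  let total = 0;
  for (const token of tokens) {
    const score = Math.max(
      NAME_WEIGHT * matchToken(token, tool.name),
      OTHER_WEIGHT * matchToken(token, other),
    );
    if (score === 0) return 0;
    total += score;
  }
  return total;
}

// Rank tools by their best score across all token variants, dropping non-matches.
function searchTools(tools: Tool[], query: string): Tool[] {
  const variants = tokenVariants(tokenize(query));
  return tools
    .map((tool) => ({ tool, score: Math.max(...variants.map((v) => scoreTool(tool, v))) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.tool);
}
```

With a tool list that contains "Merge PDF" and "Base64 Encoder/Decoder", calling `searchTools(tools, "pdf merge")` or `searchTools(tools, "base 64")` in this sketch finds the corresponding tool regardless of word order or spacing. Taking the maximum over token variants keeps the merged form ("base64") from penalising queries that already match token by token.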
Examples that now work
Keyword cleanup
While testing, cleaned up keywords in various tool meta files - removed duplicates, fixed inconsistencies, added missing synonyms (e.g. "b64" for base64, "join" for merge-pdf).
Tests
Added src/tools/index.test.ts with comprehensive tests covering token matching, case insensitivity, keyword synonyms, typo tolerance, and ranking behaviour.
Manually tested performance: the search isn't noticeably slower.
Comments
Keywords are still hardcoded in English, but after this change it would be trivial to extract them into a single i18n string of search terms.