feat: improve lesson search with fuzzy matching and highlighting #39

Merged
jt14den merged 4 commits into UC-OSPO-Network:main from gaurirathi:fuzzy-search
Feb 7, 2026

@gaurirathi
Contributor

This PR improves the lesson search by replacing exact string matching with fuzzy search using Fuse.js.

What was done

  • Used Fuse.js for client-side fuzzy search (no API keys needed, as suggested in the issue)
  • Indexed lesson fields: name, description, keywords, and subTopic
  • Added typo tolerance so small spelling mistakes still return results
  • Kept existing filters (OSS role, educational level, pathway) working as before
  • Added a simple no-results message with an option to clear filters
  • Added basic highlighting for matched text in lesson titles
  • Tested locally with the full lesson dataset to ensure search performance is still responsive
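As a rough illustration of the setup described in the list above, the search options could look something like the sketch below. The field names come from this PR description and the 0.4 threshold is the value cited later in review; the individual weights and the `fuseOptions` name are assumptions made for the example, not the values used in this PR.

```javascript
// Hypothetical Fuse.js options object. Field names are from the PR description;
// the weights are illustrative assumptions, not the PR's actual values.
const fuseOptions = {
  keys: [
    { name: "name", weight: 0.4 },
    { name: "description", weight: 0.3 },
    { name: "keywords", weight: 0.2 },
    { name: "subTopic", weight: 0.1 },
  ],
  // 0.0 requires an exact match, 1.0 matches anything; 0.4 is the value cited in review.
  threshold: 0.4,
  // Ask Fuse.js to report matched character ranges, which highlighting relies on.
  includeMatches: true,
};

// Usage (requires the fuse.js package, so it is left as a comment here):
//   const fuse = new Fuse(lessons, fuseOptions);
//   const results = fuse.search("pyhton").map((result) => result.item);
```

Because `fuse.search` returns its results synchronously, the filtered list can be computed directly during render.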

Notes

  • Fuse.js search is synchronous, so no loading state was required

Fixes #31

@jt14den
Collaborator

jt14den commented Jan 20, 2026

Nice implementation of fuzzy search! The Fuse.js integration looks clean and the highlighting is a great UX improvement.

⚠️ Security Concern

Line 91 in LessonCard.jsx uses dangerouslySetInnerHTML to render highlighted search results:

dangerouslySetInnerHTML={{
  __html: highlightText(
    lesson.name || 'Untitled Lesson',
    matches,
    "name"
  ),
}}

While the data currently comes from your Google Sheets (not user input), using dangerouslySetInnerHTML creates a potential XSS vulnerability if:

  • The Google Sheets data is ever compromised
  • User-generated content is added in the future
  • The data flow changes
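To make the risk concrete, here is a minimal sketch of how a string-building highlighter can pass markup through. `highlightUnsafe` and `highlightEscaped` are hypothetical stand-ins written for this example (the PR's actual `highlightText` is not shown here), and the lesson title is invented test data.

```javascript
// Escape the characters that have special meaning in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical highlighter that concatenates raw text into an HTML string.
// `start` and `end` are inclusive character positions.
function highlightUnsafe(text, start, end) {
  return (
    text.slice(0, start) +
    "<mark>" + text.slice(start, end + 1) + "</mark>" +
    text.slice(end + 1)
  );
}

// Same idea, but every piece of data is escaped before it joins the markup.
function highlightEscaped(text, start, end) {
  return (
    escapeHtml(text.slice(0, start)) +
    "<mark>" + escapeHtml(text.slice(start, end + 1)) + "</mark>" +
    escapeHtml(text.slice(end + 1))
  );
}

// Invented example of a compromised spreadsheet cell.
const compromisedTitle = "Intro <img src=x onerror=alert(1)>";

const unsafeHtml = highlightUnsafe(compromisedTitle, 0, 4);
const escapedHtml = highlightEscaped(compromisedTitle, 0, 4);
```

Here `unsafeHtml` still contains a live `<img>` tag, while `escapedHtml` renders it as inert text. Escaping is shown only for contrast; rendering React elements avoids building HTML strings at all.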

Recommended fix:

Replace dangerouslySetInnerHTML with a safer approach:

function HighlightedText({ text, matches, field }) {
  if (!matches || !text) return <>{text}</>;

  const match = matches.find(m => m.key === field);
  if (!match) return <>{text}</>;

  const parts = [];
  let lastIndex = 0;

  match.indices.forEach(([start, end], i) => {
    // Add non-highlighted text
    if (start > lastIndex) {
      parts.push(<span key={`text-${i}`}>{text.slice(lastIndex, start)}</span>);
    }
    // Add highlighted text
    parts.push(<mark key={`mark-${i}`}>{text.slice(start, end + 1)}</mark>);
    lastIndex = end + 1;
  });

  // Add remaining text
  if (lastIndex < text.length) {
    parts.push(<span key="text-end">{text.slice(lastIndex)}</span>);
  }

  return <>{parts}</>;
}

// Then use it:
<h3 style={{...}}>
  <HighlightedText text={lesson.name || 'Untitled Lesson'} matches={matches} field="name" />
</h3>

This achieves the same visual result without the security risk.
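The splitting logic inside the component can also be pulled out as a plain function, which makes it easy to check without rendering anything. This is only a sketch: `splitByIndices` and the segment shape are names made up for illustration, and it assumes the index pairs are inclusive, sorted, and non-overlapping, as the component above also assumes.

```javascript
// Split `text` into segments using inclusive [start, end] index pairs.
// Assumes the pairs are sorted and do not overlap.
function splitByIndices(text, indices) {
  const segments = [];
  let lastIndex = 0;

  for (const [start, end] of indices) {
    // Plain text between the previous match and this one.
    if (start > lastIndex) {
      segments.push({ text: text.slice(lastIndex, start), highlight: false });
    }
    // The matched range itself (end is inclusive, hence end + 1).
    segments.push({ text: text.slice(start, end + 1), highlight: true });
    lastIndex = end + 1;
  }

  // Whatever is left after the last match.
  if (lastIndex < text.length) {
    segments.push({ text: text.slice(lastIndex), highlight: false });
  }
  return segments;
}

const segments = splitByIndices("Intro to Python", [[9, 14]]);
// segments[0] is { text: "Intro to ", highlight: false }
// segments[1] is { text: "Python", highlight: true }
```

A component could then map each segment to a `<mark>` or a `<span>`, so the text is always rendered as a React child and never parsed as HTML.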

✅ Everything Else Looks Good

  • Clean Fuse.js configuration with appropriate weights
  • Good use of useMemo for performance
  • Reasonable threshold (0.4) for typo tolerance
  • Maintains existing filter functionality

Once the security concern is addressed, this will be great to merge!

@github-actions

❌ PR checks failed

One or more validation checks failed. Please review the workflow logs to see what went wrong.

Common issues:

  • 📊 Data validation: Google Sheets CSV is unreachable or has invalid data
  • 🔍 TypeScript check: Type errors in code
  • 🏗️ Build: Build process failed
  • 📄 Critical pages: Missing required pages (index, lessons, pathways)
  • 🔗 Internal links: Broken links detected

@gaurirathi
Contributor Author

gaurirathi commented Jan 21, 2026

Thanks! I addressed the security concern.

@github-actions

✅ All PR checks passed!

Check | Status
📊 Data validation | ✅ Pass
🔍 TypeScript check | ✅ Pass
🏗️ Build | ✅ Pass
📄 Critical pages | ✅ Pass
🔗 Internal links | ✅ Pass

The site builds successfully and all validation checks passed.

@ShouzhiWang ShouzhiWang left a comment

Thanks for adding fuzzy search and addressing the security issue. The search logic itself is a great improvement over strict matching.

However, the visual highlighting causes some UI/UX issues. Because fuzzy search can match non-contiguous characters, we often end up with seemingly random letters highlighted (e.g., searching "python" highlights individual letters 'p', 'y', 't' scattered across words), which looks buggy and creates visual noise.
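The effect can be reproduced with a small sketch. The title and index ranges below are hypothetical, chosen to resemble the kind of fragmented ranges a fuzzy matcher may report; `highlightedRuns` is a helper name made up for this example.

```javascript
// Return the substrings that would be highlighted for the given
// inclusive [start, end] index pairs.
function highlightedRuns(text, indices) {
  return indices.map(([start, end]) => text.slice(start, end + 1));
}

// Hypothetical title and match ranges for a query such as "python".
const title = "Copyright and licensing tips";
const fuzzyIndices = [[2, 3], [8, 8], [11, 11]];

const runs = highlightedRuns(title, fuzzyIndices);
// Count the runs that are a single character long.
const singleCharRuns = runs.filter((run) => run.length === 1).length;
```

With these ranges the highlighted pieces are "py", "t", and "n", and two of the three runs are single letters, which is the scattered look described above.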

Suggestion: remove the visual highlighting feature but keep the fuzzy search filtering logic.

@github-actions

github-actions bot commented Feb 3, 2026

✅ All PR checks passed!

Check | Status
📊 Data validation | ✅ Pass
🔍 TypeScript check | ✅ Pass
🏗️ Build | ✅ Pass
📄 Critical pages | ✅ Pass
🔗 Internal links | ✅ Pass

The site builds successfully and all validation checks passed.

jt14den added a commit that referenced this pull request Feb 7, 2026
@jt14den jt14den merged commit 845b9b3 into UC-OSPO-Network:main Feb 7, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

ENHANCEMENT: Improve search with fuzzy matching