Add Amharic search normalization for improved recall #4

Open
flavnat wants to merge 1 commit into liulalemx:staging from flavnat:feature/search-normalization

flavnat commented Jan 1, 2026

Summary

This PR integrates Amharic search normalization logic into the toolkit. It enhances search recall by handling common spelling variations in Amharic.

Changes

  • New Feature: Added Search class in src/search.ts with normalize() method.
    • Homophone Normalization: Standardizes characters with the same or similar sounds (e.g., ሐ, ኀ -> ሀ; ሠ -> ሰ).
    • Labialized Normalization: Aligns labialized sequences (e.g., ፍዋ -> ፏ).
  • Configuration: Updated tsconfig.json module setting to NodeNext to fix build errors.
  • Tests: Added tests/search.test.js covering homophone and labialized cases.
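
The two normalization steps listed above can be illustrated with a minimal sketch. This is not the code in src/search.ts: the function name `normalizeSketch` and both lookup tables are hypothetical, and the tables contain only the example mappings quoted in this description, not the full character set the PR may handle.

```typescript
// Hypothetical sketch of the two normalization passes; not the PR's actual code.
// Only the example mappings from the PR description are included.

// Homophones: characters with the same or similar sound map to one canonical form.
const HOMOPHONES = new Map<string, string>([
  ["ሐ", "ሀ"],
  ["ኀ", "ሀ"],
  ["ሠ", "ሰ"],
]);

// Labialized sequences: a two-character spelling maps to the single labialized character.
const LABIALIZED: ReadonlyArray<readonly [string, string]> = [
  ["ፍዋ", "ፏ"],
];

function normalizeSketch(text: string): string {
  // Pass 1: replace each homophone character with its canonical form.
  let result = "";
  for (const ch of text) {
    result += HOMOPHONES.get(ch) ?? ch;
  }
  // Pass 2: collapse each labialized sequence into its single-character form.
  for (const [sequence, composed] of LABIALIZED) {
    result = result.split(sequence).join(composed);
  }
  return result;
}
```

With these tables, `normalizeSketch("ሠላም")` returns `"ሰላም"`. For recall to improve, a function like this would presumably be applied to both the indexed text and the query, so that spelling variants compare equal.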

Checklist

  • Build passes (pnpm run build)
  • Tests pass (pnpm test)
