Description:
Implement the exact string matching layer with detailed metadata tracking and confidence scoring for the cooking domain.
Requirements:
- Case-insensitive exact string matching with defensive confidence scoring
- Near-miss detection using Levenshtein distance (at most 2 single-character edits): https://en.wikipedia.org/wiki/Levenshtein_distance
- Comprehensive metadata tracking (MatchMetadata, MatchResult classes)
- Confidence penalties for case/whitespace differences
- Domain-agnostic base class with cooking domain concrete implementation
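The near-miss requirement could be sketched roughly as follows. This is only an illustration, not the project's implementation: the function names `levenshtein` and `is_near_miss` and the `max_distance` parameter are assumptions made for this example, and case is ignored before measuring distance so that case differences are left to the case-insensitive tier.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def is_near_miss(query: str, candidate: str, max_distance: int = 2) -> bool:
    """True when the strings differ by 1..max_distance edits, ignoring case.

    `max_distance` is a hypothetical name for the configurable threshold.
    """
    q, c = query.casefold(), candidate.casefold()
    if abs(len(q) - len(c)) > max_distance:
        return False  # the length gap alone already exceeds the threshold
    return 0 < levenshtein(q, c) <= max_distance
```

With this sketch, `is_near_miss("tomatoe", "tomato")` holds (one extra character), while `is_near_miss("Tomato", "tomato")` does not, because the two strings are identical once case is ignored.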
Acceptance Criteria:
- DirectMatcher base class implemented
- CookingDirectMatcher with ingredient/equipment/technique matching
- Near-miss detection with configurable thresholds
- Metadata tracking includes transformation details and match quality indicators
- Unit tests covering edge cases (typos, whitespace, case variations)
- Performance benchmarks for large ingredient/equipment lists
- Documentation with cooking domain examples
Technical Notes:
- Follow the BaseExtractor/BaseMatcher pattern from existing codebase
- Implement confidence scoring: 1.0 for perfect matches, 0.95 for case differences, 0.8 for near-misses
- Use existing MatchResult and MatchMetadata structures
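One possible shape for the scoring tiers above, as a minimal sketch. The `MatchMetadata` and `MatchResult` dataclasses here are hypothetical stand-ins (the real structures live in the existing codebase and their fields are not shown in this issue), and the `distance_fn` and `max_distance` parameters are assumptions for illustration. Whitespace penalties are left out because no value is given for them here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class MatchMetadata:
    """Hypothetical stand-in for the existing MatchMetadata structure."""
    match_type: str
    transformations: List[str] = field(default_factory=list)
    edit_distance: int = 0


@dataclass
class MatchResult:
    """Hypothetical stand-in for the existing MatchResult structure."""
    query: str
    matched_term: str
    confidence: float
    metadata: MatchMetadata


class DirectMatcher:
    # Confidence tiers taken from the Technical Notes.
    PERFECT = 1.0
    CASE_DIFFERENCE = 0.95
    NEAR_MISS = 0.8

    def __init__(self, vocabulary: Iterable[str],
                 distance_fn: Optional[Callable[[str, str], int]] = None,
                 max_distance: int = 2) -> None:
        # distance_fn is an assumed hook for an edit-distance function;
        # near-miss detection is skipped when it is not supplied.
        self.vocabulary = list(vocabulary)
        self.distance_fn = distance_fn
        self.max_distance = max_distance
        self._exact = set(self.vocabulary)
        self._folded: Dict[str, str] = {}
        for term in self.vocabulary:
            self._folded.setdefault(term.casefold(), term)

    def match(self, query: str) -> Optional[MatchResult]:
        if query in self._exact:
            return MatchResult(query, query, self.PERFECT,
                               MatchMetadata("exact"))
        folded = query.casefold()
        if folded in self._folded:
            return MatchResult(query, self._folded[folded],
                               self.CASE_DIFFERENCE,
                               MatchMetadata("case_insensitive", ["casefold"]))
        if self.distance_fn is None:
            return None
        best_term, best_distance = None, self.max_distance + 1
        for term in self.vocabulary:
            distance = self.distance_fn(folded, term.casefold())
            if distance < best_distance:  # keep the closest candidate
                best_term, best_distance = term, distance
        if best_term is None:
            return None
        return MatchResult(query, best_term, self.NEAR_MISS,
                           MatchMetadata("near_miss", ["casefold"],
                                         best_distance))


matcher = DirectMatcher(["Olive Oil", "basil", "tomato"])
result = matcher.match("olive oil")
print(result.matched_term, result.confidence)  # Olive Oil 0.95
```

Checking the tiers in order (exact, then case-insensitive, then near-miss) means a query always receives the highest confidence it qualifies for. A CookingDirectMatcher could presumably reuse this logic with separate ingredient, equipment and technique vocabularies.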