| 1 | +# English to Arabic Translation Search |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This feature lets users search the Quran with English words and their synonyms. Each English word is mapped to one or more Arabic roots, and the query then runs through the library's existing root search. |
| 6 | + |
| 7 | +## Data Structure |
| 8 | + |
| 9 | +```typescript |
| 10 | +interface EnglishArabicConcept { |
| 11 | + english: string[]; // All English variants (synonyms) in one field |
| 12 | + arabic: string[]; // Arabic roots only - library handles the rest |
| 13 | +} |
| 14 | +``` |
| 15 | + |
| 16 | +### Example |
| 17 | + |
| 18 | +```json |
| 19 | +{ |
| 20 | + "english": ["truth", "verity", "trueness", "accuracy"], |
| 21 | + "arabic": ["حق", "صدق"] |
| 22 | +} |
| 23 | +``` |
| 24 | + |
| 25 | +### Why This Structure Works |
| 26 | + |
| 27 | +1. **Reduced Data Complexity** - Group all English synonyms in one field instead of duplicating entries |
| 28 | +2. **Leverage Existing Features** - Store only roots in `arabic` field; the library's built-in root search automatically finds all derived words |
| 29 | +3. **Consistent with Phonetic Flow** - The English translation lookup runs at the same point as the phonetic lookup, creating a unified pipeline for non-Arabic queries |
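
As a rough illustration of how the grouped structure could feed the lookup, the sketch below flattens a list of concepts into a `Map` keyed by each lowercase English variant. The helper name `buildEnglishArabicMap` is an assumption made for this example and is not taken from the library; only the `EnglishArabicConcept` shape comes from the section above.

```typescript
// Shape taken from the "Data Structure" section.
interface EnglishArabicConcept {
  english: string[]; // All English variants (synonyms)
  arabic: string[];  // Arabic roots only
}

// Hypothetical helper (not part of the library): index every English
// variant so a single token lookup returns the concept's roots.
function buildEnglishArabicMap(
  concepts: EnglishArabicConcept[],
): Map<string, string[]> {
  const map = new Map<string, string[]>();
  for (const concept of concepts) {
    for (const word of concept.english) {
      const key = word.toLowerCase().trim();
      // A word may appear in several concepts, so merge without duplicates.
      const roots = map.get(key) ?? [];
      for (const root of concept.arabic) {
        if (!roots.includes(root)) roots.push(root);
      }
      map.set(key, roots);
    }
  }
  return map;
}

const concepts: EnglishArabicConcept[] = [
  { english: ["truth", "verity", "trueness", "accuracy"], arabic: ["حق", "صدق"] },
];
const englishArabicMap = buildEnglishArabicMap(concepts);
```

Normalizing keys with `toLowerCase().trim()` mirrors the `cleanToken` normalization used at the integration point, so lookups and index keys stay aligned.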
| 30 | + |
| 31 | +## How It Works |
| 32 | + |
| 33 | +### Query Processing Flow |
| 34 | + |
| 35 | +``` |
| 36 | +User Query: "truth verity" |
| 37 | + ↓ |
| 38 | +Split into tokens: ["truth", "verity"] |
| 39 | + ↓ |
| 40 | +For each token: |
| 41 | + ↓ |
| 42 | +┌─────────────────────────────────────┐ |
| 43 | +│ isArabic(token)? │ |
| 44 | +├──────────────┬──────────────────────┤ |
| 45 | +│ YES │ NO │ |
| 46 | +│ ↓ │ ↓ │ |
| 47 | +│ Pass │ 1. English→Arabic │ |
| 48 | +│ through │ (NEW) │ |
| 49 | +│ unchanged │ 2. Phonetic │ |
| 50 | +│ │ (existing) │ |
| 51 | +└──────────────┴──────────────────────┘ |
| 52 | + ↓ |
| 53 | +If English found in translation map: |
| 54 | + "truth" → ["حق", "صدق"] |
| 55 | + "verity" → ["حق", "صدق"] |
| 56 | + ↓ |
| 57 | +Dedupe and combine: ["حق", "صدق"] |
| 58 | + ↓ |
| 59 | +Run search with root:true enabled |
| 60 | +Library automatically finds: |
| 61 | + - "الحق", "بالحق", "حقا" (from root "حق") |
| 62 | + - "صدق", "صدقا", "بالصدق" (from root "صدق") |
| 63 | +``` |
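
The tokenize, translate, and dedupe steps of this flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `translateQuery` is a hypothetical name, the regular expression is a simplified stand-in for the library's `isArabic` check, and the phonetic fallback is left out.

```typescript
// Simplified stand-in for the library's isArabic check (assumption):
// matches any character in the basic Arabic Unicode block.
const ARABIC_CHARS = /[\u0600-\u06FF]/;

// Hypothetical helper: turn a raw query into a deduplicated list of
// search terms (Arabic tokens pass through, English tokens become roots).
function translateQuery(
  query: string,
  englishArabicMap: Map<string, string[]>,
): string[] {
  const tokens = query.split(/\s+/).filter((t) => t.length > 0);
  const terms = new Set<string>();
  for (const token of tokens) {
    if (ARABIC_CHARS.test(token)) {
      terms.add(token); // Arabic token: pass through unchanged
      continue;
    }
    const roots = englishArabicMap.get(token.toLowerCase().trim());
    if (roots) {
      roots.forEach((root) => terms.add(root));
    }
    // Tokens without a translation would go to the phonetic lookup (omitted here).
  }
  return Array.from(terms); // A Set keeps first-insertion order
}

const demoMap = new Map<string, string[]>([
  ["truth", ["حق", "صدق"]],
  ["verity", ["حق", "صدق"]],
]);
const combinedRoots = translateQuery("truth verity", demoMap);
```

Using a `Set` for the combined terms means synonyms that share roots, such as "truth" and "verity", contribute each root only once.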
| 64 | + |
| 65 | +## Implementation Details |
| 66 | + |
| 67 | +### Integration Point |
| 68 | + |
| 69 | +The English→Arabic translation is integrated at [search.ts#L158](file:///src/core/search.ts#L158), right before the existing phonetic lookup: |
| 70 | + |
| 71 | +```typescript |
| 72 | +if (!isArabic(token)) { |
| 73 | + const cleanToken = token.toLowerCase().trim(); |
| 74 | + |
| 75 | + // 1. English → Arabic translation (NEW) |
| 76 | + let arabicRoots = englishArabicMap.get(cleanToken); |
| 77 | + if (arabicRoots) { |
| 78 | + return arabicRoots[0]; // Returns the first root only; root search expands it to derived words |
| 79 | + } |
| 80 | + |
| 81 | + // 2. Phonetic fallback (EXISTING) |
| 82 | + let arabicPossibilities = phoneticMap.get(cleanToken); |
| 83 | + // ... |
| 84 | +} |
| 85 | +``` |
| 86 | + |
| 87 | +### When Does Translation Run? |
| 88 | + |
| 89 | +The English translation lookup **only runs for tokens that are not Arabic**: |
| 90 | + |
| 91 | +```typescript |
| 92 | +if (token && !isArabic(token)) { |
| 93 | + // Translation or Phonetic lookup happens here |
| 94 | +} |
| 95 | +``` |
| 96 | + |
| 97 | +This means: |
| 98 | +- Arabic tokens → Direct search (no translation) |
| 99 | +- English/Latin tokens → English translation lookup first, then phonetic fallback |
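
The routing rule above can be made explicit with a small classifier. The sketch below is illustrative only: `routeToken` and `looksArabic` are hypothetical names, and `looksArabic` is a simplified stand-in for the library's `isArabic`, whose exact behavior is not shown in this document.

```typescript
type TokenRoute = "direct" | "translation" | "phonetic";

// Simplified stand-in for the library's isArabic (assumption):
// true if the token contains a character from the basic Arabic block.
function looksArabic(token: string): boolean {
  return /[\u0600-\u06FF]/.test(token);
}

// Hypothetical helper: decide which lookup path a single token takes.
function routeToken(
  token: string,
  englishArabicMap: Map<string, string[]>,
): TokenRoute {
  if (looksArabic(token)) return "direct"; // no translation needed
  const clean = token.toLowerCase().trim();
  // Translation takes priority; phonetic is the fallback.
  return englishArabicMap.has(clean) ? "translation" : "phonetic";
}

const routeMap = new Map<string, string[]>([["truth", ["حق", "صدق"]]]);
const routes = ["الحق", "Truth", "bismillah"].map((t) => routeToken(t, routeMap));
```

Because translation is checked first, an English word that also happens to have a phonetic mapping would resolve through the translation map under this ordering.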
| 100 | + |
| 101 | +## Usage |
| 102 | + |
| 103 | +```typescript |
| 104 | +import { search } from 'quran-search-engine'; |
| 105 | + |
| 106 | +// Search using English word |
| 107 | +const result = search('truth', quranData, morphologyMap, wordMap, { |
| 108 | + lemma: true, |
| 109 | + root: true, // Important: enables root-based search |
| 110 | + semantic: true |
| 111 | +}); |
| 112 | + |
| 113 | +// The library will: |
| 114 | +// 1. Look up "truth" in English→Arabic map |
| 115 | +// 2. Find roots ["حق", "صدق"] |
| 116 | +// 3. Search for all words derived from these roots |
| 117 | +// 4. Return matching verses |
| 118 | +``` |
| 119 | + |
| 120 | +## Comparison: Phonetic vs English Translation |
| 121 | + |
| 122 | +| Feature | Phonetic | English Translation | |
| 123 | +|---------|----------|---------------------| |
| 124 | +| Input | Latin letters mimicking Arabic pronunciation | English words | |
| 125 | +| Example | "bismillah" → "بسم الله" | "truth" → ["حق", "صدق"] | |
| 126 | +| Type | Transliteration (sound-based) | Translation (meaning-based) | |
| 127 | +| Data | Pre-computed phonetic mappings | English synonyms → Arabic roots | |
| 128 | + |
| 129 | +## Future Enhancements |
| 130 | + |
| 131 | +- Add a `category` field for filtering by semantic group |
| 132 | +- Support English phrase mappings |
| 133 | +- Integrate with the existing semantic search for concept expansion |