|`OPENALEPH_SEARCH_MATCH_SYMBOLS`|`false`| Name symbols (cross-language) |
+
+Enabling more stages improves recall (finding more potential matches) at the cost of query complexity and performance. For most use cases, stages 1 and 2 provide sufficient matching quality.
+
+```bash
+# Enable all matching stages
+export OPENALEPH_SEARCH_MATCH_NAME_PARTS=true
+export OPENALEPH_SEARCH_MATCH_PHONETIC=true
+export OPENALEPH_SEARCH_MATCH_SYMBOLS=true
+```
+
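To see how the three toggles above might be consumed, here is a minimal sketch in Python. It assumes the flags are plain boolean environment variables that default to `false`, as the added notes state. The helper names `read_flag` and `enabled_stages`, the stage labels, and the set of accepted truthy spellings are hypothetical and are not taken from openaleph-search, which may parse its settings differently.

```python
import os

# Hypothetical helper, not part of openaleph-search.
# Accepted truthy spellings are an assumption for this sketch.
TRUTHY = {"1", "true", "yes", "on"}

# Environment variable names come from the documentation above;
# the short stage labels on the left are made up for illustration.
STAGE_FLAGS = {
    "name_parts": "OPENALEPH_SEARCH_MATCH_NAME_PARTS",
    "phonetic": "OPENALEPH_SEARCH_MATCH_PHONETIC",
    "symbols": "OPENALEPH_SEARCH_MATCH_SYMBOLS",
}


def read_flag(name, environ=None, default=False):
    """Return True if the variable is set to a truthy value, else the default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def enabled_stages(environ=None):
    """List the optional matching stages that are switched on."""
    return [stage for stage, var in STAGE_FLAGS.items() if read_flag(var, environ)]


if __name__ == "__main__":
    print(enabled_stages())
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.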
 ## Name matching strategies

 ### 1. Normalized keywords
@@ -42,7 +61,10 @@ Normalization:

 Exact name matches (with order preserved) receive the highest boost.

-### 2. Name symbols
+### 2. Name symbols {: #name-symbols }
+
+!!! note
+    Disabled by default. Enable with `OPENALEPH_SEARCH_MATCH_SYMBOLS=true`.

 Cross-language and cross-alphabet matching via symbolic representations. This can be considered as a synonyms search, but more precise and context specific than [a global synonyms file](https://www.elastic.co/docs/solutions/search/full-text/search-with-synonyms).

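The idea of a shared symbol per name part can be sketched as a small lookup. This is only an illustration: the symbol table, its IDs, and the function names `name_symbols` and `share_symbol` are invented for this example and do not reflect the symbol data or API that openaleph-search actually uses.

```python
# Hypothetical symbol table: the IDs and variants are made up for illustration.
SYMBOLS = {
    "vladimir": "NAME:vladimir",
    "wladimir": "NAME:vladimir",
    "владимир": "NAME:vladimir",
    "limited": "ORG:ltd",
    "ltd": "ORG:ltd",
}


def name_symbols(name):
    """Map each known part of a name to its symbol, ignoring unknown parts."""
    parts = name.casefold().split()
    return {SYMBOLS[part] for part in parts if part in SYMBOLS}


def share_symbol(a, b):
    """Two names are candidates for a match if they share at least one symbol."""
    return bool(name_symbols(a) & name_symbols(b))


if __name__ == "__main__":
    print(share_symbol("Vladimir Putin", "Владимир Путин"))
```

Because the lookup is keyed on individual name parts rather than whole strings, spellings in different alphabets resolve to the same symbol without a global synonyms list.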
@@ -58,7 +80,10 @@ Example:

 Same symbol = same entity name (part) across languages.

-### 3. Phonetic encoding
+### 3. Phonetic encoding {: #phonetic }
+
+!!! note
+    Disabled by default. Enable with `OPENALEPH_SEARCH_MATCH_PHONETIC=true`.

 Sound-alike matching using Double Metaphone algorithm.

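To illustrate what sound-alike matching means in practice, the sketch below uses American Soundex as a stand-in. It is deliberately not Double Metaphone, which has a much larger rule set and is not in the Python standard library. The function name `soundex` and the sample names are choices made for this example only.

```python
# Stand-in phonetic encoder: American Soundex, NOT Double Metaphone.
# It only demonstrates the principle that similar-sounding names share a key.
CODES = {}
for letters, digit in [
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex key for an ASCII word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    key = first.upper()
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal codes
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        # vowels reset prev to "", so a repeated code after a vowel is kept
        prev = code
    return (key + "000")[:4]


if __name__ == "__main__":
    for a, b in [("Meyer", "Meier"), ("Robert", "Rupert")]:
        print(a, b, soundex(a) == soundex(b))
```

Indexing such a key alongside the name lets a query for one spelling retrieve the other, at the cost of more false positives than exact matching.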
@@ -72,7 +97,10 @@ Example:

 Catches alternate spellings and transcription variations.

-### 4. Name parts
+### 4. Name parts {: #name-parts }
+
+!!! note
+    Disabled by default. Enable with `OPENALEPH_SEARCH_MATCH_NAME_PARTS=true`.

 Individual name components for partial matching.

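Partial matching on name components can be pictured as comparing sets of tokens. The sketch below is an assumption-laden illustration: the tokenization rule (case-folded runs of word characters) and the function names `name_parts` and `shared_parts` are hypothetical and may differ from how openaleph-search splits and normalizes names.

```python
import re


def name_parts(name):
    """Split a name into case-folded components, dropping punctuation.

    Hypothetical tokenization: runs of word characters (Unicode-aware).
    """
    return set(re.findall(r"\w+", name.casefold()))


def shared_parts(a, b):
    """Return the components two names have in common."""
    return name_parts(a) & name_parts(b)


if __name__ == "__main__":
    print(sorted(shared_parts("John Smith", "Smith, John Albert")))
```

A non-empty overlap marks the pair as a candidate even when word order, punctuation, or middle names differ, which is why this stage trades precision for recall.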
@@ -143,16 +171,16 @@ Only compatible schema types can match each other.

 Match scores combine multiple factors:

-| Signal | Boost | Index field |
-|--------|-------|-------------|
-| Names (exact, order preserved) | 5.0 | `names` |
-| Name keys (order-independent) | 3.0 | `name_keys` |
-| Identifiers | 3.0 | `properties.*` (for group type "identifier") |