The `expand` parameter controls whether to expand equivalent synonym rules.
93
+
Consider a synonym defined like:
94
+
95
+
`foo, bar, baz`
96
+
97
+
Using `expand: true`, the synonym rule would be expanded into:
113
98
114
-
With the above request the word `bar` gets skipped but a mapping `foo => baz` is still added. However, if the mapping
115
-
being added was `foo, baz => bar` nothing would get added to the synonym list. This is because the target word for the
116
-
mapping is itself eliminated because it was a stop word. Similarly, if the mapping was "bar, foo, baz" and `expand` was
117
-
set to `false` no mapping would get added as when `expand=false` the target mapping is the first word. However, if
118
-
`expand=true` then the mappings added would be equivalent to `foo, baz => foo, baz` i.e, all mappings other than the
119
-
stop word.
99
+
```
100
+
foo => foo
101
+
foo => bar
102
+
foo => baz
103
+
bar => foo
104
+
bar => bar
105
+
bar => baz
106
+
baz => foo
107
+
baz => bar
108
+
baz => baz
109
+
```
110
+
111
+
When `expand` is set to `false`, the synonym rule is not expanded and the first synonym is treated as the canonical representation. The synonym would be equivalent to:
112
+
113
+
```
114
+
foo => foo
115
+
bar => foo
116
+
baz => foo
117
+
```
118
+
119
+
The `expand` parameter does not affect explicit synonym rules, like `foo, bar => baz`.
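As an illustration, the effect of `expand` on a synonym rule can be modeled in a few lines of Python. This is a minimal sketch of the behavior described above, not {es} or Lucene code: the `expand_rule` helper and its dictionary return shape are hypothetical names chosen for this example.

```python
def expand_rule(rule: str, expand: bool = True) -> dict[str, list[str]]:
    """Model how a synonym rule such as "foo, bar, baz" is interpreted.

    Hypothetical helper for illustration only; not part of any Elasticsearch API.
    """
    if "=>" in rule:
        # Explicit rule: `expand` has no effect. Every term on the left-hand
        # side maps to all terms on the right-hand side.
        left, right = rule.split("=>", 1)
        sources = [t.strip() for t in left.split(",") if t.strip()]
        targets = [t.strip() for t in right.split(",") if t.strip()]
        return {s: list(targets) for s in sources}

    terms = [t.strip() for t in rule.split(",") if t.strip()]
    if expand:
        # Equivalent rule, expanded: every term maps to every term.
        return {t: list(terms) for t in terms}
    # Equivalent rule, not expanded: every term maps to the first term.
    return {t: [terms[0]] for t in terms}


print(expand_rule("foo, bar, baz", expand=True))
print(expand_rule("foo, bar, baz", expand=False))
print(expand_rule("foo, bar => baz", expand=False))
```

Under these assumptions, the first call yields the nine mappings listed for `expand: true`, the second yields the three mappings to `foo`, and the third shows that an explicit rule gives the same result regardless of `expand`.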
@@ -153,12 +153,65 @@ Text will be processed first through filters preceding the synonym filter before
153
153
In the above example, text will be lowercased by the `lowercase` filter before being processed by the `synonyms_filter`.
154
154
This means that all the synonyms defined there need to be in lowercase, or they won't be found by the synonyms filter.
155
155
156
-
The synonym rules should not contain words that are removed by a filter that appears later in the chain (like a `stop` filter).
157
-
Removing a term from a synonym rule means there will be no matching for it at query time.
158
-
159
156
Because entries in the synonym map cannot have stacked positions, some token filters may cause issues here.
160
157
Token filters that produce multiple versions of a token may choose which version of the token to emit when parsing synonyms.
161
158
For example, `asciifolding` will only produce the folded version of the token.
162
159
Others, like `multiplexer`, `word_delimiter_graph` or `ngram` will throw an error.
163
160
164
161
If you need to build analyzers that include both multi-token filters and synonym filters, consider using the <<analysis-multiplexer-tokenfilter,multiplexer>> filter, with the multi-token filters in one branch and the synonym filter in the other.
162
+
163
+
[discrete]
164
+
[[synonym-graph-tokenizer-stop-token-filter]]
165
+
===== Synonyms and `stop` token filters
166
+
167
+
Synonyms and <<analysis-stop-tokenfilter,stop token filters>> interact with each other in the following ways:
The stop filter will remove the terms from the resulting synonym expansion.
214
+
215
+
For example, a synonym rule like `foo, bar => baz` combined with a stop filter that removes `baz` will produce no matches for `foo` or `bar`, because both are expanded to `baz`, which the stop filter removes.
216
+
217
+
If the stop filter removed `foo` instead, then searching for `foo` would get expanded to `baz`, which is not removed by the stop filter, thus potentially providing matches for `baz`.
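The two cases above can be traced with a small Python model. It is only a sketch, assuming that the stop filter runs on the output of the synonym expansion; the `analyze` function and the dictionary form of the rule are hypothetical and do not correspond to any {es} API.

```python
def analyze(term: str, synonyms: dict[str, list[str]], stop_words: set[str]) -> list[str]:
    """Expand a term through the synonym map, then drop stop words from the result."""
    # A term without a synonym entry passes through unchanged.
    expanded = synonyms.get(term, [term])
    return [t for t in expanded if t not in stop_words]


# The explicit rule `foo, bar => baz`, written as a term-to-targets mapping.
rule = {"foo": ["baz"], "bar": ["baz"]}

print(analyze("foo", rule, {"baz"}))  # [] : `baz` is removed, so nothing can match
print(analyze("bar", rule, {"baz"}))  # [] : same outcome for `bar`
print(analyze("foo", rule, {"foo"}))  # ['baz'] : the expansion survives the stop filter
```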
See <<synonym-tokenizer-expand-equivalent-synonyms,expand equivalent synonyms>>.
71
+
* `lenient` (defaults to `false`).
72
+
If `true`, ignores errors while parsing the synonym configuration.
73
+
It is important to note that only the synonym rules that cannot be parsed are ignored.
74
+
See <<synonym-tokenizer-stop-token-filter,synonyms and stop token filters>> for an example of `lenient` behaviour for invalid synonym rules.
75
+
76
+
[discrete]
77
+
[[synonym-tokenizer-expand-equivalent-synonyms]]
78
+
===== `expand` equivalent synonym rules
79
+
80
+
The `expand` parameter controls whether to expand equivalent synonym rules.
81
+
Consider a synonym defined like:
82
+
83
+
`foo, bar, baz`
84
+
85
+
Using `expand: true`, the synonym rule would be expanded into:
102
86
103
-
With the above request the word `bar` gets skipped but a mapping `foo => baz` is still added. However, if the mapping
104
-
being added was `foo, baz => bar` nothing would get added to the synonym list. This is because the target word for the
105
-
mapping is itself eliminated because it was a stop word. Similarly, if the mapping was "bar, foo, baz" and `expand` was
106
-
set to `false` no mapping would get added as when `expand=false` the target mapping is the first word. However, if
107
-
`expand=true` then the mappings added would be equivalent to `foo, baz => foo, baz` i.e, all mappings other than the
108
-
stop word.
87
+
```
88
+
foo => foo
89
+
foo => bar
90
+
foo => baz
91
+
bar => foo
92
+
bar => bar
93
+
bar => baz
94
+
baz => foo
95
+
baz => bar
96
+
baz => baz
97
+
```
109
98
99
+
When `expand` is set to `false`, the synonym rule is not expanded and the first synonym is treated as the canonical representation. The synonym would be equivalent to:
100
+
101
+
```
102
+
foo => foo
103
+
bar => foo
104
+
baz => foo
105
+
```
106
+
107
+
The `expand` parameter does not affect explicit synonym rules, like `foo, bar => baz`.
110
108
111
109
[discrete]
112
110
[[synonym-tokenizer-ignore_case-deprecated]]
@@ -128,7 +126,7 @@ To apply synonyms, you will need to include a synonym token filters into an anal
128
126
"my_analyzer": {
129
127
"type": "custom",
130
128
"tokenizer": "standard",
131
-
"filter": ["lowercase", "synonym"]
129
+
"filter": ["stemmer", "synonym"]
132
130
}
133
131
}
134
132
----
@@ -140,15 +138,68 @@ To apply synonyms, you will need to include a synonym token filters into an anal
140
138
Order is important for your token filters.
141
139
Text will be processed first through filters preceding the synonym filter before being processed by the synonym filter.
142
140
143
-
In the above example, text will be lowercased by the `lowercase` filter before being processed by the `synonyms_filter`.
144
-
This means that all the synonyms defined there need to be in lowercase, or they won't be found by the synonyms filter.
145
-
146
-
The synonym rules should not contain words that are removed by a filter that appears later in the chain (like a `stop` filter).
147
-
Removing a term from a synonym rule means there will be no matching for it at query time.
141
+
{es} will also use the token filters preceding the synonym filter in a tokenizer chain to parse the entries in a synonym file or synonym set.
142
+
In the above example, the synonyms token filter is placed after a stemmer. The stemmer will also be applied to the synonym entries.
148
143
149
144
Because entries in the synonym map cannot have stacked positions, some token filters may cause issues here.
150
145
Token filters that produce multiple versions of a token may choose which version of the token to emit when parsing synonyms.
151
146
For example, `asciifolding` will only produce the folded version of the token.
152
147
Others, like `multiplexer`, `word_delimiter_graph` or `ngram` will throw an error.
153
148
154
149
If you need to build analyzers that include both multi-token filters and synonym filters, consider using the <<analysis-multiplexer-tokenfilter,multiplexer>> filter, with the multi-token filters in one branch and the synonym filter in the other.
150
+
151
+
[discrete]
152
+
[[synonym-tokenizer-stop-token-filter]]
153
+
===== Synonyms and `stop` token filters
154
+
155
+
Synonyms and <<analysis-stop-tokenfilter,stop token filters>> interact with each other in the following ways:
The stop filter will remove the terms from the resulting synonym expansion.
202
+
203
+
For example, a synonym rule like `foo, bar => baz` combined with a stop filter that removes `baz` will produce no matches for `foo` or `bar`, because both are expanded to `baz`, which the stop filter removes.
204
+
205
+
If the stop filter removed `foo` instead, then searching for `foo` would get expanded to `baz`, which is not removed by the stop filter, thus potentially providing matches for `baz`.
docs/reference/analysis/tokenfilters/synonyms-format.asciidoc
+1 -1 (1 addition & 1 deletion)
@@ -15,7 +15,7 @@ This format uses two different definitions:
15
15
ipod, i-pod, i pod
16
16
computer, pc, laptop
17
17
----
18
-
* Explicit mappings: Matches a group of words to other words. Words on the left hand side of the rule definition are expanded into all the possibilities described on the right hand side. Example:
18
+
* Explicit synonyms: Matches a group of words to other words. Words on the left hand side of the rule definition are expanded into all the possibilities described on the right hand side. Example:
docs/reference/search/search-your-data/search-with-synonyms.asciidoc
+21 (21 additions & 0 deletions)
@@ -75,6 +75,27 @@ A large number of inline synonyms increases cluster size unnecessarily and can l
75
75
76
76
Once your synonyms sets are created, you can start configuring your token filters and analyzers to use them.
77
77
78
+
79
+
[WARNING]
80
+
======
81
+
Synonyms sets must exist before they can be added to indices.
82
+
If an index is created referencing a nonexistent synonyms set, the index will remain in a partially created and inoperable state.
83
+
The only way to recover from this scenario is to ensure the synonyms set exists, then either delete and re-create the index, or close and re-open the index.
84
+
======
85
+
86
+
[WARNING]
87
+
======
88
+
Invalid synonym rules can cause errors when applying analyzer changes.
89
+
For reloadable analyzers, this prevents reloading and applying changes.
90
+
You must correct errors in the synonym rules and reload the analyzer.
91
+
92
+
An index with invalid synonym rules cannot be reopened, making it inoperable when:
93
+
94
+
* A node containing the index starts
95
+
* The index is opened from a closed state
96
+
* A node restart occurs (which reopens the shards assigned to the node)
97
+
======
98
+
78
99
{es} uses synonyms as part of the <<analysis-overview,analysis process>>.
79
100
You can use two types of <<analysis-tokenfilters,token filter>> to include synonyms:
docs/reference/synonyms/apis/synonyms-apis.asciidoc
+17 (17 additions & 0 deletions)
@@ -24,6 +24,23 @@ NOTE: Synonyms sets are limited to a maximum of 10,000 synonym rules per set.
24
24
Synonym sets with more than 10,000 synonym rules will provide inconsistent search results.
25
25
If you need to manage more synonym rules, you can create multiple synonyms sets.
26
26
27
+
WARNING: Synonyms sets must exist before they can be added to indices.
28
+
If an index is created referencing a nonexistent synonyms set, the index will remain in a partially created and inoperable state.
29
+
The only way to recover from this scenario is to ensure the synonyms set exists, then either delete and re-create the index, or close and re-open the index.
30
+
31
+
[WARNING]
32
+
====
33
+
Invalid synonym rules can cause errors when applying analyzer changes.
34
+
For reloadable analyzers, this prevents reloading and applying changes.
35
+
You must correct errors in the synonym rules and reload the analyzer.
36
+
37
+
An index with invalid synonym rules cannot be reopened, making it inoperable when:
38
+
39
+
* A node containing the index starts
40
+
* The index is opened from a closed state
41
+
* A node restart occurs (which reopens the shards assigned to the node)