Allow the new flags added in Lucene in the HyphenationCompoundWordTokenFilter
Adds access to the two new flags no_sub_matches and no_overlapping_matches.
Lucene issue: apache/lucene#9231
Co-authored-by: Peter Straßer <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
docs/reference/analysis/tokenfilters/hyphenation-decompounder-tokenfilter.asciidoc
12 additions & 0 deletions
@@ -111,6 +111,18 @@ output. Defaults to `5`.
 (Optional, Boolean)
 If `true`, only include the longest matching subword. Defaults to `false`.
 
+`no_sub_matches`::
+(Optional, Boolean)
+If `true`, do not match sub tokens in tokens that are in the word list.
+Defaults to `false`.
+
+`no_overlapping_matches`::
+(Optional, Boolean)
+If `true`, do not allow overlapping tokens.
+Defaults to `false`.
+
+Typically users will only want to include one of the three flags as enabling `no_overlapping_matches` is the most restrictive and `no_sub_matches` is more restrictive than `only_longest_match`. When enabling a more restrictive option the state of the less restrictive does not have any effect.
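
For illustration only, a minimal sketch of index settings that enable one of the new flags on a `hyphenation_decompounder` filter. The analyzer name `example_decompound_analyzer`, the filter name `example_decompounder`, the patterns path, and the word list are hypothetical placeholders and are not part of this change; only `no_sub_matches` comes from the documentation added above.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "example_decompound_analyzer": {
          "tokenizer": "standard",
          "filter": ["example_decompounder"]
        }
      },
      "filter": {
        "example_decompounder": {
          "type": "hyphenation_decompounder",
          "hyphenation_patterns_path": "analysis/hyphenation_patterns.xml",
          "word_list": ["kaffee", "zucker", "tasse"],
          "no_sub_matches": true
        }
      }
    }
  }
}
```

Following the note in the added documentation, setting `only_longest_match` alongside `no_sub_matches` in this sketch should have no additional effect, and swapping in `no_overlapping_matches` would be the most restrictive choice of the three.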
modules/analysis-common/src/main/java/org/elasticsearch/analysis/common/HyphenationCompoundWordTokenFilterFactory.java