Provide access to new settings for HyphenationCompoundWordTokenFilter #115585
Merged
peter-strsr merged 5 commits into elastic:main from pstrsr:97849_provide_access_to_new_settings_for_hyphenation_compound_word_token_filter on Nov 18, 2024
Commits
97b431e Allow the new flags added in Lucene in the HyphenationCompoundWordTok… (pstrsr)
4567a17 Update docs (pstrsr)
edb18b5 Update changelog (pstrsr)
b1d552e Review feedback: Clarfying test comments to indicate this is only an … (pstrsr)
7bd9926 Merge branch 'main' into 97849_provide_access_to_new_settings_for_hyp… (elasticmachine)
@@ -0,0 +1,6 @@
pr: 115459
summary: Adds access to flags no_sub_matches and no_overlapping_matches to hyphenation-decompounder-tokenfilter
area: Search
type: enhancement
issues:
 - 97849
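As a rough illustration of what the changelog entry describes, the sketch below builds an index settings body that enables both flags on a `hyphenation_decompounder` filter. The filter name `german_decompounder`, the analyzer name, and the patterns path are placeholders chosen for this example, and the assumption that both flags take booleans is not confirmed by anything on this page. The word list is borrowed from the review discussion.

```python
import json

# Hypothetical index settings; names and paths are placeholders, not taken from the PR.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                "german_decompounder": {  # hypothetical filter name
                    "type": "hyphenation_decompounder",
                    # Placeholder path to a hyphenation patterns file.
                    "hyphenation_patterns_path": "analysis/hyphenation_patterns.xml",
                    "word_list": ["kaffee", "fee", "maschine"],
                    # Flags named in the changelog entry; boolean values are an assumption.
                    "no_sub_matches": True,
                    "no_overlapping_matches": True,
                }
            },
            "analyzer": {
                "german_compound": {  # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "german_decompounder"],
                }
            },
        }
    }
}

# Serialize to the JSON body that would be sent when creating an index.
body = json.dumps(index_settings, indent=2)
print(body)
```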
Is that the word list being used, or is it this: [fuss, fussball, ballpumpe, ball, pumpe, kaffee, fee, maschine]? I was thrown off by the comment and had trouble tracking that through in my head. The same goes for the comment on the subsequent test. The test result makes sense to me and looks good.
Technically, the word list contains ["fuss", "fussball", "ballpumpe", "ball", "pumpe", "kaffee", "fee", "maschine"], as defined in test1.json:43. The comment should highlight that this parameter is meant to solve the specific problem of preventing a match of "fee" (fairy) within "kaffee" (coffee). I kept the same word list and input text for all tests to ensure that there are no unintended side effects.
If it's clearer, I could isolate the tests and include only the Kaffeemaschine-related words in this test and only the Fussballpumpe-related words in the other one.
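The "fee" within "kaffee" problem can be sketched with a toy decompounder. This is a simplified dictionary substring matcher written for illustration, not Lucene's hyphenation-based algorithm, and the `decompound` function and its `no_sub_matches` parameter are hypothetical names that only mimic the idea of the flag: drop a subword whose span is enclosed by another matched subword.

```python
WORD_LIST = ["fuss", "fussball", "ballpumpe", "ball", "pumpe", "kaffee", "fee", "maschine"]


def decompound(token, word_list, no_sub_matches=False):
    """Toy dictionary decompounder (illustrative only, not Lucene's algorithm)."""
    words = set(word_list)
    # Collect every dictionary word that occurs as a substring, with its span.
    matches = []
    for start in range(len(token)):
        for end in range(start + 1, len(token) + 1):
            if token[start:end] in words:
                matches.append((start, end, token[start:end]))
    if no_sub_matches:
        # Drop any match whose span is enclosed by a different match's span.
        matches = [
            m for m in matches
            if not any(o != m and o[0] <= m[0] and m[1] <= o[1] for o in matches)
        ]
    return [word for _, _, word in matches]


print(decompound("kaffeemaschine", WORD_LIST))
# prints ['kaffee', 'fee', 'maschine']
print(decompound("kaffeemaschine", WORD_LIST, no_sub_matches=True))
# prints ['kaffee', 'maschine']
```

In this toy model, "fee" is dropped because its span lies inside the span of "kaffee". How the real filter treats overlapping candidates such as "fussball" and "ballpumpe" is not modeled here.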
Gotcha, I'm tracking now; this comment was a "for example". So I'll just nit (change it if you want): I'd put something like "for example, given a word list of: ..." before the comment, so it's clear that the test is validating more than just that word list.