Draft
moshaad7
commented
Oct 10, 2023
const (
	TokensAnalyzerType = "token"
	HookTokensAnalyzerType = "hook_token"
	VectorAnalyzerType = "vector"
Contributor
Author
move vector related stuff to a separate file with "vector" build tag
moshaad7
commented
Oct 10, 2023
	Err error
}

func AnalyzeForTokens(analyzer Analyzer, input []byte) (TokenStream, error) {
Contributor
Author
add comment:
// AnalyzeForTokens is a utility function that analyzes an input and returns a TokenStream (and an error, if any).
Previously, the Analyze() method of an analyzer returned a TokenStream.
With the change in this PR, the Analyze() method now returns a value of type interface{}.
(Validating and using that value can be done based on analyzer.Type().)
So, for users of the old Analyzer interface, this utility will come in handy when migrating to the new Analyzer interface.
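For illustration, here is a minimal, self-contained sketch of how such a helper could sit on top of the new interface. Only the `AnalyzeForTokens` signature, the type check, and the three type constants come from the diff in this PR. The `Token` and `TokenStream` definitions, the exact `Analyze` signature, the type assertion on the result, and the two demo analyzers (`whitespaceAnalyzer`, `vectorAnalyzer`) are assumptions made for the example and may not match the actual change.

```go
package main

import (
	"bytes"
	"fmt"
)

// Token and TokenStream are simplified stand-ins (assumption) for the
// real analysis types.
type Token struct {
	Term []byte
}

type TokenStream []*Token

const (
	TokensAnalyzerType     = "token"
	HookTokensAnalyzerType = "hook_token"
	VectorAnalyzerType     = "vector"
)

// Analyzer sketches the new interface: Analyze returns an untyped value
// and Type tells the caller how to interpret it. The exact method set is
// an assumption.
type Analyzer interface {
	Type() string
	Analyze(input []byte) (interface{}, error)
}

// AnalyzeForTokens analyzes input and returns a TokenStream. It rejects
// analyzers whose Type() says they do not produce tokens.
func AnalyzeForTokens(analyzer Analyzer, input []byte) (TokenStream, error) {
	analyzerType := analyzer.Type()
	if analyzerType != TokensAnalyzerType &&
		analyzerType != HookTokensAnalyzerType {
		return nil, fmt.Errorf("cannot analyze text with analyzer of type: %s",
			analyzerType)
	}
	out, err := analyzer.Analyze(input)
	if err != nil {
		return nil, err
	}
	// Assumption: token analyzers return a TokenStream as the untyped value.
	ts, ok := out.(TokenStream)
	if !ok {
		return nil, fmt.Errorf("unexpected output type %T from analyzer of type: %s",
			out, analyzerType)
	}
	return ts, nil
}

// whitespaceAnalyzer is a hypothetical token analyzer that splits on whitespace.
type whitespaceAnalyzer struct{}

func (whitespaceAnalyzer) Type() string { return TokensAnalyzerType }

func (whitespaceAnalyzer) Analyze(input []byte) (interface{}, error) {
	var ts TokenStream
	for _, term := range bytes.Fields(input) {
		ts = append(ts, &Token{Term: term})
	}
	return ts, nil
}

// vectorAnalyzer is a hypothetical non-token analyzer used to show the
// rejection path.
type vectorAnalyzer struct{}

func (vectorAnalyzer) Type() string { return VectorAnalyzerType }

func (vectorAnalyzer) Analyze(input []byte) (interface{}, error) {
	return []float32{0.1, 0.2}, nil
}

func main() {
	ts, err := AnalyzeForTokens(whitespaceAnalyzer{}, []byte("hello bleve world"))
	fmt.Println(len(ts), err)

	_, err = AnalyzeForTokens(vectorAnalyzer{}, []byte("hello"))
	fmt.Println(err)
}
```

Callers that only ever dealt with token analyzers can switch from `analyzer.Analyze(input)` to `AnalyzeForTokens(analyzer, input)` and keep working with a typed `TokenStream`.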
moshaad7
commented
Oct 10, 2023
	analyzerType := analyzer.Type()
	if analyzerType != TokensAnalyzerType &&
		analyzerType != HookTokensAnalyzerType {
		return nil, fmt.Errorf("cannot analyze text with analyzer of type: %s",
Contributor
Author
alternate error msg: "given analyzer is not compatible to be used as a token analyzer"
- While analyzing a doc, analysis of a few fields can fail.
- We want to index the part of the doc for which analysis succeeded.
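One way to read the two points above is "collect per-field errors, keep going, and index whatever analyzed cleanly". The sketch below illustrates only that control flow. The `field` type, `analyzeField`, and `analyzeDoc` are hypothetical names invented for the example and are not taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// field is a hypothetical, simplified document field.
type field struct {
	Name  string
	Value string
}

// analyzeField is a hypothetical per-field analysis step. It fails on
// empty values so the example has a failure case to show.
func analyzeField(f field) ([]string, error) {
	if f.Value == "" {
		return nil, fmt.Errorf("field %q: nothing to analyze", f.Name)
	}
	return strings.Fields(f.Value), nil
}

// analyzeDoc analyzes every field, keeps the results of the fields that
// succeeded, and reports the failures separately instead of aborting on
// the first error.
func analyzeDoc(fields []field) (map[string][]string, []error) {
	analyzed := make(map[string][]string)
	var errs []error
	for _, f := range fields {
		tokens, err := analyzeField(f)
		if err != nil {
			errs = append(errs, err)
			continue // skip this field, keep the rest of the doc
		}
		analyzed[f.Name] = tokens
	}
	return analyzed, errs
}

func main() {
	analyzed, errs := analyzeDoc([]field{
		{Name: "title", Value: "hello world"},
		{Name: "body", Value: ""},
	})
	fmt.Println("indexable fields:", len(analyzed))
	fmt.Println("failed fields:", len(errs))
}
```

Returning the errors alongside the partial result lets the caller decide whether to log them, surface them, or reject the document entirely.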
39ff270 to 1cc5a32
Description
The aim is to let an embedder register analyzers in bleve at run time.
These registered analyzers can then be specified in the index mapping as the analyzers for fields.
change log
New Analyzer interface
Updates to the Field interface
New registry to store embedder-submitted analysis hooks
Updated the analyzer registry to also hold analyzers created using hooks
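As a rough picture of what a run-time registry for embedder-submitted hooks might look like, here is a small sketch. `AnalysisHook`, `HookRegistry`, `Register`, and `Lookup` are hypothetical names chosen for the example. The registry added in this PR may use different names, signatures, and duplicate-handling behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// AnalysisHook is a hypothetical shape for an embedder-supplied callback.
type AnalysisHook func(input []byte) (interface{}, error)

// HookRegistry stores hooks by name and is safe for concurrent use,
// since registration happens at run time.
type HookRegistry struct {
	mu    sync.RWMutex
	hooks map[string]AnalysisHook
}

func NewHookRegistry() *HookRegistry {
	return &HookRegistry{hooks: make(map[string]AnalysisHook)}
}

// Register adds a hook under name. Rejecting nil hooks and duplicates is
// a design choice of this sketch, not something stated in the PR.
func (r *HookRegistry) Register(name string, hook AnalysisHook) error {
	if hook == nil {
		return fmt.Errorf("analysis hook %q is nil", name)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.hooks[name]; exists {
		return fmt.Errorf("analysis hook %q is already registered", name)
	}
	r.hooks[name] = hook
	return nil
}

// Lookup returns the hook registered under name, for example when an
// index mapping refers to it.
func (r *HookRegistry) Lookup(name string) (AnalysisHook, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	hook, ok := r.hooks[name]
	if !ok {
		return nil, fmt.Errorf("no analysis hook registered with name %q", name)
	}
	return hook, nil
}

func main() {
	reg := NewHookRegistry()
	err := reg.Register("length", func(input []byte) (interface{}, error) {
		return len(input), nil
	})
	fmt.Println("register:", err)

	hook, err := reg.Lookup("length")
	if err != nil {
		fmt.Println("lookup:", err)
		return
	}
	out, err := hook([]byte("abc"))
	fmt.Println(out, err)
}
```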
Related changes: