[WIP] Native engine abstractions #20821
bharath-techie wants to merge 6 commits into opensearch-project:main from
| public CompositeEngine(List<SearchEnginePlugin> plugins, ShardPath shardPath) throws IOException { | ||
| Map<String, List<SearchExecEngine<?, ?>>> engines = new HashMap<>(); | ||
| for (SearchEnginePlugin plugin : plugins) { | ||
| SearchExecEngine<?, ?> engine = plugin.createSearchExecEngine(shardPath); |
Reworking to avoid this in the composite engine; we will do it in index shard. Here we'll just tie together the SPIs of backend plugins that listen to deletes and refreshes.
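To make the "tie together the SPIs" idea concrete, here is a minimal sketch of a fan-out over listeners for refresh and delete events. All names (`ShardEventListener`, `ListenerFanOut`, `RecordingListener`) are hypothetical and are not the interfaces in this PR; it only illustrates the wiring role described above, assuming the engine's job is limited to forwarding events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical SPI, not the PR's actual interface: what a backend plugin might listen to.
interface ShardEventListener {
    void onRefresh(long generation);
    void onDelete(String fileName);
}

// Hypothetical stand-in for the wiring role: it only forwards events, it creates no engines.
final class ListenerFanOut {
    private final List<ShardEventListener> listeners = new ArrayList<>();

    void register(ShardEventListener listener) {
        listeners.add(listener);
    }

    void notifyRefresh(long generation) {
        for (ShardEventListener listener : listeners) {
            listener.onRefresh(generation);
        }
    }

    void notifyDelete(String fileName) {
        for (ShardEventListener listener : listeners) {
            listener.onDelete(fileName);
        }
    }
}

// Hypothetical backend listener that records the events it receives, in order.
final class RecordingListener implements ShardEventListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onRefresh(long generation) {
        events.add("refresh:" + generation);
    }

    @Override
    public void onDelete(String fileName) {
        events.add("delete:" + fileName);
    }
}
```

In this sketch, engine creation would live elsewhere (for example in the index shard, as the comment suggests), and the fan-out never needs to know which backend sits behind a listener.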
❌ Gradle check result for 4aad14e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
4aad14e to 3e4d286
3e4d286 to 7ef7f9c
7ef7f9c to ade5591
| * @opensearch.internal | ||
| */ | ||
| public interface AnalyticsBackEndPlugin { | ||
| public interface AnalyticsBackEndPlugin extends SearchAnalyticsBackEndPlugin { |
I think we need to keep these two things separate: the API in SearchAnalyticsBackEndPlugin isn't going to be used by the analytics engine and is largely used for registration/indexing with composite.
Maybe we could have the same class extend and implement?
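A minimal sketch of that suggestion, assuming two unrelated interfaces and a single plugin class that implements both. The interface and method names here (`QueryBackEnd`, `RegistrationSpi`, `ExampleBackEndPlugin`) are simplified, hypothetical stand-ins, not the real `AnalyticsBackEndPlugin` or `SearchAnalyticsBackEndPlugin` signatures.

```java
// Hypothetical stand-in for the query-side contract used by the analytics engine.
interface QueryBackEnd {
    String name();
}

// Hypothetical stand-in for the registration/indexing contract used by the composite engine.
interface RegistrationSpi {
    String dataFormat();
}

// One concrete plugin class implements both contracts; neither interface extends the other.
final class ExampleBackEndPlugin implements QueryBackEnd, RegistrationSpi {
    @Override
    public String name() {
        return "example-backend";
    }

    @Override
    public String dataFormat() {
        return "example-format";
    }
}

// A consumer on the analytics side depends only on the query-side contract.
final class AnalyticsSideConsumer {
    static String describe(QueryBackEnd backEnd) {
        return "backend=" + backEnd.name();
    }
}
```

With this shape, each consumer sees only the interface it needs, while the plugin author still ships one class.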
| * Create a search context. The reader is provided by {@link org.opensearch.index.engine.CompositeEngine} | ||
| * which owns all reader managers. | ||
| */ | ||
| C createContext( |
I don't know that we need this exposed; this was my motive for the bridge interface as is. The bridge is built from the given snapshot and can manage the lifecycle of the engine-specific context, rather than exposing it outside.
We need to have a common view of the catalog snapshot, or a reader over that common view, exposed. That plus the plan is what we always send to this method.
Mainly, this enables filter delegates, and in general all search actions, to maintain their own contexts through the query lifecycle.
I'm okay with not exposing this as well, as long as we are able to provide the context when we initialize the delegates.
Here's roughly what I'm thinking, where snapshot below is replaced with the shared reader context.
try (CompositeEngine.ReleasableRef<CatalogSnapshot> snapshot = engine.acquireSnapshot()) {
    EngineBridge<byte[], ? extends EngineResultStream, RelNode> bridge =
        (EngineBridge<byte[], ? extends EngineResultStream, RelNode>) plugin.bridge(engine, snapshot.getRef());
    byte[] converted = bridge.convertFragment(logicalFragment);
    List<Object[]> rows = new ArrayList<>();
    try (EngineResultStream resultStream = bridge.execute(converted)) {
        ...
    }
}
The bridge interface (poorly named) was intended to be the point-in-time searcher specific to the back-end, built from the shared snapshot/reader. That is basically what this one is, so it makes sense to replace it, but I'm thinking it should be the responsibility of AnalyticsBackEndPlugin to vend it on demand from the point-in-time state/reader managed by the composite engine. Sort of like CompositeEngine's getSearchExecEngine, but rather than requiring the CompositeEngine to maintain that, it's only worrying about refreshing readers behind the scenes and vending the composite reader.
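The acquire/release pattern this snippet relies on can be sketched in isolation. The class below is a hypothetical, simplified stand-in for something like `CompositeEngine.ReleasableRef`, whose real implementation is not shown in this thread; it assumes a plain reference count that is incremented on acquire and decremented exactly once when the reference is closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical ref-counted snapshot holder; the generation string stands in for real snapshot state.
final class RefCountedSnapshot {
    private final AtomicInteger refs = new AtomicInteger();
    private final String generation;

    RefCountedSnapshot(String generation) {
        this.generation = generation;
    }

    // Each acquire bumps the count and hands back a closeable reference.
    Ref acquire() {
        refs.incrementAndGet();
        return new Ref();
    }

    int refCount() {
        return refs.get();
    }

    // Closing a reference releases it exactly once, even if close() is called repeatedly.
    final class Ref implements AutoCloseable {
        private boolean closed;

        String get() {
            return generation;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                refs.decrementAndGet();
            }
        }
    }
}
```

Used with try-with-resources, as in the comment above, the reference is released when the block exits, including on an exception, so a back-end searcher built from the snapshot cannot outlive it by accident.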
| return readerManagers.get(format); | ||
| } | ||
| public SearchExecEngine<?, ?> getSearchExecEngine(DataFormat format) throws IOException { |
When we search from the analytics-engine, we're going to end up with a fragment that is assigned to a particular engine by query planning. We'll need to fetch by engine name, not DataFormat; it's probably as simple as maintaining a reader-by-name map.
But I think the back-ends can build the searcher / SearchExecEngine on the fly, vs. having to maintain this state. We'd just need to fetch the reader to build the searcher from?
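A minimal sketch of the two ideas in this thread, under stated assumptions: readers are kept in a map keyed by engine name, and the searcher is built per request from the looked-up reader instead of being cached. `ReaderRegistry` and its methods are hypothetical names, and the reader and searcher types are left generic because the real types are not settled here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry: the only long-lived state is the reader per engine name.
final class ReaderRegistry<R> {
    private final Map<String, R> readersByEngine = new ConcurrentHashMap<>();

    void put(String engineName, R reader) {
        readersByEngine.put(engineName, reader);
    }

    // The back-end supplies a factory; the searcher is created on demand and not retained.
    <S> S buildSearcher(String engineName, Function<R, S> searcherFactory) {
        R reader = readersByEngine.get(engineName);
        if (reader == null) {
            throw new IllegalArgumentException("no reader registered for engine [" + engineName + "]");
        }
        return searcherFactory.apply(reader);
    }
}
```

The trade-off in this sketch is that searcher construction cost is paid on every request, in exchange for the engine holding no per-back-end searcher state.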
Signed-off-by: bharath-techie <bharath78910@gmail.com>
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Refactor CompositeEngine to use factory
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
---------
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Refactor CompositeEngine to use factory
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* Introduce SegmentCollector
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* De-couple and simplify index file deleter
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* De-couple and simplify index file deleter
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* De-couple and simplify index file deleter, handle scorer and weight query lifecycle
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* De-couple and simplify index file deleter, handle scorer and weight query lifecycle
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
* De-couple and simplify index file deleter, handle scorer and weight query lifecycle
  Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
---------
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
5f761ba to 338bc6e
❌ Gradle check result for 338bc6e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: bharath-techie <bharath78910@gmail.com>
| testingConventions.enabled = false | ||
| // analytics-framework does not depend on server | ||
| // analytics-framework depends on server for SearchAnalyticsBackEndPlugin SPI |
If we need to remove this dependency, do we let the bridge talk to the engine?
We can come back to this after we figure out whether the context is required in the analytics plugin for delegates.
❌ Gradle check result for 7f5f3e6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
| * | ||
| * @param readerManagers the per-format reader managers that receive notifications | ||
| */ | ||
| public CatalogSnapshotLifecycleListener createCatalogSnapshotListener(Map<DataFormat, EngineReaderManager<?>> readerManagers) { |
where will this be called? from the composite engine?
Description
reader manager, which ties the index / catalog snapshot lifecycle to the backend engine and plugins; in future it will also have ties to the cache, etc.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.