initial commit of analytics engine plugin to sandbox #20697
mch2 merged 23 commits into opensearch-project:main
Thanks for the initial draft Marc, looks like a good start
The one high-level concern I have is that a single plugin responsible for planning, orchestration, Substrait conversion, etc. would become unwieldy over time, especially when we bring in plan optimisers and (stage-based) orchestrators.
Would it make sense to tease out the planner/optimisers into one plugin and maybe the orchestrators into another, while letting server hold the stitching logic across these plugins? Also, I am not sure we need the ExecutionEngine to be invoked from the extensible plugin (avoiding the SPI complexity for the native execution engine).
Still trying this out. My thinking is to keep it self-contained (at least while we're in sandbox) and opt-in, so we aren't bloating server for users who don't use this plugin. Perhaps we can move the planner/optimizer to /libs, but it will carry the Calcite dependency with it.
The query engine is going to have fragments we'll need to send to the native engine for both data nodes and coordinators. So we can wire it via SPI or through a pluggable interface in core, but I think the driver should remain in the plugin itself.
OK, gave this a shot and I think it may work out. Ultimately we have a query-planner plugin, the engine plugin (implementing ExecutionEnginePlugin), and a module/ for the rest (what used to be extensible-engine). I'm thinking about pulling what I can out to a lib there as well, but this should remove the direct call from the engine into the native executor.
Thanks Marc for the change, looks awesome. The plugin segregation across planner and executor looks pretty neat (thanks for moving the executor to modules to simplify plugin installation for various FE languages). Overall, love the simplification while balancing extensibility.
Needs some clean-up still, but updated the interfaces here @Bukhtawar. and in future:
...nalytics-frontend-ppl/src/main/java/org/opensearch/fe/ppl/planner/DefaultEngineExecutor.java
@mch2 Annoying high-level comment, but how about some README.md files in each new component's root directory to give some context about what each thing does and how it fits in?
@Bukhtawar For now I've separated this into front-end/back-end/executor, with a shared lib and a plugin "hub" to wire in shared components. As for query planning, everything planning-related in the ppl plugin is there to allow the UQP to build plans for our executor. The executor itself will do its own planning (RBO/CBO) and scheduling, which will be reused across front-ends.
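To make the split described above concrete, here is a minimal sketch in plain Java of several front-ends handing plans to one shared executor that applies its own rule-based rewrites. Every type and method name here (SketchPlan, SketchFrontEnd, SketchExecutor, SketchRules) is hypothetical and invented for illustration; none of them come from this PR, and the toy "merge adjacent filters" rule only stands in for whatever RBO/CBO the real executor ends up doing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical, simplified logical plan: an ordered list of operator names.
final class SketchPlan {
    final List<String> operators;

    SketchPlan(List<String> operators) {
        this.operators = List.copyOf(operators);
    }
}

// Hypothetical front-end contract: turn query text into a plan.
interface SketchFrontEnd {
    SketchPlan plan(String query);
}

// Hypothetical shared executor: owns its rewrite rules, so every
// front-end gets the same optimization regardless of query language.
final class SketchExecutor {
    private final List<UnaryOperator<SketchPlan>> rules;

    SketchExecutor(List<UnaryOperator<SketchPlan>> rules) {
        this.rules = rules;
    }

    // Apply each rule once, in order.
    SketchPlan optimize(SketchPlan plan) {
        SketchPlan current = plan;
        for (UnaryOperator<SketchPlan> rule : rules) {
            current = rule.apply(current);
        }
        return current;
    }

    // "Execution" here only renders the optimized operator chain.
    String execute(SketchPlan plan) {
        return String.join(" -> ", optimize(plan).operators);
    }
}

final class SketchRules {
    // Toy rule-based rewrite: collapse a FILTER that directly follows
    // another FILTER, as if the two predicates had been merged.
    static SketchPlan mergeAdjacentFilters(SketchPlan plan) {
        List<String> out = new ArrayList<>();
        for (String op : plan.operators) {
            boolean previousIsFilter = !out.isEmpty() && out.get(out.size() - 1).equals("FILTER");
            if (op.equals("FILTER") && previousIsFilter) {
                continue;
            }
            out.add(op);
        }
        return new SketchPlan(out);
    }
}
```

The point of the shape is that a front-end only has to produce a plan; optimization and scheduling live behind the executor, so adding a second query language would not duplicate that logic.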
...s/analytics-framework/src/main/java/org/opensearch/analytics/spi/AnalyticsBackEndPlugin.java
...box/plugins/analytics-engine/src/main/java/org/opensearch/analytics/BaseAnalyticsPlugin.java
...ox/libs/analytics-framework/src/main/java/org/opensearch/analytics/backend/EngineBridge.java
...gins/analytics-frontend-ppl/src/main/java/org/opensearch/fe/action/RestUnifiedPPLAction.java
.../analytics-framework/src/main/java/org/opensearch/analytics/spi/AnalyticsFrontEndPlugin.java
...s/analytics-framework/src/main/java/org/opensearch/analytics/backend/EngineCapabilities.java
Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
.../internalClusterTest/java/org/opensearch/fe/planner/unified/ClickBenchUnifiedPipelineIT.java
sandbox/plugins/analytics-engine/src/test/java/fe/ppl/action/TestPPLTransportActionTests.java
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #20697 +/- ##
=========================================
Coverage 73.33% 73.34%
- Complexity 72306 72338 +32
=========================================
Files 5796 5801 +5
Lines 330263 330341 +78
Branches 47663 47672 +9
=========================================
+ Hits 242189 242273 +84
+ Misses 68660 68623 -37
- Partials 19414 19445 +31
wraps this and schema in an engineContext provided to front-ends Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
andrross left a comment:
Looks good as the initial framework to start building this all out in the sandbox
Co-authored-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Marc Handalian <handalm@amazon.com>
❌ Gradle check result for 90e55df: FAILURE
…ect#20697)
* initial commit of extensible query engine plugin to sandbox
* clean up build.gradle - update forbidden-dependencies to skip guava check in sandbox plugins, calcite requires this dependency at compile time
* Rename plugin interfaces and default implementations. Wire up a ppl front-end using UnifiedQueryAPI from sql plugin.
* refactor to plugin-plugin SPI
* add readmes and start some clean up.
* analyzer errors
* move fe plugin into analytics plugin for testing only, we will use sql plugin. also remove "hub" plugin.
* spotless
* clean up
* more clean up
* fixing analyzer issues
* fix javadoc
* fix guava forbidden check
* fix license check
* fix javadoc warning on transitive dependency.
* clean up build.gradle and fix weird javadoc issues with dependencies.
* fix calcite/guava dependencies
* fix package name
* remove EngineCapabilities, just use calcite's sqloperatortable. wraps this and schema in an engineContext provided to front-ends
* simplify unified IT to use params
* fix guava NOTICE file to exactly match the file from grpc
* javadoc fix
* Update sandbox/plugins/analytics-engine/README.md
---------
Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: shayush622 <ayush5267@gmail.com>
Description
This PR introduces a set of experimental plugins that we are building out related to this issue, and starts moving feature/engine-datafusion into main.
This introduces:
analytics-backend-datafusion - responsible for executing queries against the DataFusion back-end. No-op right now.
analytics-engine - main "hub" for wiring up front-ends, back-ends, and a plan executor (the component responsible for executing a query end-to-end).
For now it's largely a no-op, providing wiring for back-ends and the engine using ExtensiblePlugin. There is a test front-end inside the engine that uses the UnifiedQueryAPI from the sql plugin to invoke our executor.
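The hub-and-back-end wiring in this list can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the PR's code: SketchEngineBridge, SketchBackEnd, SketchAnalyticsHub, and NoOpBackEnd are hypothetical stand-ins, the real AnalyticsBackEndPlugin and EngineBridge signatures are not shown on this page, and back-ends are registered by hand here where the actual plugin would discover them through ExtensiblePlugin. Capabilities are modelled as plain operator-name strings instead of Calcite operators to keep the sketch dependency-free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for the execution path a back-end exposes.
interface SketchEngineBridge {
    String execute(String planFragment);
}

// Hypothetical stand-in for a back-end plugin: a name, the operators it
// claims to support, and a bridge used to run plan fragments.
interface SketchBackEnd {
    String name();

    Set<String> supportedOperators();

    SketchEngineBridge bridge();
}

// Hypothetical hub: collects back-ends and picks the first one that
// supports every operator a plan needs.
final class SketchAnalyticsHub {
    private final List<SketchBackEnd> backEnds = new ArrayList<>();

    // Manual registration keeps the sketch standalone (assumption: the
    // real engine would receive these through plugin extension loading).
    void register(SketchBackEnd backEnd) {
        backEnds.add(backEnd);
    }

    Optional<SketchBackEnd> select(Set<String> requiredOperators) {
        return backEnds.stream()
            .filter(b -> b.supportedOperators().containsAll(requiredOperators))
            .findFirst();
    }
}

// A back-end that does no real work, mirroring the "no-op right now"
// state of the DataFusion back-end in this PR.
final class NoOpBackEnd implements SketchBackEnd {
    @Override
    public String name() {
        return "noop";
    }

    @Override
    public Set<String> supportedOperators() {
        return Set.of("FILTER", "PROJECT");
    }

    @Override
    public SketchEngineBridge bridge() {
        return fragment -> "noop:" + fragment;
    }
}
```

Selecting by declared capabilities means the hub never needs to know which back-end is installed; a plan that needs an unsupported operator simply yields an empty result that the caller can handle.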
Back-ends will implement AnalyticsBackEndPlugin and provide it to the engine via SPI. For now, that interface defines a back-end's capabilities in terms of supported Calcite operators/functions, and a path to execution via an EngineBridge. These common types are kept in a sandbox lib.
Related Issues
Related issues:
#19902
#18416
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.