initial commit of analytics engine plugin to sandbox #20697
Merged
Changes from 16 commits
Commits
23 commits
51d6eed initial commit of extensible query engine plugin to sandbox (mch2)
5fe1320 clean up build.gradle - update forbidden-dependencies to skip guava c… (mch2)
e52d6fd Rename plugin interfaces and default implementations. (mch2)
ee0c068 refactor to plugin-plugin SPI (mch2)
4bd148b add readmes and start some clean up. (mch2)
53cbbb8 analyzer errors (mch2)
d7185ac move fe plugin into analytics plugin for testing only, we will use s… (mch2)
63b9e83 spotless (mch2)
169ddc7 clean up (mch2)
e14f80e more clean up (mch2)
1ca8808 fixing analyzer issues (mch2)
4782c74 fix javadoc (mch2)
f266e74 fix guava forbidden check (mch2)
2f106ac fix license check (mch2)
e17e433 fix javadoc warning on transitive dependency. (mch2)
47567de clean up build.gradle and fix weird javadoc issues with dependencies. (mch2)
478343f fix calcite/guava dependencies (mch2)
a81ce24 fix package name (mch2)
8a92746 remove EngineCapabilities, just use calcite's sqloperatortable. (mch2)
e16a4ad simplify unified IT to use params (mch2)
efbaef5 fix guava NOTICE file to exactly match the file from grpc (mch2)
e1e7bd0 javadoc fix (mch2)
90e55df Update sandbox/plugins/analytics-engine/README.md (mch2)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # analytics-framework | ||
|
|
||
| Shared library containing the SPI interfaces and core types for the analytics engine. All plugins depend on this library — it defines the contracts but contains no implementation logic. | ||
|
|
||
| ## SPI Interfaces | ||
|
|
||
| - **`QueryPlanExecutorPlugin`** — Factory for creating a `QueryPlanExecutor` from discovered back-end plugins. | ||
| - **`AnalyticsBackEndPlugin`** — Extension point for native execution engines (DataFusion, Lucene, etc.). Exposes engine name, bridge, and capabilities. | ||
| - **`AnalyticsFrontEndPlugin`** — Marker interface for query language front-ends (PPL, SQL). Discovered by the hub for lifecycle tracking. | ||
| - **`SchemaProvider`** — Functional interface that builds a Calcite `SchemaPlus` from cluster state. | ||
|
|
||
| ## Core Types | ||
|
|
||
| - **`QueryPlanExecutor`** — Executes a Calcite `RelNode` plan fragment and returns result rows. | ||
| - **`EngineBridge<Fragment, Stream, LogicalPlan>`** — JNI/native boundary for engine-specific plan conversion and execution (e.g., Substrait → Arrow batches). | ||
| - **`EngineCapabilities`** — Declares supported operators and functions. Used by the push-down planner to decide what gets absorbed into engine-executed boundary nodes vs. what stays in Calcite's in-process execution. | ||
|
|
||
| ## Dependencies | ||
|
|
||
| Calcite and Arrow — no dependency on the OpenSearch server module. |
1 change: 1 addition & 0 deletions
sandbox/libs/analytics-framework/licenses/calcite-core-1.41.0.jar.sha1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| 0dd7b4be638f0cea174f78cc851322b64d813a1e |
2,261 changes: 2,261 additions & 0 deletions
sandbox/libs/analytics-framework/licenses/calcite-core-LICENSE.txt
Large diffs are not rendered by default.
5 changes: 5 additions & 0 deletions
sandbox/libs/analytics-framework/licenses/calcite-core-NOTICE.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| Apache Calcite | ||
| Copyright 2012-2024 The Apache Software Foundation | ||
|
|
||
| This product includes software developed at | ||
| The Apache Software Foundation (http://www.apache.org/). |
57 changes: 57 additions & 0 deletions
...libs/analytics-framework/src/main/java/org/opensearch/analytics/backend/EngineBridge.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| package org.opensearch.analytics.backend; | ||
|
|
||
| /** | ||
| * JNI boundary interface between the query planner (Java) and a native | ||
| * execution engine (e.g., DataFusion/Rust). | ||
| * | ||
| * <p>The bridge has two responsibilities: | ||
| * <ol> | ||
| * <li>{@link #convertFragment} — serialise a logical plan fragment into | ||
| * the engine's wire format (e.g., Substrait bytes).</li> | ||
| * <li>{@link #execute} — hand the serialised plan to the native engine | ||
| * and obtain an opaque handle to the result stream that lives | ||
| * entirely in native memory.</li> | ||
| * </ol> | ||
| * | ||
| * <p>Arrow data never crosses the JNI boundary into the JVM heap. | ||
| * Consumers read from the native stream via Arrow Flight or | ||
| * direct native-memory access using the returned handle. | ||
| * | ||
| * @param <Fragment> serialised plan type (e.g., {@code byte[]} for Substrait) | ||
| * @param <Stream> result stream handle | ||
| * @param <LogicalPlan> logical plan type (e.g., Calcite {@code RelNode}) | ||
| * @opensearch.internal | ||
| */ | ||
| public interface EngineBridge<Fragment, Stream, LogicalPlan> { | ||
|
|
||
| /** | ||
| * Converts a logical plan fragment into the native engine's serialised | ||
| * format. | ||
| * | ||
| * @param fragment the logical plan subtree to serialise | ||
| * @return the serialised plan in the engine's wire format | ||
| */ | ||
| Fragment convertFragment(LogicalPlan fragment); | ||
|
|
||
| /** | ||
| * Submits the serialised plan to the native engine for execution and | ||
| * returns an opaque handle to the result stream. | ||
| * | ||
| * <p>The returned handle is a pointer into native memory (e.g., a | ||
| * {@code long} address of a Rust {@code RecordBatchStream}). The | ||
| * caller must eventually close the stream through a corresponding | ||
| * native call to avoid leaking resources. | ||
| * | ||
| * @param fragment the serialised plan produced by {@link #convertFragment} | ||
| * @return an opaque handle to the native result stream | ||
| */ | ||
| Stream execute(Fragment fragment); | ||
| } |
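
To make the two-step contract above concrete, here is a minimal sketch of a bridge implementation that uses only the JDK. Everything apart from the `EngineBridge` signature is an assumption for illustration: `InMemoryBridge`, its `close` and `openStreamCount` methods, the use of a `String` as the "logical plan", and the `Long` handle standing in for a native pointer are all hypothetical. The interface is redeclared locally so the snippet compiles on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Local copy of the interface from this diff so the sketch is self-contained.
interface EngineBridge<Fragment, Stream, LogicalPlan> {
    Fragment convertFragment(LogicalPlan fragment);

    Stream execute(Fragment fragment);
}

// Hypothetical bridge: String plans, byte[] wire format, Long handles.
// A real bridge would cross JNI; this one only tracks handles in a map.
final class InMemoryBridge implements EngineBridge<byte[], Long, String> {
    private final AtomicLong nextHandle = new AtomicLong(1);
    private final Map<Long, byte[]> openStreams = new ConcurrentHashMap<>();

    @Override
    public byte[] convertFragment(String fragment) {
        // Stand-in for plan serialisation (e.g., to Substrait bytes).
        return fragment.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public Long execute(byte[] fragment) {
        // Stand-in for the native call: register a stream and return its handle.
        long handle = nextHandle.getAndIncrement();
        openStreams.put(handle, fragment.clone());
        return handle;
    }

    // Hypothetical release call; the interface itself leaves closing to a native call.
    boolean close(long handle) {
        return openStreams.remove(handle) != null;
    }

    int openStreamCount() {
        return openStreams.size();
    }
}
```

The handle bookkeeping illustrates the Javadoc's warning about leaks: every handle returned by `execute` stays registered until something releases it, and the interface as written gives callers no typed way to do that.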
112 changes: 112 additions & 0 deletions
...nalytics-framework/src/main/java/org/opensearch/analytics/backend/EngineCapabilities.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,112 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| package org.opensearch.analytics.backend; | ||
|
|
||
| import org.apache.calcite.rel.RelNode; | ||
| import org.apache.calcite.rel.core.AggregateCall; | ||
| import org.apache.calcite.rel.logical.LogicalAggregate; | ||
| import org.apache.calcite.rel.logical.LogicalFilter; | ||
| import org.apache.calcite.rel.logical.LogicalSort; | ||
| import org.apache.calcite.rel.logical.LogicalTableScan; | ||
| import org.apache.calcite.rex.RexCall; | ||
| import org.apache.calcite.rex.RexNode; | ||
| import org.apache.calcite.rex.RexVisitorImpl; | ||
| import org.apache.calcite.sql.SqlOperator; | ||
| import org.apache.calcite.sql.fun.SqlStdOperatorTable; | ||
|
|
||
| import java.util.HashSet; | ||
| import java.util.List; | ||
| import java.util.Set; | ||
|
|
||
| /** | ||
| * Declares what the custom engine supports using Calcite's own types. | ||
| */ | ||
| public class EngineCapabilities { | ||
|
|
||
| private final Set<Class<? extends RelNode>> supportedOperators; | ||
| private final Set<SqlOperator> supportedFunctions; | ||
|
|
||
| /** | ||
| * Creates capabilities from explicit operator and function sets. | ||
| * | ||
| * @param supportedOperators relational operator classes the engine can execute | ||
| * @param supportedFunctions scalar and aggregate functions the engine supports | ||
| */ | ||
| public EngineCapabilities(Set<Class<? extends RelNode>> supportedOperators, Set<SqlOperator> supportedFunctions) { | ||
| this.supportedOperators = Set.copyOf(supportedOperators); | ||
| this.supportedFunctions = Set.copyOf(supportedFunctions); | ||
| } | ||
|
|
||
| /** Returns capabilities covering four Calcite logical operators (table scan, filter, aggregate, sort) and every operator registered in {@link SqlStdOperatorTable}. */ | ||
| public static EngineCapabilities defaultCapabilities() { | ||
| return new EngineCapabilities( | ||
| Set.of(LogicalTableScan.class, LogicalFilter.class, LogicalAggregate.class, LogicalSort.class), | ||
| new HashSet<>(SqlStdOperatorTable.instance().getOperatorList()) | ||
| ); | ||
| } | ||
|
|
||
| /** | ||
| * Returns {@code true} if the engine can execute the given relational operator. | ||
| * | ||
| * @param node the relational operator to check | ||
| */ | ||
| public boolean supportsOperator(RelNode node) { | ||
| return supportedOperators.contains(node.getClass()); | ||
| } | ||
|
|
||
| /** | ||
| * Returns {@code true} if every scalar function in the expression tree is supported. | ||
| * | ||
| * @param expression the row expression tree to check | ||
| */ | ||
| public boolean supportsAllFunctions(RexNode expression) { | ||
| if (expression == null) { | ||
| return true; | ||
| } | ||
| Boolean result = expression.accept(new FunctionSupportVisitor()); | ||
| return result == null || result; | ||
| } | ||
|
|
||
| private class FunctionSupportVisitor extends RexVisitorImpl<Boolean> { | ||
| FunctionSupportVisitor() { | ||
| super(true); | ||
| } | ||
|
|
||
| @Override | ||
| public Boolean visitCall(RexCall call) { | ||
| if (!supportedFunctions.contains(call.getOperator())) { | ||
| return false; | ||
| } | ||
| for (RexNode operand : call.getOperands()) { | ||
| Boolean childResult = operand.accept(this); | ||
| if (childResult != null && !childResult) { | ||
| return false; | ||
| } | ||
| } | ||
| return true; | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Returns {@code true} if every aggregate function in the list is supported. | ||
| * | ||
| * @param aggCalls the aggregate calls to check | ||
| */ | ||
| public boolean supportsAllAggFunctions(List<AggregateCall> aggCalls) { | ||
| if (aggCalls == null || aggCalls.isEmpty()) { | ||
| return true; | ||
| } | ||
| for (AggregateCall aggCall : aggCalls) { | ||
| if (!supportedFunctions.contains(aggCall.getAggregation())) { | ||
| return false; | ||
| } | ||
| } | ||
| return true; | ||
| } | ||
| } | ||
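
One behaviour of `supportsOperator` worth spelling out: it looks up `node.getClass()` in the set, so only the exact classes listed are accepted, and any other class is rejected even when it extends a listed one. The sketch below contrasts that with an `isInstance` check. It uses JDK-only stand-ins rather than Calcite types; `PlanNode`, `FilterNode`, `EngineFilterNode` and `OperatorSupport` are hypothetical names, not part of this PR.

```java
import java.util.Set;

// Hypothetical stand-ins for a RelNode-like hierarchy (no Calcite dependency).
class PlanNode {}

class FilterNode extends PlanNode {}

class EngineFilterNode extends FilterNode {}

final class OperatorSupport {
    private final Set<Class<? extends PlanNode>> supported;

    OperatorSupport(Set<Class<? extends PlanNode>> supported) {
        this.supported = Set.copyOf(supported);
    }

    // Mirrors supportsOperator in the diff: exact runtime-class lookup.
    boolean supportsExact(PlanNode node) {
        return supported.contains(node.getClass());
    }

    // Alternative: accept any instance of a supported class, including subclasses.
    boolean supportsAssignable(PlanNode node) {
        for (Class<? extends PlanNode> type : supported) {
            if (type.isInstance(node)) {
                return true;
            }
        }
        return false;
    }
}
```

Whether the exact match is the right choice depends on which node classes the planner passes in; the sketch only shows that the two checks diverge once a subclass appears.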
12 changes: 12 additions & 0 deletions
...libs/analytics-framework/src/main/java/org/opensearch/analytics/backend/package-info.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| /** | ||
| * Back-end engine abstractions: bridge interface and capability declarations. | ||
| */ | ||
| package org.opensearch.analytics.backend; |
27 changes: 27 additions & 0 deletions
...bs/analytics-framework/src/main/java/org/opensearch/analytics/exec/QueryPlanExecutor.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| package org.opensearch.analytics.exec; | ||
|
|
||
| /** | ||
| * Executes a logical query plan fragment against the underlying data store. | ||
| * | ||
| * @opensearch.internal | ||
| */ | ||
| @FunctionalInterface | ||
| public interface QueryPlanExecutor<LogicalPlan, Stream> { | ||
|
|
||
| /** | ||
| * Executes the given logical fragment and returns the engine's result stream. | ||
| * | ||
| * @param plan the logical subtree to execute | ||
| * @param context execution context (opaque Object to avoid server dependency) | ||
| * @return the result stream produced by the engine | ||
| */ | ||
| Stream execute(LogicalPlan plan, Object context); | ||
| } |
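
Because `QueryPlanExecutor` is a functional interface with an opaque `Object` context, an implementation can be a lambda but has to check the context type at runtime. A minimal sketch follows, assuming a toy setup in which the "plan" is a table name and the context is a `Map` from table name to rows; `ExecutorSketch` and `inMemory` are hypothetical names, and the interface is redeclared locally so the snippet stands alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Local copy of the interface from this diff so the sketch is self-contained.
@FunctionalInterface
interface QueryPlanExecutor<LogicalPlan, Stream> {
    Stream execute(LogicalPlan plan, Object context);
}

final class ExecutorSketch {
    // Hypothetical executor: looks the plan up as a table name in a Map context.
    static QueryPlanExecutor<String, List<Object[]>> inMemory() {
        return (plan, context) -> {
            // The context is an opaque Object, so its type must be verified here.
            if (!(context instanceof Map)) {
                throw new IllegalArgumentException("context must be a Map of table name to rows");
            }
            Object rows = ((Map<?, ?>) context).get(plan);
            List<Object[]> result = new ArrayList<>();
            if (rows instanceof List) {
                for (Object row : (List<?>) rows) {
                    if (row instanceof Object[]) {
                        result.add((Object[]) row);
                    }
                }
            }
            return result;
        };
    }
}
```

The runtime checks are the cost of keeping the library free of a server dependency: a mismatched context surfaces as an exception at execution time rather than as a compile error.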
12 changes: 12 additions & 0 deletions
...ox/libs/analytics-framework/src/main/java/org/opensearch/analytics/exec/package-info.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| /** | ||
| * Query plan execution interfaces. | ||
| */ | ||
| package org.opensearch.analytics.exec; |
29 changes: 29 additions & 0 deletions
...ibs/analytics-framework/src/main/java/org/opensearch/analytics/schema/SchemaProvider.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| package org.opensearch.analytics.schema; | ||
|
|
||
| import org.apache.calcite.schema.SchemaPlus; | ||
|
|
||
| /** | ||
| * Provides a Calcite {@link SchemaPlus} from the current cluster state. | ||
| * | ||
| * @opensearch.internal | ||
| */ | ||
| @FunctionalInterface | ||
| public interface SchemaProvider { | ||
|
|
||
| /** | ||
| * Builds a Calcite {@link SchemaPlus} from the given cluster state. | ||
| * | ||
| * @param clusterState the current cluster state (opaque Object to avoid | ||
| * server dependency in the library) | ||
| * @return a SchemaPlus with tables derived from index mappings | ||
| */ | ||
| SchemaPlus buildSchema(Object clusterState); | ||
| } |
12 changes: 12 additions & 0 deletions
.../libs/analytics-framework/src/main/java/org/opensearch/analytics/schema/package-info.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| /** | ||
| * Schema construction from OpenSearch cluster metadata. | ||
| */ | ||
| package org.opensearch.analytics.schema; |
27 changes: 27 additions & 0 deletions
...nalytics-framework/src/main/java/org/opensearch/analytics/spi/AnalyticsBackEndPlugin.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| package org.opensearch.analytics.spi; | ||
|
|
||
| import org.opensearch.analytics.backend.EngineBridge; | ||
| import org.opensearch.analytics.backend.EngineCapabilities; | ||
|
|
||
| /** | ||
| * SPI extension point for back-end query engines (DataFusion, Lucene, etc.). | ||
| * @opensearch.internal | ||
| */ | ||
| public interface AnalyticsBackEndPlugin { | ||
| /** Unique engine name (e.g., "lucene", "datafusion"). */ | ||
| String name(); | ||
|
|
||
| /** JNI boundary for executing serialized plans, or null for engines without native execution. */ | ||
| EngineBridge<?, ?, ?> bridge(); | ||
|
|
||
| /** Engine capabilities describing supported operators/functions, or null. */ | ||
| EngineCapabilities getEngineCapabilities(); | ||
| } | ||
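
Since `name()` is documented as unique and both `bridge()` and `getEngineCapabilities()` may return null, whatever consumes the discovered back-ends has to handle duplicates and nulls. The sketch below shows one way that could look. It is not the hub's actual logic: `BackEndPlugin` is a trimmed, hypothetical stand-in for `AnalyticsBackEndPlugin` with the bridge typed as `Object`, and `SimpleBackEnd` and `BackEndRegistry` are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Trimmed, hypothetical stand-in for AnalyticsBackEndPlugin (JDK-only).
interface BackEndPlugin {
    String name();

    Object bridge(); // may be null for engines without native execution
}

final class SimpleBackEnd implements BackEndPlugin {
    private final String name;
    private final Object bridge;

    SimpleBackEnd(String name, Object bridge) {
        this.name = name;
        this.bridge = bridge;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public Object bridge() {
        return bridge;
    }
}

final class BackEndRegistry {
    private final Map<String, BackEndPlugin> byName = new LinkedHashMap<>();

    BackEndRegistry(List<? extends BackEndPlugin> discovered) {
        for (BackEndPlugin plugin : discovered) {
            // Enforce the "unique engine name" contract at registration time.
            if (byName.putIfAbsent(plugin.name(), plugin) != null) {
                throw new IllegalStateException("duplicate back-end name: " + plugin.name());
            }
        }
    }

    // Empty when the engine is unknown or has no native bridge.
    Optional<Object> nativeBridge(String name) {
        BackEndPlugin plugin = byName.get(name);
        return plugin == null ? Optional.empty() : Optional.ofNullable(plugin.bridge());
    }
}
```

Wrapping the nullable accessor in `Optional` at the registry boundary keeps null checks out of routing code; returning `Optional` from the SPI itself would be another way to get the same effect.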
12 changes: 12 additions & 0 deletions
...box/libs/analytics-framework/src/main/java/org/opensearch/analytics/spi/package-info.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| /** | ||
| * SPI extension points for analytics back-end and front-end plugins. | ||
| */ | ||
| package org.opensearch.analytics.spi; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # analytics-backend-datafusion | ||
|
|
||
| DataFusion native execution engine plugin. Implements `AnalyticsBackEndPlugin` to provide a back-end that can execute query plan fragments via JNI. | ||
|
|
||
| ## What it does | ||
|
|
||
| Exposes a `DataFusionBridge` (`EngineBridge<byte[]>`) that converts Calcite `RelNode` fragments into a serialized plan format and executes them through a native Rust/DataFusion library. Currently a stub. | ||
|
|
||
| ## How it fits in | ||
|
|
||
| Declares `extendedPlugins = ['analytics-engine']` so the hub discovers it as an `AnalyticsBackEndPlugin`. The hub passes all discovered back-ends to the `QueryPlanExecutorPlugin` during executor creation. The executor will eventually use the bridge and capabilities to route plan fragments to the appropriate engine. | ||
|
|
||
| ## Key classes | ||
|
|
||
| - **`DataFusionPlugin`** — The `AnalyticsBackEndPlugin` SPI implementation. Reports `name() = "datafusion"`. | ||
| - **`DataFusionBridge`** — The `EngineBridge<byte[]>` implementation for native execution. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| /* | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| * | ||
| * The OpenSearch Contributors require contributions made to | ||
| * this file be licensed under the Apache-2.0 license or a | ||
| * compatible open source license. | ||
| */ | ||
|
|
||
| opensearchplugin { | ||
| description = 'DataFusion native execution engine plugin for the query engine.' | ||
| classname = 'org.opensearch.be.datafusion.DataFusionPlugin' | ||
| extendedPlugins = ['analytics-engine'] | ||
| } | ||
|
|
||
| dependencies { | ||
| // Shared types and SPI interfaces (EngineBridge, EngineCapabilities, AnalyticsBackEndPlugin, etc.) | ||
| // Also provides calcite-core transitively via api. | ||
| api project(':sandbox:libs:analytics-framework') | ||
| } | ||
|
|
||
| // TODO: Remove once back-end is built out with test suite | ||
| testingConventions.enabled = false |