Commit 6a31779

mch2 and andrross authored and committed
initial commit of analytics engine plugin to sandbox (#20697)

* initial commit of extensible query engine plugin to sandbox
* clean up build.gradle - update forbidden-dependencies to skip guava check in sandbox plugins, calcite requires this dependency at compile time
* Rename plugin interfaces and default implementations. Wire up a ppl front-end using UnifiedQueryAPI from sql plugin.
* refactor to plugin-plugin SPI
* add readmes and start some clean up.
* analyzer errors
* move fe plugin into analytics plugin for testing only, we will use sql plugin. also remove "hub" plugin.
* spotless
* clean up
* more clean up
* fixing analyzer issues
* fix javadoc
* fix guava forbidden check
* fix license check
* fix javadoc warning on transitive dependency.
* clean up build.gradle and fix weird javadoc issues with dependencies.
* fix calcite/guava dependencies
* fix package name
* remove EngineCapabilities, just use calcite's sqloperatortable. wraps this and schema in an engineContext provided to front-ends
* simplify unified IT to use params
* fix guava NOTICE file to exactly match the file from grpc
* javadoc fix
* Update sandbox/plugins/analytics-engine/README.md

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: shayush622 <ayush5267@gmail.com>
1 parent 32de70e commit 6a31779

131 files changed: +13557 -18 lines changed

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
# analytics-framework

Shared library containing the SPI interfaces and core types for the analytics engine. All plugins depend on this library — it defines the contracts but contains no implementation logic.

## SPI Interfaces

- **`QueryPlanExecutorPlugin`** — Factory for creating a `QueryPlanExecutor` from discovered back-end plugins.
- **`AnalyticsBackEndPlugin`** — Extension point for native execution engines (DataFusion, Lucene, etc.). Exposes engine name, bridge, and capabilities.
- **`AnalyticsFrontEndPlugin`** — Marker interface for query language front-ends (PPL, SQL). Discovered by the hub for lifecycle tracking.
- **`SchemaProvider`** — Functional interface that builds a Calcite `SchemaPlus` from cluster state.

## Core Types

- **`QueryPlanExecutor`** — Executes a Calcite `RelNode` plan fragment and returns result rows.
- **`EngineBridge<T>`** — JNI/native boundary for engine-specific plan conversion and execution (e.g., Substrait → Arrow batches).
- **`AnalyticsEngineContext`** — Provides schema and aggregated operator table to front-ends for parsing and validation.

## Dependencies

Calcite and Arrow — no dependency on the OpenSearch server module.
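
To make the README's contracts a little more concrete, here is a minimal Java sketch of how a factory, a back-end and a bridge might fit together. It is not the code from this commit: the README names the types, but every method signature below is an assumption. `PlanFragment` is a hypothetical stand-in for Calcite's `RelNode` so the sketch compiles without Calcite on the classpath, and `EchoBackEnd`, `FirstBackEndExecutorPlugin` and `SpiSketch` are invented for illustration. Capabilities are left out because their shape is not described.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for Calcite's RelNode, so the sketch compiles without Calcite. */
final class PlanFragment {
    final String text;
    PlanFragment(String text) { this.text = text; }
}

/** Assumed shape: executes a plan fragment and returns result rows. */
interface QueryPlanExecutor {
    List<Object[]> execute(PlanFragment fragment);
}

/** Assumed shape: converts a fragment to an engine-native plan of type T, then runs it. */
interface EngineBridge<T> {
    T convert(PlanFragment fragment);
    List<Object[]> run(T nativePlan);
}

/** Assumed shape: a back-end exposes its engine name and its bridge. */
interface AnalyticsBackEndPlugin {
    String name();
    EngineBridge<?> bridge();
}

/** Assumed shape: builds an executor from the back-ends that were discovered. */
interface QueryPlanExecutorPlugin {
    QueryPlanExecutor createExecutor(List<AnalyticsBackEndPlugin> backEnds);
}

/** Toy back-end: its "native plan" is just the fragment text in upper case. */
final class EchoBackEnd implements AnalyticsBackEndPlugin {
    @Override public String name() { return "echo"; }

    @Override public EngineBridge<?> bridge() {
        return new EngineBridge<String>() {
            @Override public String convert(PlanFragment fragment) {
                return fragment.text.toUpperCase(Locale.ROOT);
            }
            @Override public List<Object[]> run(String nativePlan) {
                // One row with one column holding the "native plan".
                return Collections.singletonList(new Object[] { nativePlan });
            }
        };
    }
}

/** Toy factory: delegates every fragment to the first discovered back-end. */
final class FirstBackEndExecutorPlugin implements QueryPlanExecutorPlugin {
    @Override public QueryPlanExecutor createExecutor(List<AnalyticsBackEndPlugin> backEnds) {
        if (backEnds.isEmpty()) {
            throw new IllegalStateException("no back-end plugins discovered");
        }
        AnalyticsBackEndPlugin chosen = backEnds.get(0);
        return fragment -> convertAndRun(chosen.bridge(), fragment);
    }

    // Generic helper so the wildcard type returned by bridge() is captured as T.
    private static <T> List<Object[]> convertAndRun(EngineBridge<T> bridge, PlanFragment fragment) {
        return bridge.run(bridge.convert(fragment));
    }
}

final class SpiSketch {
    static List<Object[]> demo() {
        QueryPlanExecutor executor = new FirstBackEndExecutorPlugin()
            .createExecutor(Collections.singletonList(new EchoBackEnd()));
        return executor.execute(new PlanFragment("scan logs"));
    }

    public static void main(String[] args) {
        System.out.println(demo().get(0)[0]); // prints "SCAN LOGS"
    }
}
```

The point of the split is that the caller only ever sees `QueryPlanExecutor`. The engine-specific plan type `T` stays hidden behind the bridge, which is one plausible reading of why the README describes `EngineBridge<T>` as the native boundary.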
Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
/*
 * SPDX-License-Identifier: Apache-2.0
 *
 * The OpenSearch Contributors require contributions made to
 * this file be licensed under the Apache-2.0 license or a
 * compatible open source license.
 */

/*
 * Shared Calcite/Arrow types for the analytics engine plugins.
 * Contains EngineBridge, QueryPlanExecutor, AnalyticsEngineContext.
 * Plugins depend on this; the /modules SPI layer does NOT.
 */

def calciteVersion = '1.41.0'

dependencies {
  api "org.apache.calcite:calcite-core:${calciteVersion}"
  // Calcite's expression tree and Enumerable runtime — required by calcite-core API
  api "org.apache.calcite:calcite-linq4j:${calciteVersion}"
  // Calcite's JDBC abstraction layer — required by calcite-core internals
  runtimeOnly 'org.apache.calcite.avatica:avatica-core:1.27.0'
  // Guava — required by Calcite internally, forbidden on compile classpaths by OpenSearch policy
  runtimeOnly "com.google.guava:guava:${versions.guava}"
  runtimeOnly 'com.google.guava:failureaccess:1.0.2'
  // SLF4J — Calcite's logging facade
  runtimeOnly "org.slf4j:slf4j-api:${versions.slf4j}"

  // Calcite bytecode references annotations from apiguardian (@API) and
  // checker-framework (@EnsuresNonNullIf). compileOnlyApi propagates to
  // consumers' compile/javadoc classpath without becoming a runtime dep.
  compileOnlyApi 'org.apiguardian:apiguardian-api:1.1.2'
  compileOnlyApi 'org.checkerframework:checker-qual:3.43.0'
}

testingConventions.enabled = false

// analytics-framework does not depend on server
tasks.named('forbiddenApisMain').configure {
  replaceSignatureFiles 'jdk-signatures'
  failOnMissingClasses = false
  ignoreSignaturesOfMissingClasses = true
}

// Calcite-core's optional runtime-scope transitive deps are not on the classpath.
// These are features of Calcite we don't use (spatial, JSON path, JDBC pooling, etc.).
// Split into multiple calls to stay under the JVM method parameter limit.
tasks.named('thirdPartyAudit').configure {
ignoreMissingClasses(
50+
// Jackson (optional JSON serialization in Calcite)
51+
'com.fasterxml.jackson.core.JsonParser$Feature',
52+
'com.fasterxml.jackson.core.PrettyPrinter',
53+
'com.fasterxml.jackson.core.type.TypeReference',
54+
'com.fasterxml.jackson.core.util.DefaultIndenter',
55+
'com.fasterxml.jackson.core.util.DefaultPrettyPrinter',
56+
'com.fasterxml.jackson.core.util.Separators',
57+
'com.fasterxml.jackson.core.util.Separators$Spacing',
58+
'com.fasterxml.jackson.databind.DeserializationFeature',
59+
'com.fasterxml.jackson.databind.ObjectMapper',
60+
'com.fasterxml.jackson.databind.ObjectWriter',
61+
62+
// Protobuf (Avatica RPC serialization, not used)
63+
'com.google.protobuf.AbstractMessageLite$Builder',
64+
'com.google.protobuf.AbstractParser',
65+
'com.google.protobuf.ByteString',
66+
'com.google.protobuf.CodedInputStream',
67+
'com.google.protobuf.CodedOutputStream',
68+
'com.google.protobuf.Descriptors$Descriptor',
69+
'com.google.protobuf.Descriptors$EnumDescriptor',
70+
'com.google.protobuf.Descriptors$EnumValueDescriptor',
71+
'com.google.protobuf.Descriptors$FieldDescriptor',
72+
'com.google.protobuf.Descriptors$FileDescriptor',
73+
'com.google.protobuf.Descriptors$OneofDescriptor',
74+
'com.google.protobuf.ExtensionRegistry',
75+
'com.google.protobuf.ExtensionRegistryLite',
76+
'com.google.protobuf.GeneratedMessageV3',
77+
'com.google.protobuf.GeneratedMessageV3$Builder',
78+
'com.google.protobuf.GeneratedMessageV3$BuilderParent',
79+
'com.google.protobuf.GeneratedMessageV3$FieldAccessorTable',
80+
'com.google.protobuf.GeneratedMessageV3$UnusedPrivateParameter',
81+
'com.google.protobuf.Internal',
82+
'com.google.protobuf.Internal$EnumLiteMap',
83+
'com.google.protobuf.Internal$IntList',
84+
'com.google.protobuf.Internal$LongList',
85+
'com.google.protobuf.InvalidProtocolBufferException',
86+
'com.google.protobuf.LazyStringArrayList',
87+
'com.google.protobuf.MapEntry',
88+
'com.google.protobuf.MapEntry$Builder',
89+
'com.google.protobuf.MapField',
90+
'com.google.protobuf.MapFieldReflectionAccessor',
91+
'com.google.protobuf.Message',
92+
'com.google.protobuf.MessageOrBuilder',
93+
'com.google.protobuf.Parser',
94+
'com.google.protobuf.ProtocolMessageEnum',
95+
'com.google.protobuf.ProtocolStringList',
96+
'com.google.protobuf.RepeatedFieldBuilderV3',
97+
'com.google.protobuf.SingleFieldBuilderV3',
98+
'com.google.protobuf.TextFormat',
99+
'com.google.protobuf.UninitializedMessageException',
100+
'com.google.protobuf.UnknownFieldSet',
101+
'com.google.protobuf.UnsafeByteOperations',
102+
'com.google.protobuf.WireFormat$FieldType',
103+
104+
// Uzaygezen (optional Hilbert curve spatial indexing)
105+
'com.google.uzaygezen.core.BacktrackingQueryBuilder',
106+
'com.google.uzaygezen.core.BitVector',
107+
'com.google.uzaygezen.core.BitVectorFactories',
108+
'com.google.uzaygezen.core.CompactHilbertCurve',
109+
'com.google.uzaygezen.core.FilteredIndexRange',
110+
'com.google.uzaygezen.core.Query',
111+
'com.google.uzaygezen.core.SimpleRegionInspector',
112+
'com.google.uzaygezen.core.ranges.LongRange',
113+
'com.google.uzaygezen.core.ranges.LongRangeHome',
114+
115+
// JsonPath (optional JSON path support)
116+
'com.jayway.jsonpath.Configuration',
117+
'com.jayway.jsonpath.Configuration$ConfigurationBuilder',
118+
'com.jayway.jsonpath.DocumentContext',
119+
'com.jayway.jsonpath.InvalidPathException',
120+
'com.jayway.jsonpath.JsonPath',
121+
'com.jayway.jsonpath.Option',
122+
'com.jayway.jsonpath.Predicate',
123+
'com.jayway.jsonpath.spi.json.JacksonJsonProvider',
124+
'com.jayway.jsonpath.spi.mapper.MappingProvider',
125+
126+
// Yahoo Sketches (optional approximate distinct counting)
127+
'com.yahoo.sketches.hll.HllSketch',
128+
'com.yahoo.sketches.hll.HllSketchBuilder',
129+
130+
// Avatica metrics (optional metrics subsystem)
131+
'org.apache.calcite.avatica.metrics.MetricsSystem',
132+
'org.apache.calcite.avatica.metrics.Timer',
133+
'org.apache.calcite.avatica.metrics.Timer$Context',
134+
'org.apache.calcite.avatica.metrics.noop.NoopMetricsSystem',
135+
136+
// Apache Commons (optional Calcite features)
137+
'org.apache.commons.codec.binary.Base32',
138+
'org.apache.commons.codec.binary.Hex',
139+
'org.apache.commons.codec.digest.DigestUtils',
140+
'org.apache.commons.codec.language.Soundex',
141+
'org.apache.commons.dbcp2.BasicDataSource',
142+
'org.apache.commons.io.IOUtils',
143+
'org.apache.commons.lang3.StringUtils',
144+
'org.apache.commons.lang3.Strings',
145+
'org.apache.commons.lang3.mutable.MutableBoolean',
146+
'org.apache.commons.math3.fraction.BigFraction',
147+
'org.apache.commons.math3.util.CombinatoricsUtils',
148+
'org.apache.commons.text.StringEscapeUtils',
149+
'org.apache.commons.text.similarity.LevenshteinDistance'
150+
)
151+
152+
ignoreMissingClasses(
    // HttpClient5 (Avatica remote JDBC transport, not used)
    'org.apache.hc.client5.http.SystemDefaultDnsResolver',
    'org.apache.hc.client5.http.auth.AuthScope',
    'org.apache.hc.client5.http.auth.Credentials',
    'org.apache.hc.client5.http.auth.CredentialsProvider',
    'org.apache.hc.client5.http.auth.KerberosConfig',
    'org.apache.hc.client5.http.auth.KerberosConfig$Builder',
    'org.apache.hc.client5.http.auth.UsernamePasswordCredentials',
    'org.apache.hc.client5.http.classic.methods.HttpPost',
    'org.apache.hc.client5.http.config.RequestConfig',
    'org.apache.hc.client5.http.config.RequestConfig$Builder',
    'org.apache.hc.client5.http.impl.auth.BasicAuthCache',
    'org.apache.hc.client5.http.impl.auth.BasicCredentialsProvider',
    'org.apache.hc.client5.http.impl.classic.CloseableHttpClient',
    'org.apache.hc.client5.http.impl.classic.HttpClientBuilder',
    'org.apache.hc.client5.http.impl.classic.HttpClients',
    'org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager',
    'org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManagerBuilder',
    'org.apache.hc.client5.http.protocol.HttpClientContext',
    'org.apache.hc.client5.http.routing.RoutingSupport',
    'org.apache.hc.client5.http.ssl.HttpsSupport',
    'org.apache.hc.client5.http.ssl.NoopHostnameVerifier',
    'org.apache.hc.client5.http.ssl.TlsSocketStrategy',
    'org.apache.hc.core5.http.ClassicHttpResponse',
    'org.apache.hc.core5.http.ContentType',
    'org.apache.hc.core5.http.HttpHost',
    'org.apache.hc.core5.http.config.Lookup',
    'org.apache.hc.core5.http.config.RegistryBuilder',
    'org.apache.hc.core5.http.io.entity.EntityUtils',
    'org.apache.hc.core5.ssl.SSLContextBuilder',
    'org.apache.hc.core5.ssl.SSLContexts',
    'org.apache.hc.core5.util.Timeout',

    // Janino (optional code generation for Enumerable pipeline)
    'org.codehaus.commons.compiler.CompilerFactoryFactory',
    'org.codehaus.commons.compiler.IClassBodyEvaluator',
    'org.codehaus.commons.compiler.ICompilerFactory',
    'org.codehaus.commons.compiler.ISimpleCompiler',
    'org.codehaus.commons.compiler.util.resource.ResourceFinder',
    'org.codehaus.janino.ClassBodyEvaluator',
    'org.codehaus.janino.JavaSourceClassLoader',
    'org.codehaus.janino.util.ClassFile',

    // jOOU (optional unsigned integer types)
    'org.joou.UByte',
    'org.joou.UInteger',
    'org.joou.ULong',
    'org.joou.UShort',
    'org.joou.Unsigned',

    // JTS / Proj4j (optional spatial/geometry support)
    'org.locationtech.jts.algorithm.InteriorPoint',
    'org.locationtech.jts.algorithm.LineIntersector',
    'org.locationtech.jts.algorithm.MinimumBoundingCircle',
    'org.locationtech.jts.algorithm.MinimumDiameter',
    'org.locationtech.jts.densify.Densifier',
    'org.locationtech.jts.geom.Coordinate',
    'org.locationtech.jts.geom.CoordinateSequence',
    'org.locationtech.jts.geom.CoordinateSequenceFactory',
    'org.locationtech.jts.geom.Envelope',
    'org.locationtech.jts.geom.Geometry',
    'org.locationtech.jts.geom.GeometryCollection',
    'org.locationtech.jts.geom.GeometryFactory',
    'org.locationtech.jts.geom.GeometryFilter',
    'org.locationtech.jts.geom.IntersectionMatrix',
    'org.locationtech.jts.geom.LineSegment',
    'org.locationtech.jts.geom.LineString',
    'org.locationtech.jts.geom.LinearRing',
    'org.locationtech.jts.geom.MultiLineString',
    'org.locationtech.jts.geom.MultiPoint',
    'org.locationtech.jts.geom.MultiPolygon',
    'org.locationtech.jts.geom.OctagonalEnvelope',
    'org.locationtech.jts.geom.Point',
    'org.locationtech.jts.geom.Polygon',
    'org.locationtech.jts.geom.util.AffineTransformation',
    'org.locationtech.jts.geom.util.GeometryEditor',
    'org.locationtech.jts.geom.util.GeometryEditor$CoordinateOperation',
    'org.locationtech.jts.geom.util.GeometryFixer',
    'org.locationtech.jts.geom.util.GeometryTransformer',
    'org.locationtech.jts.geom.util.LineStringExtracter',
    'org.locationtech.jts.io.WKBReader',
    'org.locationtech.jts.io.WKBWriter',
    'org.locationtech.jts.io.WKTReader',
    'org.locationtech.jts.io.WKTWriter',
    'org.locationtech.jts.io.geojson.GeoJsonReader',
    'org.locationtech.jts.io.geojson.GeoJsonWriter',
    'org.locationtech.jts.io.gml2.GMLReader',
    'org.locationtech.jts.io.gml2.GMLWriter',
    'org.locationtech.jts.linearref.LengthIndexedLine',
    'org.locationtech.jts.operation.buffer.BufferOp',
    'org.locationtech.jts.operation.buffer.BufferParameters',
    'org.locationtech.jts.operation.buffer.OffsetCurve',
    'org.locationtech.jts.operation.distance.DistanceOp',
    'org.locationtech.jts.operation.linemerge.LineMerger',
    'org.locationtech.jts.operation.overlay.snap.GeometrySnapper',
    'org.locationtech.jts.operation.polygonize.Polygonizer',
    'org.locationtech.jts.operation.union.UnaryUnionOp',
    'org.locationtech.jts.precision.GeometryPrecisionReducer',
    'org.locationtech.jts.simplify.DouglasPeuckerSimplifier',
    'org.locationtech.jts.simplify.TopologyPreservingSimplifier',
    'org.locationtech.jts.triangulate.DelaunayTriangulationBuilder',
    'org.locationtech.jts.triangulate.polygon.ConstrainedDelaunayTriangulator',
    'org.locationtech.jts.triangulate.quadedge.QuadEdgeSubdivision',
    'org.locationtech.jts.triangulate.tri.Tri',
    'org.locationtech.jts.util.GeometricShapeFactory',
    'org.locationtech.proj4j.CRSFactory',
    'org.locationtech.proj4j.CoordinateReferenceSystem',
    'org.locationtech.proj4j.CoordinateTransform',
    'org.locationtech.proj4j.CoordinateTransformFactory',
    'org.locationtech.proj4j.ProjCoordinate',
    'org.locationtech.proj4j.proj.Projection',

    // Pentaho (optional aggregate designer)
    'org.pentaho.aggdes.algorithm.Algorithm',
    'org.pentaho.aggdes.algorithm.Algorithm$ParameterEnum',
    'org.pentaho.aggdes.algorithm.Result',
    'org.pentaho.aggdes.model.Aggregate',
    'org.pentaho.aggdes.model.Attribute',
    'org.pentaho.aggdes.model.Dialect',
    'org.pentaho.aggdes.model.Schema',
    'org.pentaho.aggdes.model.StatisticsProvider',
    'org.pentaho.aggdes.model.Table'
  )

  // Guava internal Unsafe usage — standard for any module depending on Guava
  ignoreViolations(
    'com.google.common.cache.Striped64',
    'com.google.common.cache.Striped64$1',
    'com.google.common.cache.Striped64$Cell',
    'com.google.common.hash.LittleEndianByteArray$UnsafeByteArray',
    'com.google.common.hash.LittleEndianByteArray$UnsafeByteArray$1',
    'com.google.common.hash.LittleEndianByteArray$UnsafeByteArray$2',
    'com.google.common.hash.Striped64',
    'com.google.common.hash.Striped64$1',
    'com.google.common.hash.Striped64$Cell',
    'com.google.common.primitives.UnsignedBytes$LexicographicalComparatorHolder$UnsafeComparator',
    'com.google.common.primitives.UnsignedBytes$LexicographicalComparatorHolder$UnsafeComparator$1',
    'com.google.common.util.concurrent.AbstractFuture$UnsafeAtomicHelper',
    'com.google.common.util.concurrent.AbstractFuture$UnsafeAtomicHelper$1'
  )
}
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
59990a6fd24dc7f398fcb06cdd570a99de7a7c4f
