Unify PPL execution in Spark via unified query API #1313

Merged
dai-chen merged 13 commits into opensearch-project:feature/unified-ppl-spark from dai-chen:add-unified-query-spark-integration-2 on Feb 11, 2026

dai-chen (Collaborator) commented Jan 29, 2026

Description

This PR introduces the unified-query-spark-integration module that bridges OpenSearch's Calcite-based PPL engine with Apache Spark. This enables consistent PPL query behavior across OpenSearch and Spark by leveraging the unified query API for PPL transpilation. See unified-query-spark-integration/README.md for the architecture overview and usage details.

Key pieces included:

  • Custom Spark PPL parser using unified query API: detects PPL input, invokes the unified query planner/transpiler, and forwards the resulting Spark SQL for execution.
  • Spark ↔ Calcite catalog schema bridge: maps Spark’s catalog hierarchy (catalogs, databases, tables) into Calcite schema/table abstractions so the unified query planner can resolve relations against Spark metadata.
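For reviewers who want a mental model of the two pieces above, here is a minimal, self-contained sketch. It is not the code in this PR: `PplAwareParser`, `CatalogTree`, the prefix-based PPL check, and the string-based "plan" are all hypothetical stand-ins, and the real module works against Spark's parser and Calcite's schema abstractions rather than plain functions and maps.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical parser wrapper: PPL is transpiled to SQL first, anything else passes through. */
final class PplAwareParser {
    private final Function<String, String> transpiler;  // stand-in for the unified query transpiler
    private final Function<String, String> sparkParser; // stand-in for Spark's own SQL parser

    PplAwareParser(Function<String, String> transpiler, Function<String, String> sparkParser) {
        this.transpiler = transpiler;
        this.sparkParser = sparkParser;
    }

    /** Assumed heuristic only: treat queries that begin with "source" or "search" as PPL. */
    static boolean looksLikePpl(String query) {
        String q = query.trim().toLowerCase(Locale.ROOT);
        return q.startsWith("source") || q.startsWith("search");
    }

    /** Transpiles PPL to SQL when detected, then hands the SQL to the delegate parser. */
    String parse(String query) {
        String sql = looksLikePpl(query) ? transpiler.apply(query) : query;
        return sparkParser.apply(sql);
    }
}

/** Hypothetical catalog view: catalog -> database -> table -> column names. */
final class CatalogTree {
    private final Map<String, Map<String, Map<String, List<String>>>> metadata;

    CatalogTree(Map<String, Map<String, Map<String, List<String>>>> metadata) {
        this.metadata = metadata;
    }

    /** Resolves "catalog.database.table" one level at a time, as nested schemas would. */
    Optional<List<String>> resolve(String qualifiedName) {
        String[] parts = qualifiedName.split("\\.");
        if (parts.length != 3) {
            return Optional.empty(); // this sketch only handles fully qualified names
        }
        return Optional.ofNullable(metadata.get(parts[0]))
                .map(databases -> databases.get(parts[1]))
                .map(tables -> tables.get(parts[2]));
    }
}
```

The point of the sketch is the shape of the flow: non-PPL input reaches the delegate parser unchanged, and a table reference is resolved by walking catalog, then database, then table.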

Note: This module depends on unified-query-api:2.19.4.0-SNAPSHOT from the OpenSearch SQL project. The dependency will be pinned to a released version once the unified query API is stable.

Related Issues

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • Newly added source code should include a copyright header
  • Commits are signed per the DCO using --signoff
  • Add backport 0.x label if it is a stable change that won't break existing features

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chen Dai <daichen@amazon.com>
dai-chen self-assigned this Jan 29, 2026
dai-chen added the enhancement (New feature or request) label Jan 29, 2026
dai-chen merged commit 7ddbf1e into opensearch-project:feature/unified-ppl-spark Feb 11, 2026
4 of 5 checks passed