Merge feature/calcite-engine to main #3448

Merged
penghuo merged 38 commits into opensearch-project:main from penghuo:calciteEngineMergeV2 on Mar 21, 2025

@penghuo (Collaborator) commented Mar 19, 2025

Description

Merge feature branch to main

Related Issues

n/a

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

LantaoJin and others added 30 commits January 17, 2025 13:58
…er (opensearch-project#3249)

* First commit for Calcite integration

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* disable java security manager in IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
…ject#3258)

* [POC] Make Calcite execute successfully

Signed-off-by: Heng Qian <qianheng@amazon.com>

* [POC] Change caching schema to simple schema and avoid registering table when visitRelation.

Signed-off-by: Heng Qian <qianheng@amazon.com>

* spotlessApply

Signed-off-by: Heng Qian <qianheng@amazon.com>

* address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
* Make basic aggregation working (partial)

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add a settings to enable calcite

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add more UTs

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
opensearch-project#3327)

* Support Filter and Project pushdown

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Support Filter and Project pushdown v2

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add original license for PredicateAnalyzer

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
* Build integration test framework

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* make local work

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix the timestamp issue

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* address comments

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix java style and rename CalcitePPLTestCase back to CalcitePPLIntegTestCase

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
…oject#3355)

* Add more aggregation tests

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* delete irrelevant code

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Transform to calcite plan before executing

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix bug for single column row

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add settings for calcite pushdown

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Lazily construct OpenSearchRequestBuilder and do push down

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments and disable push down

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
* Fix PredicateAnalyzer for in and notIn

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Change text field to keyword since we don't support push down for that type

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
…3376)

* [BugFix] Fix text field push down

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Ignore CalciteSortCommandIT.testSortWithNullValue

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refine code: only get keyword subfield for termQuery builder

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refine code

Signed-off-by: Heng Qian <qianheng@amazon.com>

* remove ignore tests in CalcitePPLInSubqueryIT

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
* add udf/udaf interface and take/sqrt function

Signed-off-by: xinyual <xinyual@amazon.com>

* add UT

Signed-off-by: xinyual <xinyual@amazon.com>

* add POW, Atan, Atan2 and corresponding UT

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

* fix table for join it

Signed-off-by: xinyual <xinyual@amazon.com>

* add java doc

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
…t#3392)

* Implement ppl scalar subquery command with Calcite

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* more general subquery checker

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* support correlated IN subquery

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Change push down to logical index scan

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Support Aggregate Push Down

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Rebase and resolve conflict

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add TODO

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
* add string udfs

Signed-off-by: xinyual <xinyual@amazon.com>

* add it to string

Signed-off-by: xinyual <xinyual@amazon.com>

* add IT for string function

Signed-off-by: xinyual <xinyual@amazon.com>

* remove change for local test

Signed-off-by: xinyual <xinyual@amazon.com>

* revert change

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
…nsearch-project#3405)

* Keep aggregation in Calcite consistent with current PPL behavior

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* remove unrelated code

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* revert some code

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix issue 3404

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add more tests

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* address comments

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add more tests

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Support multiple table and index pattern

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix UT

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
* add condition udfs

Signed-off-by: xinyual <xinyual@amazon.com>

* add IT for conditions and register null table

Signed-off-by: xinyual <xinyual@amazon.com>

* fix it

Signed-off-by: xinyual <xinyual@amazon.com>

* update utils define

Signed-off-by: xinyual <xinyual@amazon.com>

* add condition functions

Signed-off-by: xinyual <xinyual@amazon.com>

* modify IT

Signed-off-by: xinyual <xinyual@amazon.com>

* fix IT

Signed-off-by: xinyual <xinyual@amazon.com>

* revert useless change and add comments

Signed-off-by: xinyual <xinyual@amazon.com>

* reverse typo and apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
* Revert alias change, Fix IT

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Fix spotlessCheck

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Revert development test

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Fix PPL Test

Signed-off-by: Peng Huo <penghuo@gmail.com>

* license header

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Ignore flaky test

Signed-off-by: Peng Huo <penghuo@gmail.com>

---------

Signed-off-by: Peng Huo <penghuo@gmail.com>
* revert result ordering of stats-by

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix CRLF issue

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* only check spark sql

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Implement ppl lookup command with Calcite

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* step 2

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add all lookup IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Support lookup command

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refactor Lookup

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refine Code

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix UT

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add anonymizer for lookup

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refine code

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix UT

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Co-authored-by: Heng Qian <qianheng@amazon.com>
Co-authored-by: Lantao Jin <ltjin@amazon.com>
* Support ppl BETWEEN operation within Calcite

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* add more tests

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Correct the precedence for logical operators

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix flaky test

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
* Implement ppl dedup command with Calcite

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* remove union

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>
@penghuo penghuo changed the title from "Calcite engine merge v2" to "Merge feature/calcite-engine to main" Mar 19, 2025
@penghuo penghuo marked this pull request as ready for review March 19, 2025 22:40
@LantaoJin (Member) commented:

The DCO check fails with a strange summary, for example:

Commit sha: a62d87d, Author: Lantao Jin, Committer: GitHub; The sign-off is missing.

However, commit a62d87d was introduced by https://github.com/opensearch-project/sql/pull/3371/checks, which is signed off.
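
For context, a sign-off check of this kind looks for a `Signed-off-by:` trailer in the commit message that matches the commit's author. The sketch below is only an illustration of that idea and is not the DCO app's actual logic; the class name `DcoSketch` and the method `isSignedOffBy` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: does a commit message carry a sign-off for a given email? */
public final class DcoSketch {
    // Matches trailer lines such as "Signed-off-by: Lantao Jin <ltjin@amazon.com>".
    private static final Pattern TRAILER =
        Pattern.compile("(?m)^Signed-off-by: (.+) <(.+)>\\s*$");

    private DcoSketch() {}

    /** Returns true if any Signed-off-by trailer in the message uses the given email. */
    public static boolean isSignedOffBy(String message, String email) {
        Matcher m = TRAILER.matcher(message);
        while (m.find()) {
            if (m.group(2).equalsIgnoreCase(email)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String msg = "First commit for Calcite integration\n\n"
            + "Signed-off-by: Lantao Jin <ltjin@amazon.com>\n";
        System.out.println(isSignedOffBy(msg, "ltjin@amazon.com")); // true
        System.out.println(isSignedOffBy("no trailer here", "ltjin@amazon.com")); // false
    }
}
```

A check along these lines would pass for a commit whose message carries the trailer, which is why the reported failure looks inconsistent with the signed-off source PR.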

xinyual and others added 2 commits March 20, 2025 10:55
* add math udfs

Signed-off-by: xinyual <xinyual@amazon.com>

* add log argument

Signed-off-by: xinyual <xinyual@amazon.com>

* Add math function unit tests
- Additionally implement user-defined ConvFunction

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add integration tests for Calcite math functions

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Rename CalcitePPLMathFunctionsIT to CalcitePPLBuiltinFunctionIT

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* add license

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spot

Signed-off-by: xinyual <xinyual@amazon.com>

* Update the implementation of CONV function to align with v2's behavior
- Rename UserDefineFunctionUtils to UserDefinedFunctionUtils

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Improve code style:
- enforce uniform parameter number check
- comment on differences from calcite's implementation if necessary

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Simplify Calcite PPL math function unit tests

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Alter MOD and SQRT UDF to conform to documented behaviors
- return null with invalid (zero, negative) arguments
- return wider type for mod

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Complicate math integration tests
- edge cases for UDF
- combine operations or clauses

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Handle NULL return in ASIN, ACOS, SQRT and POW by converting returned Double.NaN and Float.NaN to null

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Apply spotless on math UDFs and their tests

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove unnecessary Double cast in SQRT UDF

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Convert returned Double.NaN and Float.NaN from math UDFs to LITERAL_NULL

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Correct math UDF integration tests
- remove comparison between strings and integers
- correct thrown error types

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Update MOD UDF
- add alias % to MOD
- return negative when the dividend is negative

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Modify substring ITs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* apply spot

Signed-off-by: xinyual <xinyual@amazon.com>

* Replace containsMessage with verifyErrorMessageContains in math ITs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Correct MOD return types
- additionally enrich math ITs with fields calculations

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* fix UT

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Co-authored-by: xinyual <xinyual@amazon.com>
Co-authored-by: Yuanchun Shen <yuanchu@amazon.com>
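
The commit messages above describe the intended null-handling of the math UDFs: SQRT returns null for negative input (NaN is converted to null), and MOD returns null for a zero divisor and keeps the sign of the dividend. A minimal sketch of those semantics follows. It is not the code merged in this PR; the class `MathUdfSketch` and its methods are hypothetical, and the "wider return type" behavior of MOD is not modeled.

```java
/** Hypothetical sketch of the documented SQRT/MOD behaviors; not the merged UDF code. */
public final class MathUdfSketch {
    private MathUdfSketch() {}

    /** SQRT: returns null instead of NaN for negative (or null) input. */
    public static Double sqrt(Double x) {
        if (x == null) {
            return null;
        }
        return nanToNull(Math.sqrt(x));
    }

    /** MOD: null on a zero divisor; the result keeps the sign of the dividend. */
    public static Long mod(Long dividend, Long divisor) {
        if (dividend == null || divisor == null || divisor == 0L) {
            return null;
        }
        // Java's % operator already yields a result with the dividend's sign.
        return dividend % divisor;
    }

    /** Converts a NaN result to null, mirroring the NaN-to-null commits above. */
    static Double nanToNull(double v) {
        return Double.isNaN(v) ? null : Double.valueOf(v);
    }

    public static void main(String[] args) {
        System.out.println(sqrt(-4.0));   // null
        System.out.println(sqrt(9.0));    // 3.0
        System.out.println(mod(-7L, 3L)); // -1
        System.out.println(mod(7L, 0L));  // null
    }
}
```
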
* Fix flaky tests: testSubstring, testPosition, testLike

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Keep generated code from spotless check

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
LantaoJin previously approved these changes Mar 20, 2025
Signed-off-by: Peng Huo <penghuo@gmail.com>
@LantaoJin (Member) left a comment:

LGTM, no squash merging please.

@penghuo penghuo merged commit 32fc251 into opensearch-project:main Mar 21, 2025
20 of 22 checks passed
Labels

calcite (calcite migration related)

7 participants