fix(native): Replace lambda body with optimized expression in NativeExpressionOptimizer #27143

Open
pramodsatya wants to merge 2 commits into prestodb:master from pramodsatya:fix_lambda

@pramodsatya pramodsatya commented Feb 13, 2026

Description

Fixes a bug in NativeExpressionOptimizer.ReplacingVisitor#visitLambda where an optimized lambda returned by the native sidecar is not handled correctly. The visitor currently checks replacement eligibility only on the lambda body and then applies a body-level replacement. However, the complete lambda expression, not just its body, is sent to the sidecar for optimization. As a result, the optimizer skips rewriting lambdas whose bodies were optimized, leaving unoptimized subtrees.
This fix makes the visitor check and replace the entire LambdaDefinitionExpression with a new LambdaDefinitionExpression containing the replacement's body. Any further rewrites can then be applied to the optimized body before the final LambdaDefinitionExpression is constructed (see #27122).
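
The mismatch described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Expr`, `Constant`, `Add`, and `Lambda` are simplified stand-ins rather than Presto's RowExpression classes, and a plain `Map` plays the role of the replacement-eligibility check plus the resolver. The re-visit of the optimized body that the actual fix performs is omitted to keep the sketch short. It requires Java 16+ for records.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the lookup mismatch; not the Presto API.
public class LambdaReplacementSketch
{
    interface Expr {}
    record Constant(long value) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Lambda(List<String> arguments, Expr body) implements Expr {}

    // Old behavior (as described in the PR): check eligibility on the lambda body only.
    static Expr visitLambdaOld(Lambda lambda, Map<Expr, Expr> replacements)
    {
        if (replacements.containsKey(lambda.body())) {
            return new Lambda(lambda.arguments(), replacements.get(lambda.body()));
        }
        return lambda;
    }

    // Fixed behavior: check the whole lambda, then rebuild it around the optimized body.
    static Expr visitLambdaNew(Lambda lambda, Map<Expr, Expr> replacements)
    {
        if (replacements.containsKey(lambda)) {
            Lambda optimized = (Lambda) replacements.get(lambda);
            return new Lambda(lambda.arguments(), optimized.body());
        }
        return lambda;
    }

    public static void main(String[] args)
    {
        Lambda original = new Lambda(List.of("x"), new Add(new Constant(1), new Constant(1)));
        Lambda folded = new Lambda(List.of("x"), new Constant(2));
        // The whole lambda was sent for optimization, so the map is keyed by the whole lambda.
        Map<Expr, Expr> replacements = Map.of(original, folded);

        // The body-level lookup misses, so the lambda is returned unchanged.
        System.out.println(visitLambdaOld(original, replacements));
        // The lambda-level lookup hits, so the folded body is used.
        System.out.println(visitLambdaNew(original, replacements));
    }
}
```

In this model the old visitor never finds a match because the key it looks up (the body) was never collected, which mirrors the skipped rewrite described above.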

Motivation and Context

The native sidecar optimizer returns optimized lambda expressions where only the lambda body is optimized. The ReplacingVisitor logic only checks whether the lambda body can be replaced and then applies the replacement. The CollectingVisitor logic, however, gathers the complete LambdaDefinitionExpression for optimization via the sidecar, not just the lambda's body.

Impact

Not user-facing. This is a correctness fix in NativeExpressionOptimizer that applies at the planning stage for C++ clusters.

Test Plan

Added e2e tests.

== NO RELEASE NOTE ==

Summary by Sourcery

Improve native sidecar expression optimization for lambda expressions by correctly optimizing the lambda body after applying the resolver, and add coverage for lambda constant folding.

Enhancements:

  • Adjust lambda optimization to operate on the resolver-produced lambda expression and propagate the optimized body into the returned lambda definition.

Tests:

  • Add NativeExpressionOptimizer test harness and tests verifying constant folding within lambda bodies in transform expressions.

Summary by Sourcery

Fix native sidecar lambda optimization to operate on the full lambda expression returned by the resolver and add targeted test coverage for lambda constant folding in transform expressions.

Bug Fixes:

  • Correct handling of sidecar-optimized lambda expressions by replacing the entire lambda definition and re-optimizing the returned lambda body.

Build:

  • Add presto-analyzer as a test-scoped dependency for native sidecar expression optimizer tests.

Tests:

  • Introduce a NativeExpressionOptimizer test harness using the sidecar plugin and add tests validating constant folding within lambda bodies in transform() expressions.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 13, 2026
sourcery-ai bot commented Feb 13, 2026

Reviewer's Guide

Adjusts NativeExpressionOptimizer's lambda handling to replace and re-optimize the entire LambdaDefinitionExpression returned by the sidecar, and adds an integration-style test harness plus tests verifying constant folding in lambda bodies, along with a test dependency update.

Sequence diagram for updated lambda optimization in NativeExpressionOptimizer

sequenceDiagram
    participant Planner
    participant NativeExpressionOptimizer
    participant ReplacingVisitor
    participant Resolver
    participant Sidecar

    Planner->>NativeExpressionOptimizer: optimize(rowExpression)
    NativeExpressionOptimizer->>ReplacingVisitor: visitExpression(originalExpression)

    ReplacingVisitor->>ReplacingVisitor: visitLambda(lambda)
    ReplacingVisitor->>ReplacingVisitor: canBeReplaced(lambda)
    alt lambda_can_be_replaced
        ReplacingVisitor->>Resolver: apply(lambda)
        Resolver->>Sidecar: optimize(lambda)
        Sidecar-->>Resolver: optimizedLambda
        Resolver-->>ReplacingVisitor: optimizedLambda

        ReplacingVisitor->>ReplacingVisitor: optimizedBody = optimizedLambda.getBody()
        ReplacingVisitor->>ReplacingVisitor: optimizedBody.accept(this)
        ReplacingVisitor->>ReplacingVisitor: toRowExpression(lambda.sourceLocation, optimizedBody, optimizedBody.type)
        ReplacingVisitor-->>NativeExpressionOptimizer: new LambdaDefinitionExpression(..., optimizedBody)
    else lambda_not_replaced
        ReplacingVisitor-->>NativeExpressionOptimizer: lambda
    end

    NativeExpressionOptimizer-->>Planner: optimizedRowExpression

Class diagram for updated lambda handling in NativeExpressionOptimizer

classDiagram
    class NativeExpressionOptimizer {
        +RowExpression optimize(RowExpression originalExpression)
        -ReplacingVisitor replacingVisitor
    }

    class ReplacingVisitor {
        +RowExpression visitExpression(RowExpression originalExpression, Void context)
        +RowExpression visitLambda(LambdaDefinitionExpression lambda, Void context)
        -boolean canBeReplaced(RowExpression expression)
        -RowExpression toRowExpression(OptionalSourceLocation sourceLocation, RowExpression body, Type type)
        -Function<RowExpression, RowExpression> resolver
    }

    class RowExpression {
        +Type getType()
    }

    class LambdaDefinitionExpression {
        +OptionalSourceLocation getSourceLocation()
        +List<Type> getArgumentTypes()
        +List<VariableReferenceExpression> getArguments()
        +RowExpression getBody()
    }

    class VariableReferenceExpression {
        +String getName()
        +Type getType()
    }

    class Type {
    }

    class OptionalSourceLocation {
    }

    class Function_RowExpression_RowExpression_ {
        +RowExpression apply(RowExpression expression)
    }

    NativeExpressionOptimizer --> ReplacingVisitor : uses
    ReplacingVisitor ..|> RowExpressionVisitor : implements
    ReplacingVisitor --> Function_RowExpression_RowExpression_ : has_resolver
    ReplacingVisitor --> RowExpression : operates_on
    ReplacingVisitor --> LambdaDefinitionExpression : visits
    LambdaDefinitionExpression --> RowExpression : has_body
    LambdaDefinitionExpression --> VariableReferenceExpression : has_arguments
    LambdaDefinitionExpression --> Type : has_argumentTypes
    RowExpression --> Type : has_type
    LambdaDefinitionExpression --> OptionalSourceLocation : has_sourceLocation

File-Level Changes

Change: Fix lambda optimization to operate on the entire lambda expression and recursively optimize the sidecar-returned body before constructing the final LambdaDefinitionExpression.
Details:
  • Change replacement-eligibility check from the lambda body to the full LambdaDefinitionExpression.
  • Invoke resolver on the full lambda and cast the result back to LambdaDefinitionExpression to obtain the optimized body.
  • Re-run the NativeExpressionOptimizer visitor on the optimized lambda body to apply further rewrites before rebuilding the lambda.
  • Construct the new LambdaDefinitionExpression using the original source location, argument types, and arguments, but with the fully optimized body and its resulting type.
Files:
  • presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java

Change: Introduce a NativeExpressionOptimizer test harness and end-to-end tests validating lambda-body constant folding via the native sidecar, plus required test dependency wiring.
Details:
  • Add presto-analyzer as a test-scoped dependency to support SQL parsing and analysis in tests.
  • Create TestNativeExpressionOptimizer that boots a NativeSidecarPluginQueryRunner and wires up metadata, type manager, and JSON/thrift codecs.
  • Add testLambdaBodyConstantFolding to assert that transform() lambda bodies are constant-folded (e.g., arithmetic, casts, and json_parse-based expressions).
  • Provide helper methods to translate SQL to RowExpression, invoke NativeExpressionOptimizer at the OPTIMIZED level, and assert semantic equivalence of optimized expressions.
Files:
  • presto-native-sidecar-plugin/pom.xml
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionOptimizer.java

@pramodsatya pramodsatya force-pushed the fix_lambda branch 2 times, most recently from ff22724 to fbd4872 Compare February 16, 2026 22:05
@pramodsatya pramodsatya marked this pull request as ready for review February 16, 2026 22:06
@pramodsatya pramodsatya requested review from a team and pdabre12 as code owners February 16, 2026 22:06
Copilot AI review requested due to automatic review settings February 16, 2026 22:06
@pramodsatya pramodsatya requested a review from a team as a code owner February 16, 2026 22:06
@prestodb-ci prestodb-ci requested review from a team and NivinCS and removed request for a team and Copilot February 16, 2026 22:06
@pramodsatya pramodsatya requested review from a team, aditi-pandit, Copilot and tdcmeehan and removed request for NivinCS February 16, 2026 22:06
@pramodsatya
Contributor Author

@aditi-pandit, @tdcmeehan, @pdabre12 could you please help review this fix?

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In visitLambda, the resolver.apply(lambda) result is blindly cast to LambdaDefinitionExpression; consider validating the type or failing fast with a clear error message if the resolver ever returns a non-lambda to avoid unexpected ClassCastExceptions.
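
One way to follow this suggestion is sketched below. It is a minimal illustration under stated assumptions, not the code in the PR: `Expr`, `Constant`, `Lambda`, and the `resolveLambda` helper are hypothetical names, and the resolver is modeled as a plain `java.util.function.Function`. It requires Java 16+ for records.

```java
import java.util.function.Function;

// Hypothetical sketch of a fail-fast check on the resolver result; not the Presto API.
public class ResolverResultCheckSketch
{
    interface Expr {}
    record Constant(long value) implements Expr {}
    record Lambda(String argument, Expr body) implements Expr {}

    // Applies the resolver and throws a descriptive exception if the result is not a lambda,
    // instead of letting an unchecked cast surface as a bare ClassCastException.
    static Lambda resolveLambda(Lambda lambda, Function<Expr, Expr> resolver)
    {
        Expr resolved = resolver.apply(lambda);
        if (!(resolved instanceof Lambda)) {
            throw new IllegalStateException(
                    "Expected resolver to return a lambda for " + lambda + ", but got: " + resolved);
        }
        return (Lambda) resolved;
    }

    public static void main(String[] args)
    {
        Lambda input = new Lambda("x", new Constant(1));
        // Well-behaved resolver: returns a lambda, so the call succeeds.
        System.out.println(resolveLambda(input, e -> new Lambda("x", new Constant(2))));
        try {
            // Misbehaving resolver: returns a non-lambda, so the check fails fast.
            resolveLambda(input, e -> new Constant(2));
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```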
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `visitLambda`, the `resolver.apply(lambda)` result is blindly cast to `LambdaDefinitionExpression`; consider validating the type or failing fast with a clear error message if the resolver ever returns a non-lambda to avoid unexpected `ClassCastException`s.

## Individual Comments

### Comment 1
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionOptimizer.java:86-94` </location>
<code_context>
+        closeAllRuntimeException(queryRunner);
+    }
+
+    @Test
+    public void testLambdaBodyConstantFolding()
+    {
+        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> 1 + 1)",
</code_context>

<issue_to_address>
**suggestion (testing):** Add coverage for non-optimizable and partially optimizable lambda bodies to ensure the visitor remains a no-op when appropriate

Currently `testLambdaBodyConstantFolding` only covers fully constant-foldable lambda bodies. Please add tests for:

- A non-foldable lambda body (e.g., `x -> x + unbound_long`), asserting the optimized expression remains structurally equivalent to the input.
- A partially foldable body (e.g., `x -> x + (1 + 1)`), asserting only the constant part is folded and the lambda/argument wiring is preserved.

This will help catch regressions where the replacement logic incorrectly rewrites lambdas or drops arguments.

```suggestion
    @Test
    public void testLambdaBodyConstantFolding()
    {
        // Fully constant-foldable lambda bodies
        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> 1 + 1)",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");
        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> cast('123' AS integer))",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 123)");
        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> cast(json_parse('[1, 2]') AS ARRAY<INTEGER>)[1] + 1)",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");

        // Non-foldable lambda body: ensure lambda wiring and arguments are preserved
        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> x + unbound_long)",
                "transform(ARRAY[unbound_long, unbound_long2], x -> x + unbound_long)");

        // Partially foldable lambda body: only constant part should be folded
        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> x + (1 + 1))",
                "transform(ARRAY[unbound_long, unbound_long2], x -> x + 2)");
    }
```
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionOptimizer.java:89-92` </location>
<code_context>
+    @Test
+    public void testLambdaBodyConstantFolding()
+    {
+        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> 1 + 1)",
+                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");
+        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> cast('123' AS integer))", "transform(ARRAY[unbound_long, unbound_long2], x -> 123)");
+        assertOptimizedEquals("transform(ARRAY[unbound_long, unbound_long2], x -> cast(json_parse('[1, 2]') AS ARRAY<INTEGER>)[1] + 1)",
+                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");
+    }
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for nested lambdas and multiple lambda occurrences in the same expression

Since the bug involved replacing entire `LambdaDefinitionExpression` nodes, it would be good to also cover:

- A lambda whose body contains another lambda (e.g. `transform(..., x -> transform(ARRAY[1,2], y -> 1 + 1))`).
- Multiple lambdas in a single expression (e.g. nested `transform` calls, or `transform` plus `filter` each with their own lambda).

This would verify that the visitor correctly traverses and normalizes all optimized lambda bodies, even when nested or repeated.

Suggested implementation:

```java
    @Test
    public void testLambdaBodyConstantFolding()
    {
        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> 1 + 1)",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");

        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> cast('123' AS integer))",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 123)");

        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> cast(json_parse('[1, 2]') AS ARRAY<INTEGER>)[1] + 1)",
                "transform(ARRAY[unbound_long, unbound_long2], x -> 2)");
    }

    @Test
    public void testNestedLambdaBodyConstantFolding()
    {
        assertOptimizedEquals(
                "transform(ARRAY[unbound_long, unbound_long2], x -> transform(ARRAY[1, 2], y -> 1 + 1))",
                "transform(ARRAY[unbound_long, unbound_long2], x -> transform(ARRAY[1, 2], y -> 2))");
    }

    @Test
    public void testMultipleLambdaOccurrencesConstantFolding()
    {
        assertOptimizedEquals(
                "filter(transform(ARRAY[unbound_long, unbound_long2], x -> 1 + 1), x -> x > 1 + 1)",
                "filter(transform(ARRAY[unbound_long, unbound_long2], x -> 2), x -> x > 2)");
    }

```

- Ensure the closing brace for `testMultipleLambdaOccurrencesConstantFolding` is correctly followed by the next method or class-level brace in the file (the snippet you provided is truncated, so adjust brace placement if needed).
- No additional imports should be necessary if `assertOptimizedEquals`, `transform`, and `filter` are already used elsewhere in this test class; otherwise, align with existing helper methods and function usage patterns in this file.
</issue_to_address>


Copilot AI left a comment

Pull request overview

This PR fixes a bug in the NativeExpressionOptimizer where lambda expressions returned by the native sidecar were not being correctly optimized. The issue was that the visitor was checking replacement eligibility only on the lambda body, but the complete lambda expression is sent to the sidecar.

Changes:

  • Modified ReplacingVisitor#visitLambda to check and replace the entire LambdaDefinitionExpression instead of just the body
  • Added comprehensive test coverage for lambda body constant folding scenarios
  • Added presto-analyzer test dependency

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File and description:

  • NativeExpressionOptimizer.java: Updated lambda visitor to handle complete lambda expression optimization and apply recursive rewrites to optimized body
  • TestNativeExpressionOptimizer.java: Added new test class with harness and tests for lambda constant folding optimization
  • pom.xml: Added presto-analyzer test dependency for test infrastructure


@pdabre12
Contributor

@pramodsatya Why do we need the new test file? Can we just add the new test cases to TestNativeExpressionInterpreter?

@pramodsatya
Contributor Author

@pdabre12, the NativeSidecarExpressionInterpreter from TestNativeExpressionInterpreter does not invoke the codepath where the fix is made. TestNativeExpressionOptimizer invokes the optimize API via the NativeExpressionOptimizer to validate the fix in ReplacingVisitor.

@aditi-pandit aditi-pandit left a comment

@pramodsatya: I have one main question about the test. I will look into it in more detail after that.

}

@Test
public void testLambdaBodyConstantFolding()
Contributor

We should have constant folding tests elsewhere in the code as well, right? I am curious why you have this new test class.

Contributor Author

@aditi-pandit, currently we have tests for constant folding with the sidecar in TestNativeExpressionInterpreter and in TestNativeSidecarPlugin.testNativeExpressionOptimizer(). This bug is in NativeExpressionOptimizer, so it cannot be validated with TestNativeExpressionInterpreter.
Using a test assertion such as

protected void assertQueryWithSameQueryRunner(Session session, @Language("SQL") String actual, @Language("SQL") String expected)

in the test case TestNativeSidecarPlugin.testNativeExpressionOptimizer() only verifies that the results of evaluating both SQL queries are the same; it cannot verify that the expression tree is constant-folded as expected.
So the new test class is needed here. Please let me know if I am missing something.
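
The distinction between comparing query results and comparing expression trees can be shown with a small sketch. It is an illustration only: `Expr`, `Constant`, and `Add` are hypothetical types with an `evaluate` method, unrelated to Presto's RowExpression or the assertion helpers named above. It requires Java 16+ for records.

```java
// Hypothetical sketch: equal results do not imply equal expression trees.
public class StructuralVsResultEqualitySketch
{
    interface Expr
    {
        long evaluate();
    }

    record Constant(long value) implements Expr
    {
        public long evaluate()
        {
            return value;
        }
    }

    record Add(Expr left, Expr right) implements Expr
    {
        public long evaluate()
        {
            return left.evaluate() + right.evaluate();
        }
    }

    public static void main(String[] args)
    {
        Expr unoptimized = new Add(new Constant(1), new Constant(1));
        Expr folded = new Constant(2);

        // A result-based assertion cannot tell the two trees apart.
        System.out.println("same result: " + (unoptimized.evaluate() == folded.evaluate()));
        // A tree-based assertion can, which is what a constant-folding test needs.
        System.out.println("same tree: " + unoptimized.equals(folded));
    }
}
```

A result-based check passes whether or not folding happened, so only a comparison of the expression trees can catch a skipped rewrite.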

Contributor

Thanks @pramodsatya.

@pdabre12: Can you do a first round of review, especially for the Java test code? The C++ code looks fine.

@pdabre12 pdabre12 left a comment

Thanks @pramodsatya.
I am fine with the extra test file, modulo one change.

return translator.translate(parsedExpression, SYMBOL_TYPES);
}

private NativeExpressionOptimizer getNativeExpressionOptimizer(FunctionAndTypeManager functionAndTypeManager, NodeManager nodeManager)
Contributor

I see most of this code is repeated in TestNativeExpressionInterpreter#getRowExpressionInterpreter.
Can we extract the common code in a function?

Contributor

You can create a common function for the injector and then get the needed instance accordingly.

@pdabre12
Contributor

pdabre12 commented Feb 18, 2026

@pramodsatya You need to exclude the new test class file under this profile to address the CI failure. https://github.com/prestodb/presto/blob/master/presto-native-sidecar-plugin/pom.xml#L308

@pramodsatya
Contributor Author

@pramodsatya You need to exclude the new test class file under this profile to address the CI failure. https://github.com/prestodb/presto/blob/master/presto-native-sidecar-plugin/pom.xml#L308

Thanks @pdabre12, updated accordingly, could you PTAL?

Labels

from:IBM PR from IBM

4 participants