[CALCITE-7122] Eliminate nested calls for idempotent unary functions UPPER/LOWER/ABS/INITCAP #4577

xiedeyantu · 2025-10-12T15:49:55Z

I've implemented a rough draft for this JIRA ticket based on my understanding. The idempotent-elimination logic itself is taken from the original author's PR. What I refactored is the following:

- Let each function declare its own idempotent property, instead of maintaining a central list of idempotent functions (a point I've been emphasizing).
- Handle idempotent functions in one appropriate place, rather than scattering the elimination logic across many places (another point I raised earlier).
- Since the current JIRA ticket covers only unary idempotent functions, I suggest excluding FLOOR and CEIL for now.
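
To make the idea above concrete, here is a minimal sketch, assuming a toy expression model rather than Calcite's real classes. `IdempotentSketch`, `Op`, `Expr`, and `simplify` are hypothetical names used only for illustration; the point is that the operator carries the property and a single rewrite step collapses f(f(x)) to f(x).

```java
/** Hypothetical mini expression model; none of these types are Calcite classes. */
final class IdempotentSketch {
  /** An operator that declares its own idempotent property. */
  static final class Op {
    final String name;
    final boolean idempotent;

    Op(String name, boolean idempotent) {
      this.name = name;
      this.idempotent = idempotent;
    }
  }

  /** Expression node: a column reference when op is null, otherwise a unary call. */
  static final class Expr {
    final Op op;          // null for a column reference
    final Expr operand;   // null for a column reference
    final String column;  // null for a call

    private Expr(Op op, Expr operand, String column) {
      this.op = op;
      this.operand = operand;
      this.column = column;
    }

    static Expr ref(String column) {
      return new Expr(null, null, column);
    }

    static Expr call(Op op, Expr operand) {
      return new Expr(op, operand, null);
    }

    @Override public String toString() {
      return op == null ? column : op.name + "(" + operand + ")";
    }
  }

  /** Rewrites f(f(x)) to f(x) when f reports itself as idempotent. */
  static Expr simplify(Expr e) {
    if (e.op == null) {
      return e; // column reference: nothing to simplify
    }
    Expr operand = simplify(e.operand);
    if (e.op.idempotent && operand.op == e.op) {
      return operand; // the outer call is redundant
    }
    return Expr.call(e.op, operand);
  }
}
```

With an idempotent `UPPER`, any depth of nested `UPPER` calls collapses to a single call, while a non-idempotent operator is left untouched.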

I haven't modified the test cases from the original PR #4488, because I want to use them to verify that the current code is functionally equivalent. If this approach is approved, we can set more specific requirements for the test cases later.

This is just an idea for your reference.

xiedeyantu · 2025-10-12T15:52:58Z

This is not a ready PR, just a suggestion.

rubenada · 2025-10-12T17:35:33Z

Could be a valid approach. It's aligned with already existing aspects like "deterministic" and "dynamic".
Maybe in the long run we could even consider combining these fields (deterministic, dynamic, idempotent, and any other that may come) into a single flag field instead of N booleans.
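
A single flag field of this kind could look roughly like the sketch below. It assumes a hypothetical `Property` enum and `Operator` class, neither of which exists in Calcite, and it uses `java.util.EnumSet` as one possible representation.

```java
import java.util.EnumSet;

/** Hypothetical sketch of one flag field replacing several booleans. */
final class OperatorFlagsSketch {
  /** Illustrative property names; the real set would be decided in the refactoring. */
  enum Property { DETERMINISTIC, DYNAMIC, IDEMPOTENT }

  static final class Operator {
    final String name;
    private final EnumSet<Property> properties = EnumSet.noneOf(Property.class);

    Operator(String name, Property... properties) {
      this.name = name;
      for (Property p : properties) {
        this.properties.add(p);
      }
    }

    /** One query method covers every current and future property. */
    boolean has(Property property) {
      return properties.contains(property);
    }
  }
}
```

Adding a further property then means adding one enum constant, with no new constructor parameter.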

xiedeyantu · 2025-10-12T23:00:20Z

> Could be a valid approach. It's aligned with already existing aspects like "deterministic" and "dynamic". Maybe in the long run we could even consider combining these fields (deterministic, dynamic, idempotent, and any other that may come) into a single flag field instead of N booleans.

Yes, I think we can continue to integrate or refactor this later. The current PR is mainly intended to share an initial idea. If you'd like me to further improve it, please let me know — I'm not sure if the original author has time to continue working on their PR.

xiedeyantu · 2025-10-13T13:00:37Z

I updated the test case and force-pushed. The PR is now ready for review. If the original author resumes their work, I will close this PR.

xiedeyantu · 2025-10-13T13:08:28Z

> Could be a valid approach. It's aligned with already existing aspects like "deterministic" and "dynamic". Maybe in the long run we could even consider combining these fields (deterministic, dynamic, idempotent, and any other that may come) into a single flag field instead of N booleans.

I have filed JIRA ticket CALCITE-7224 to track this.

mihaibudiu · 2025-10-13T16:57:12Z

core/src/main/java/org/apache/calcite/sql/SqlBasicFunction.java

        getReturnTypeInference(), getOperandTypeInference(), operandHandler,
        getOperandTypeChecker(), callValidator,
-        getFunctionType(), monotonicityInference, dynamic);
+        getFunctionType(), monotonicityInference, dynamic, idempotent);


Why not keep the constructor version with a default value of 'false' for this argument?
One alternative is to define a new enum { IDEMPOTENT, NONIDEMPOTENT } instead of using boolean.

I didn't quite understand your comment. This method works like a "copy", so it has to preserve the idempotent value that was set previously.

I have several comments.
You have changed lots of constructor invocations.

I added a new constructor and restored the previously modified create method. Is this the modification you meant?

IMO a (new) boolean flag makes more sense considering the already existing flag (deterministic, dynamic). Modifying the constructors is the way to ensure backwards compatibility and avoid breaking consumers.
Whenever we tackle CALCITE-7224, we could deprecate all the constructors using the several flags (with the classic "to be removed before 2.0" comment), and create a single one with a unified flag field.
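
The backwards-compatibility point discussed in this thread can be sketched as follows. `BasicFunction` and its methods are hypothetical stand-ins, not the signatures of `SqlBasicFunction`: the old constructor stays and delegates with `false`, and copy-style methods pass every previously set flag through.

```java
/** Hypothetical sketch; BasicFunction is not Calcite's SqlBasicFunction. */
final class CompatSketch {
  static class BasicFunction {
    final String name;
    final boolean dynamic;
    final boolean idempotent;

    /** Pre-existing signature, kept so current callers still compile. */
    BasicFunction(String name, boolean dynamic) {
      this(name, dynamic, false); // new flag defaults to false
    }

    /** New signature carrying the additional flag. */
    BasicFunction(String name, boolean dynamic, boolean idempotent) {
      this.name = name;
      this.dynamic = dynamic;
      this.idempotent = idempotent;
    }

    /** Copy-style method: must carry over the idempotent value already set. */
    BasicFunction withDynamic(boolean dynamic) {
      return new BasicFunction(name, dynamic, idempotent);
    }

    /** Copy-style method for the new flag. */
    BasicFunction withIdempotent(boolean idempotent) {
      return new BasicFunction(name, dynamic, idempotent);
    }
  }
}
```

If the copy in `withDynamic` dropped the `idempotent` argument, a function marked idempotent would silently lose the property after any later `with...` call.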

@rubenada Indeed, using a boolean type aligns with the current design, which is why I used a boolean type in the first version. Using an enum type does break this pattern, but it more clearly indicates whether a function is idempotent or non-idempotent. Since I have created a Jira ticket to refactor this boolean value, the choice between the two seems less critical at this point. I see two options: one is to complete the refactoring work in that Jira ticket first and then finish this PR. The other is, if @mihaibudiu agrees to revert to using a boolean type, I can roll back the code and complete this PR first.

mihaibudiu

This is fine, but many Boolean flags will be hard to manage.
I think having them separate is still necessary, because they are independent.
The only thing I can think of is to give them different artificial enum types.

xiedeyantu · 2025-10-13T23:07:32Z

> I have filed a jira CALCITE-7224

@mihaibudiu I have filed JIRA ticket CALCITE-7224, so this flag can be handled together with the existing deterministic and dynamic flags in a follow-up. The current implementation keeps the same style for now.

mihaibudiu · 2025-10-13T23:10:12Z

Yes, I saw that, but that may require changing signatures for these functions...

xiedeyantu · 2025-10-13T23:15:19Z

> Yes, I saw that, but that may require changing signatures for these functions...

To confirm: are you suggesting that an enum with two values, IDEMPOTENT and NONIDEMPOTENT, is the better approach for the current implementation, without worrying about the existing implementation style for now? And are you considering a refactoring in the future?

mihaibudiu · 2025-10-13T23:17:44Z

Yes, I think that once you make them Boolean, you can't change them to something else.
Let's see if anyone else has a different suggestion.
People may not like a lot of tiny enum classes, but from my experience they are not too bad, and they provide very strong typing.
In general, if you can wrap an abstraction, no matter how small, in a class, I think it's worth wrapping (unless performance is a concern).
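
As an illustration of the "tiny enum" argument, the following sketch assumes hypothetical `Idempotency` and `Determinism` enums and a hypothetical `Fn` class. With two booleans the arguments could be swapped without a compiler error; with distinct enum types they cannot.

```java
/** Hypothetical sketch of small enum types used instead of boolean flags. */
final class TypedFlagSketch {
  enum Idempotency { IDEMPOTENT, NONIDEMPOTENT }

  enum Determinism { DETERMINISTIC, NONDETERMINISTIC }

  static final class Fn {
    final String name;
    final Determinism determinism;
    final Idempotency idempotency;

    /** Swapping the two flag arguments would not compile, unlike (boolean, boolean). */
    Fn(String name, Determinism determinism, Idempotency idempotency) {
      this.name = name;
      this.determinism = determinism;
      this.idempotency = idempotency;
    }

    boolean isIdempotent() {
      return idempotency == Idempotency.IDEMPOTENT;
    }

    boolean isDeterministic() {
      return determinism == Determinism.DETERMINISTIC;
    }
  }
}
```

The call site also documents itself: `new Fn("ABS", Determinism.DETERMINISTIC, Idempotency.IDEMPOTENT)` reads more clearly than `new Fn("ABS", true, true)`.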

xiedeyantu · 2025-10-13T23:23:08Z

> Yes, I think that once you make them Boolean, you can't change them to something else. Let's see if anyone else has a different suggestion. People may not like a lot of tiny enum classes, but from my experience they are not too bad, and they provide very strong typing. In general, if you can wrap an abstraction, no matter how small, in a class, I think it's worth wrapping (unless performance is a concern).

I agree with your suggestion. Since refactoring may not happen immediately, we can first implement the new feature to an ideal state. I will later change it to an enum type.

mihaibudiu · 2025-10-14T15:18:02Z

I have changed my mind, I don't think this is an ideal design for two reasons:

- it is not sustainable to add flags to all functions for every property that may be interesting
- simplify is not the right place for all optimizations; this optimization in particular should be done only once, while simplify runs frequently.

I think the right way to solve this is through a visitor pattern; the visitor will know which functions are idempotent.
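
A visitor-based variant of this proposal might look like the sketch below. All types here (`Visitor`, `Expr`, `Ref`, `Call`, `IdempotentCallEliminator`) are hypothetical and are not Calcite's visitor classes; the key difference from the flag approach is that the knowledge of which functions are idempotent lives in the visitor, not in the operator objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the visitor approach; not Calcite's visitor classes. */
final class VisitorSketch {
  interface Visitor<R> {
    R visitRef(Ref ref);
    R visitCall(Call call);
  }

  abstract static class Expr {
    abstract <R> R accept(Visitor<R> visitor);
  }

  static final class Ref extends Expr {
    final String name;
    Ref(String name) { this.name = name; }
    @Override <R> R accept(Visitor<R> visitor) { return visitor.visitRef(this); }
    @Override public String toString() { return name; }
  }

  static final class Call extends Expr {
    final String fn;
    final Expr operand;
    Call(String fn, Expr operand) { this.fn = fn; this.operand = operand; }
    @Override <R> R accept(Visitor<R> visitor) { return visitor.visitCall(this); }
    @Override public String toString() { return fn + "(" + operand + ")"; }
  }

  /** The visitor, not the operator, knows which functions are idempotent. */
  static final class IdempotentCallEliminator implements Visitor<Expr> {
    private final Set<String> idempotent =
        new HashSet<>(Arrays.asList("UPPER", "LOWER", "ABS", "INITCAP"));

    @Override public Expr visitRef(Ref ref) {
      return ref;
    }

    @Override public Expr visitCall(Call call) {
      Expr operand = call.operand.accept(this); // rewrite bottom-up
      if (idempotent.contains(call.fn)
          && operand instanceof Call
          && ((Call) operand).fn.equals(call.fn)) {
        return operand; // f(f(x)) becomes f(x)
      }
      return new Call(call.fn, operand);
    }
  }
}
```

Because the visitor is an ordinary object, it can be run once at a chosen point in planning, which is the second concern raised above about doing this inside a frequently invoked simplifier.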

xiedeyantu · 2025-10-14T15:32:52Z

> I have changed my mind, I don't think this is an ideal design for two reasons:
>
> - it is not sustainable to add flags to all functions for every property that may be interesting
> - simplify is not the right place for all optimizations; this optimization in particular should be done only once, while simplify runs frequently.
>
> I think the right way to solve this is through a visitor pattern; the visitor will know which functions are idempotent.

I don't entirely agree with your perspective. On the contrary, I believe adding possible function properties to SqlOperator is very appropriate. Moreover, if the refactoring is done well, adding future properties will be straightforward. Although simplify may be called frequently, the actual processing might only occur once. However, this optimization point is indeed very niche. If it's deemed unnecessary to implement, I think we could also close this JIRA.

xiedeyantu · 2025-10-14T15:41:36Z

Hi @mihaibudiu, do you also think the function property of idempotency is too niche and not suitable for inclusion as a common property? If that is your point, I agree with your perspective. However, if we set aside its limited applicability, properties like determinism, dynamism, and monotonicity are already maintained as common attributes in this manner, so I believe this approach is reasonable.

mihaibudiu · 2025-10-14T16:05:50Z

It's not really about niche and non-niche, in general the number of algebraic properties you may look for when optimizing is unbounded. It is not sustainable to modify all objects to represent such properties; that's exactly what the visitor pattern is designed to solve.

xiedeyantu · 2025-10-14T17:55:36Z

> It's not really about niche and non-niche, in general the number of algebraic properties you may look for when optimizing is unbounded. It is not sustainable to modify all objects to represent such properties; that's exactly what the visitor pattern is designed to solve.

I think we might be discussing how to refactor the function attribute system, which seems unrelated to the purpose of this PR. Since using boolean types as attribute identifiers is already an existing practice, should we accept extending it in the same way? If our final conclusion is not to agree, this PR could be temporarily closed, and the discussion on refactoring function attributes could continue in the new Jira ticket. What do you think?

mihaibudiu · 2025-10-14T18:25:40Z

You can leave the PR open and move the discussion to JIRA.

xiedeyantu · 2025-10-15T12:14:16Z

I have rolled back the code to the boolean-based implementation. If it gains approval, it can be merged. If it doesn’t receive consensus, I will still keep the PR open here, as it aligns with my design.

sonarqubecloud · 2025-10-15T12:23:14Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
97.7% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

xiedeyantu · 2025-10-22T10:04:02Z

This proposed solution was not accepted, so I am closing this PR for now.

xiedeyantu marked this pull request as draft October 12, 2025 15:52

xiedeyantu force-pushed the calcite-7122 branch from 9e521ca to 16f3f6f Compare October 13, 2025 12:57

xiedeyantu marked this pull request as ready for review October 13, 2025 13:00

xiedeyantu force-pushed the calcite-7122 branch from 16f3f6f to b9bde4f Compare October 13, 2025 13:31

mihaibudiu reviewed Oct 13, 2025

mihaibudiu approved these changes Oct 13, 2025

[CALCITE-7122] Eliminate nested calls for idempotent unary functions UPPER/LOWER/ABS/INITCAP (099d15b)

xiedeyantu force-pushed the calcite-7122 branch from d25a942 to 099d15b Compare October 15, 2025 12:08

xiedeyantu added the discussion-in-jira label ("There's open discussion in JIRA to be resolved before proceeding with the PR") Oct 22, 2025

xiedeyantu closed this Oct 22, 2025
