feat: Support decimal type for the Spark checked_add and checked_subtract functions #16302
n0r0shi wants to merge 10 commits into facebookincubator:main
rui-mo left a comment
Thanks for this work! Please also update the relevant documentation.
Updated the doc. The unchecked add and subtract functions were missing, so I added them as well. Please let me know if anything else is needed for this PR. Thanks! @rui-mo
Add checked decimal add and subtract functions that throw on overflow instead of returning null. These are needed for Spark's ANSI mode where arithmetic overflow should raise an error rather than silently produce null.
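To make the difference between the two behaviors concrete, here is a minimal sketch, not the Velox implementation. It assumes short decimals stored as unscaled `int64_t` values that already share a scale and have precision of at most 18; the names `maxUnscaled`, `addOrNull`, and `checkedAdd` are hypothetical and exist only for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

// Hypothetical helper: largest unscaled value for a given precision
// (10^precision - 1). Only valid for precision in [1, 18] with int64_t.
inline int64_t maxUnscaled(int precision) {
  int64_t bound = 1;
  for (int i = 0; i < precision; ++i) {
    bound *= 10;
  }
  return bound - 1;
}

// Non-ANSI style: an out-of-range result becomes null (std::nullopt).
// Assumes |a| and |b| are below 10^18, so a + b cannot wrap int64_t.
inline std::optional<int64_t> addOrNull(
    int64_t a, int64_t b, int resultPrecision) {
  const int64_t sum = a + b;
  const int64_t bound = maxUnscaled(resultPrecision);
  if (sum > bound || sum < -bound) {
    return std::nullopt;
  }
  return sum;
}

// ANSI style: the same range check, but overflow raises an error.
inline int64_t checkedAdd(int64_t a, int64_t b, int resultPrecision) {
  const auto result = addOrNull(a, b, resultPrecision);
  if (!result.has_value()) {
    throw std::overflow_error("Decimal overflow in checked add");
  }
  return *result;
}
```

With a result precision of 3, `addOrNull(999, 1, 3)` yields an empty optional, while `checkedAdd(999, 1, 3)` throws `std::overflow_error`.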
rui-mo left a comment
Thanks for all your efforts!
The allow-precision-loss flag applies to both regular and checked (ANSI mode) arithmetic functions.
In Spark, there are no separate checked expression classes. The same expression (e.g., ``Add``)
handles both ANSI and non-ANSI behavior, controlled by an ``EvalMode`` flag. In Velox, the checked
variants are registered as separate functions (e.g., ``checked_add``, ``checked_subtract``).
nit: perhaps mention that they are registered as separate functions to support the TRY evaluation mode.
@n0r0shi Thanks. There are some workflow failures (https://github.com/facebookincubator/velox/actions/runs/22347057553/job/64672777202?pr=16302) and format issues. Would you please take a look?
I just added a fix in 28d89e9. I will check the next run, thanks.
Fixed the format issue and
There are 3 failures at this point. As far as I can tell, they are related to #16543, which in turn looks related to #16510, so it doesn't look like my PR introduced them. Please let me know if you think otherwise, @rui-mo.
    std::optional<T> t,
    std::optional<U> u) {
  return evaluateOnce<int128_t>(
      "checked_subtract(c0, c1)", {tType, uType}, t, u);
Please also review this comment: #16307 (comment), and consider adding try_* tests.
Added try_* tests in 69afcb2. Also consolidated the default and denyPrecisionLoss cases: they now share a common helper that tests the same core scenarios, while denyPrecisionLoss retains additional test cases for its specific behavior.
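As a rough illustration of what the try_* tests exercise, the sketch below wraps a throwing function so that an overflow error becomes a null result. It is a simplified model, not Velox's TRY expression: `checkedAddP3` is a hypothetical stand-in for a checked function with a 3-digit result precision, and `tryEval` is a hypothetical wrapper that only catches `std::overflow_error`.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for a checked function: throws when the unscaled
// sum does not fit in a decimal with precision 3 (|value| <= 999).
inline int64_t checkedAddP3(int64_t a, int64_t b) {
  const int64_t sum = a + b;
  if (sum > 999 || sum < -999) {
    throw std::overflow_error("Decimal overflow");
  }
  return sum;
}

// Hypothetical TRY-style wrapper: evaluate fn and map an overflow error to
// null (std::nullopt) instead of letting it propagate.
template <typename F>
auto tryEval(F&& fn) -> std::optional<decltype(fn())> {
  try {
    return fn();
  } catch (const std::overflow_error&) {
    return std::nullopt;
  }
}
```

Under these assumptions, `tryEval([] { return checkedAddP3(999, 1); })` returns an empty optional, whereas calling `checkedAddP3(999, 1)` directly throws.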
Some checks failed, but they don't appear related to my change. Do I need to take any action?
Yeah, the failures are from the newly added tests failing in debug mode. We also have another internal CI check that detects build speed regressions for the sparksql/ directory, and it failed. @kgpai, any thoughts on that?
It looks like the addition of whole parts at this line is undefined behavior when the sum overflows. I can wrap this with Does that look right to you? @kKPulla
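The concern above is that signed integer overflow is undefined behavior in C++, so the overflow has to be detected before the addition is performed. The fix actually applied in the PR is not shown in this thread; the sketch below is only one portable way to do it, using `int64_t` for brevity (the width used in the PR is an assumption here) and a hypothetical helper name `addNoOverflow`.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: add two signed integers without ever evaluating an
// overflowing expression. Returns std::nullopt when a + b would not fit.
inline std::optional<int64_t> addNoOverflow(int64_t a, int64_t b) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  // For b > 0, kMax - b cannot overflow; a + b overflows iff a > kMax - b.
  if (b > 0 && a > kMax - b) {
    return std::nullopt;
  }
  // For b < 0, kMin - b cannot overflow; a + b underflows iff a < kMin - b.
  if (b < 0 && a < kMin - b) {
    return std::nullopt;
  }
  return a + b;
}
```

A caller can then decide whether an empty result means "return null" or "throw", which keeps the overflow check separate from the ANSI or non-ANSI policy.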
@n0r0shi @rui-mo The test failures seem to have been fixed, but the build speed regression is still there. Looking further into the change, it is very likely caused by the templatizing we are doing in this PR. If you have any ways to optimize build speed for sparksql functions, we would appreciate it if you could look into them, as we are trying to reduce build times generally in Velox. cc: @kgpai
Thanks for looking into this, @kKPulla @kgpai. The regression comes from 20 new These checked functions are necessary for Spark ANSI and TRY mode support. Build time improvements are worth pursuing separately at the framework level, but I don't think individual feature work should be blocked by costs inherent to the existing registration pattern. Let me know how you'd like to proceed.
@kKPulla Let's go ahead and merge this and pursue build speed improvements as a separate overall topic.
Routes decimal `Add` and `Subtract` to Velox's `checked_add` and `checked_subtract` functions when ANSI mode is enabled (`nullOnOverflow = false`). These checked variants throw on overflow instead of returning null, matching Spark's ANSI behavior. Depends on facebookincubator/velox#16302 which adds `checked_add` and `checked_subtract` support for decimal types.
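The routing described in this commit message can be pictured as a small name-selection step. The sketch below is an illustration only, not the downstream project's code: `DecimalOp` and `veloxFunctionName` are hypothetical, and the unchecked names `add` and `subtract` are assumed rather than taken from this thread.

```cpp
#include <string>

// Hypothetical enum for the two decimal operations discussed here.
enum class DecimalOp { kAdd, kSubtract };

// Hypothetical routing helper: when nullOnOverflow is false (ANSI mode),
// pick the checked variant that throws on overflow; otherwise pick the
// regular variant that returns null on overflow.
inline std::string veloxFunctionName(DecimalOp op, bool nullOnOverflow) {
  const std::string base = (op == DecimalOp::kAdd) ? "add" : "subtract";
  return nullOnOverflow ? base : "checked_" + base;
}
```

For example, `veloxFunctionName(DecimalOp::kSubtract, false)` selects `checked_subtract`.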
Resolve conflicts with merged checked_add/checked_subtract (PR facebookincubator#16302). Update multiply tests to use generic checkedDecimalArithmetic helpers.
Summary
`checked_add` and `checked_subtract` for decimal types that throw on overflow instead of returning null, enabling Spark ANSI mode support for decimal arithmetic
Test plan
long+short, long+long)
Closes #16301