GH-40911: [C++] Remove decimal division's precision and scale calculate logic from implicit casts by ZhangHuiGui · Pull Request #40969 · apache/arrow

ZhangHuiGui · 2024-04-03T07:55:23Z

Rationale for this change

Fix decimal divide results wrong when scale1 >= scale2 between CallFunction and expression execute.

What changes are included in this PR?

Remove decimal division's precision and scale calculate logic from implicit casts.

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, decimal divide with scale1>=scale2 in ExecuteScalarExpression's results will be different with before.

GitHub Issue: [C++] Decimal divide promotion rule output wrong precision and scale in expression Bind when scale1 >= scale2 #40911

github-actions · 2024-04-03T07:55:55Z

⚠️ GitHub issue #40911 has been automatically assigned in GitHub to PR creator.

Co-authored-by ZhangHuiGui <2689496754@qq.com> Co-authored-by laotan332 <qinpengxiang@outlook.com>

ZhangHuiGui · 2024-04-03T09:54:09Z

@lidavidm @bkietz I'm not understand the previous decimal division rules.

We will cast the decimal type in CastBinaryDecimalArgs before execute.

arrow/cpp/src/arrow/compute/kernels/codegen_internal.cc

Lines 435 to 437 in be3b789

    
           case DecimalPromotion::kDivide: { 
        
             left_scaleup = std::max(4, s1 + p2 - s2 + 1) + s2 - s1; 
        
             right_scaleup = 0;

And the reason why we need change input-types' scale and precision here is that the division operation of BasicDecimal needs to cast decimal1-array to the maximum precision supported by the current operation and then perform division? (Below codes shows that we will cast the args[0] to the precision and scale which fetched from CastBinaryDecimalArgs)

arrow/cpp/src/arrow/compute/function.cc

Lines 245 to 247 in be3b789

    
           if (in_type != args[i].type()) { 
        
             ARROW_ASSIGN_OR_RAISE(arg, Cast(args[i], CastOptions::Safe(in_type), ctx)); 
        
           }

…47445) ### Rationale for this change The issues in #40911 and #39875 are the same: we have a fundamental defect when dispatching kernels for decimal division. The following statements assume both the dividend and the divisor are of the same decimal type (`Decimal32/64/128/256`), with possibly different `(p, s)`. * When doing `DispatchBest`, which is directly invoked through `CallFunction("divide", ...)`, w/o trying `DispatchExact` ahead, the dividend is ALWAYS promoted and the result will have the same `(p, s)` as the dividend, according to the rule listed in our documentation [1] (this is actually adopting the Redshift one [2]). * When doing `DispatchExact`, which is first tried by expression evaluation, there will be a match w/o any promotions so the subsequent try of `DispatchMatch` won't happen. The issue is obvious - `DispatchExact` and `DispatchBest` are conflicting - one saying "OK, for any `decimal128(p1, s1) / decimal128(p2, s2)`, it is a match" and the other saying "No, we must promote the dividend according to `(p1, s1)` and `(p2, s2)`". Then we actually have two choices to fix it: 1. Consider `DispatchBest` is doing the right thing (justified by [1]), and NEVER "exact match" any kernel for decimal division. This is what this PR does. The only problem is that we are basically ALWAYS rejecting a kernel to be "exactly matched" - weird, though functionally correct. 2. Consider `DispatchExact` is doing the right thing, and NOT promoting dividend in `DispatchBest`. The kernel is matched only based on their decimal type (not considering their `(p, s)`). And only the result is promoted (this also complies [1]). This is what the other attempting PR #40969 does. But that PR only claims a promoted result type w/o actually promoting the computation (i.e., the memory representation of a decimal needs to be promoted when doing the division) so the result is wrong. Though this is amendable by supporting basic decimal methods like `PromoteAndDivide` that does the promotion of the dividend and the division all together in one run, the modification can be cumbersome - the "scale up" needs to be propagated from the kernel definition all down to the basic decimal primitives. Besides, I assume this may not be as performant as doing batch promotion + batch division. [1] https://docs.aws.amazon.com/redshift/latest/dg/r_numeric_computations201.html#r_numeric_computations201-precision-and-scale-of-computed-decimal-results [2] https://arrow.apache.org/docs/cpp/compute.html#arithmetic-functions ### What changes are included in this PR? Suppress the `DispatchExact` for decimal division. Also, the match constraint `BinaryDecimalScale1GeScale2` introduced in #47297 becomes useless thus gets removed. ### Are these changes tested? Yes. ### Are there any user-facing changes? None. * GitHub Issue: #40911 Authored-by: Rossi Sun <zanmato1984@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>

zanmato1984 · 2025-09-01T13:02:14Z

The issue this pr tries to resolve is fixed by #47445. We can close this now. Thanks for the effort all the same!

github-actions bot added Component: C++ awaiting review Awaiting review labels Apr 3, 2024

fix decimal divide problem

8b5a6c9

Co-authored-by ZhangHuiGui <2689496754@qq.com> Co-authored-by laotan332 <qinpengxiang@outlook.com>

ZhangHuiGui force-pushed the fix-decimal-divide-problem branch from 7b683f9 to 8b5a6c9 Compare April 3, 2024 08:01

pitrou requested a review from bkietz April 4, 2024 08:33

zanmato1984 mentioned this pull request Aug 27, 2025

GH-40911: [C++][Compute] Fix the decimal division kernel dispatching #47445

Merged

zanmato1984 closed this Sep 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GH-40911: [C++] Remove decimal division's precision and scale calculate logic from implicit casts#40969

GH-40911: [C++] Remove decimal division's precision and scale calculate logic from implicit casts#40969
ZhangHuiGui wants to merge 1 commit intoapache:mainfrom
ZhangHuiGui:fix-decimal-divide-problem

ZhangHuiGui commented Apr 3, 2024 •

edited

Loading

Uh oh!

github-actions bot commented Apr 3, 2024

Uh oh!

ZhangHuiGui commented Apr 3, 2024 •

edited

Loading

Uh oh!

zanmato1984 commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ZhangHuiGui commented Apr 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

github-actions bot commented Apr 3, 2024

Uh oh!

ZhangHuiGui commented Apr 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zanmato1984 commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ZhangHuiGui commented Apr 3, 2024 •

edited

Loading

ZhangHuiGui commented Apr 3, 2024 •

edited

Loading