GH-41011: [C++][Compute] Fix the issue that comparison function could not handle decimal arguments with different scales #47459

Merged

zanmato1984 merged 4 commits into apache:main from zanmato1984:fix/gh-41011

Sep 4, 2025

Contributor

zanmato1984 commented Aug 29, 2025

Rationale for this change

Previously, there was no way to suppress exact type matching for decimal arguments with different scales when dispatching to a decimal comparison kernel that actually requires the scales to be the same. This caused issues like #41011.

The "match constraint" introduced in #47297 exists precisely to fix issues like this: adding a proper constraint is all that is needed.

What changes are included in this PR?

Added a match constraint that requires all decimal inputs to have the same scale (as is already done for decimal addition and subtraction).
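To illustrate why the same-scale requirement matters, here is a minimal, self-contained sketch. It does not use Arrow: `ToyDecimal`, `Rescale`, `NaiveLess` and `ScaleAwareLess` are hypothetical names invented for this example, and it assumes a decimal is stored as an unscaled integer plus a scale and that the comparison looks only at the unscaled integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a decimal value: it represents unscaled * 10^-scale.
// This is an illustration only, not Arrow's Decimal128.
struct ToyDecimal {
  int64_t unscaled;
  int32_t scale;
};

// Raise the scale of `d` to `target_scale` by multiplying the unscaled value
// by 10 once per extra digit. No overflow handling: sketch only.
inline ToyDecimal Rescale(ToyDecimal d, int32_t target_scale) {
  assert(target_scale >= d.scale);
  for (int32_t s = d.scale; s < target_scale; ++s) {
    d.unscaled *= 10;
  }
  d.scale = target_scale;
  return d;
}

// Compares the unscaled integers directly, ignoring the scales.
// Only correct when both inputs already share the same scale.
inline bool NaiveLess(ToyDecimal a, ToyDecimal b) {
  return a.unscaled < b.unscaled;
}

// Brings both inputs to the larger of the two scales before comparing.
inline bool ScaleAwareLess(ToyDecimal a, ToyDecimal b) {
  const int32_t common = a.scale > b.scale ? a.scale : b.scale;
  return Rescale(a, common).unscaled < Rescale(b, common).unscaled;
}
```

With `ToyDecimal{123, 2}` (1.23) and `ToyDecimal{1230, 3}` (1.230), `NaiveLess` reports the first as smaller even though the two values are equal, while `ScaleAwareLess` does not. Under the assumptions above, that is the kind of wrong result a same-scale constraint guards against by forcing a cast before the kernel runs.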

Are these changes tested?

Yes.

Are there any user-facing changes?

None.

GitHub Issue: [C++] Decimal types array compare with different scales make error results #41011


Fix the issue that comparison function could not handle decimal arguments with different scale

62eab1d

github-actions bot added Component: C++ awaiting review labels

zanmato1984 commented

cpp/src/arrow/compute/kernels/scalar_compare.cc

                   auto exec = GenerateDecimal<applicator::ScalarBinaryEqualTypes, BooleanType, Op>(id);
-                  DCHECK_OK(
-                      func->AddKernel({InputType(id), InputType(id)}, boolean(), std::move(exec)));
+                  DCHECK_OK(func->AddKernel({InputType(id), InputType(id)}, boolean(), std::move(exec),

Contributor Author

zanmato1984 Aug 29, 2025

Yep, the fix is as simple as this. Thanks to #47297.

cpp/src/arrow/compute/expression_test.cc

Comment on lines 716 to 740

+                  // decimal int
+                  ExpectBindsTo(cmp(field_ref("dec128_3_2"), field_ref("i64")),
+                                cmp(field_ref("dec128_3_2"), cast(field_ref("i64"), decimal128(21, 2))),
+                                /*bound_out=*/nullptr, *exciting_schema);
+                  ExpectBindsTo(cmp(field_ref("i64"), field_ref("dec128_3_2")),
+                                cmp(cast(field_ref("i64"), decimal128(21, 2)), field_ref("dec128_3_2")),
+                                /*bound_out=*/nullptr, *exciting_schema);
+                  // decimal128 decimal256 with different scales
+                  ExpectBindsTo(
+                      cmp(field_ref("dec128_3_2"), field_ref("dec256_5_3")),
+                      cmp(cast(field_ref("dec128_3_2"), decimal256(4, 3)), field_ref("dec256_5_3")),
+                      /*bound_out=*/nullptr, *exciting_schema);
+                  ExpectBindsTo(
+                      cmp(field_ref("dec256_5_3"), field_ref("dec128_3_2")),
+                      cmp(field_ref("dec256_5_3"), cast(field_ref("dec128_3_2"), decimal256(4, 3))),
+                      /*bound_out=*/nullptr, *exciting_schema);
+                  ExpectBindsTo(cmp(field_ref("dec128_5_3"), field_ref("dec256_3_2")),
+                                cmp(cast(field_ref("dec128_5_3"), decimal256(5, 3)),
+                                    cast(field_ref("dec256_3_2"), decimal256(4, 3))),
+                                /*bound_out=*/nullptr, *exciting_schema);
+                  ExpectBindsTo(cmp(field_ref("dec256_3_2"), field_ref("dec128_5_3")),
+                                cmp(cast(field_ref("dec256_3_2"), decimal256(4, 3)),
+                                    cast(field_ref("dec128_5_3"), decimal256(5, 3))),
+                                /*bound_out=*/nullptr, *exciting_schema);

Contributor Author

zanmato1984 Aug 29, 2025

These cases actually pass without this fix. I just added them for complete coverage of all decimal casting paths.
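The cast targets in the expectations above follow a simple pattern, sketched here. This is not Arrow's implementation: `DecimalShape`, `RaiseScale` and `Int64AsDecimal` are hypothetical helpers. They assume that raising a decimal's scale by `d` digits also raises its precision by `d`, and that an `int64` is treated as a decimal with 19 integer digits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical (precision, scale) pair describing a decimal type.
struct DecimalShape {
  int32_t precision;
  int32_t scale;
};

// Raising the scale by d digits needs d extra digits of precision so that
// every existing value still fits: (3, 2) raised to scale 3 becomes (4, 3).
inline DecimalShape RaiseScale(DecimalShape t, int32_t target_scale) {
  assert(target_scale >= t.scale);
  return {t.precision + (target_scale - t.scale), target_scale};
}

// Assumption: an int64 needs up to 19 decimal digits, so matching a decimal
// of scale s gives (19 + s, s), e.g. (21, 2) for scale 2.
inline DecimalShape Int64AsDecimal(int32_t target_scale) {
  constexpr int32_t kInt64Digits = 19;
  return {kInt64Digits + target_scale, target_scale};
}
```

Under these assumptions, `RaiseScale({3, 2}, 3)` gives `(4, 3)` and `Int64AsDecimal(2)` gives `(21, 2)`, which agrees with the `decimal256(4, 3)` and `decimal128(21, 2)` cast targets in the test expectations. An input that already has the larger scale, such as `(5, 3)`, keeps its precision and scale.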

cpp/src/arrow/compute/expression_test.cc Outdated

Comment on lines 742 to 750

+                  // decimal decimal with different scales
+                  ExpectBindsTo(
+                      cmp(field_ref("dec128_3_2"), field_ref("dec128_5_3")),
+                      cmp(cast(field_ref("dec128_3_2"), decimal128(4, 3)), field_ref("dec128_5_3")),
+                      /*bound_out=*/nullptr, *exciting_schema);
+                  ExpectBindsTo(
+                      cmp(field_ref("dec128_5_3"), field_ref("dec128_3_2")),
+                      cmp(field_ref("dec128_5_3"), cast(field_ref("dec128_3_2"), decimal128(4, 3))),
+                      /*bound_out=*/nullptr, *exciting_schema);

Contributor Author

zanmato1984 Aug 29, 2025

These two will fail w/o the fix.

github-actions bot added awaiting committer review and removed awaiting review labels

Contributor Author

zanmato1984 commented Aug 29, 2025

@pitrou pretty easy fix (thanks to #47297), mind taking a look? Thanks.

zanmato1984 requested review from kou and pitrou

September 3, 2025 01:30

zanmato1984 mentioned this pull request

GH-41336: [C++][Compute] Fix case_when kernel dispatch for decimals with different precisions and scales #47479

Merged

zanmato1984 commented

cpp/src/arrow/compute/expression_test.cc

               }
               TEST(Expression, BindWithImplicitCasts) {
+                auto exciting_schema = schema(

Contributor Author

zanmato1984 Sep 3, 2025

Get excited one more time! @pitrou

Member

pitrou Sep 3, 2025

My head spins.


          Merge remote-tracking branch 'apache/main' into fix/apachegh-41011

d3aee89

pitrou reviewed

cpp/src/arrow/compute/expression_test.cc Outdated

+                                cmp(cast(field_ref("i64"), decimal128(21, 2)), field_ref("dec128_3_2")),
+                                /*bound_out=*/nullptr, *exciting_schema);
+                  // decimal128 decimal256 with different scales

Member

pitrou Sep 3, 2025

What happens with the same scale and different precisions, btw? Ideally no cast would occur since the raw values can be compared directly, but regardless we might add tests for that situation as well?

Contributor Author

zanmato1984 Sep 3, 2025

You are right, no casts are applied for different precisions with the same scale.

Added the cases. Thank you!
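A rough sketch of why no cast is needed in that situation, assuming both inputs have the same bit width and are stored as unscaled integers. `ToyDecimalType`, `NeedsRescale` and `RawLess` are hypothetical names for this example, not Arrow APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical (precision, scale) description of a decimal type.
struct ToyDecimalType {
  int32_t precision;
  int32_t scale;
};

// Precision only bounds how many digits a value may have; it does not change
// what an unscaled integer means. Only a scale mismatch forces a rescale.
inline bool NeedsRescale(ToyDecimalType a, ToyDecimalType b) {
  return a.scale != b.scale;
}

// With equal scales, unscaled integers order exactly like the values they encode.
inline bool RawLess(int64_t a_unscaled, int64_t b_unscaled) {
  return a_unscaled < b_unscaled;
}
```

For example, 1.23 as `(3, 2)` has unscaled value 123 and 123.45 as `(5, 2)` has unscaled value 12345; `NeedsRescale` is false for these two types and `RawLess(123, 12345)` orders them correctly.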

pitrou reviewed

cpp/src/arrow/compute/expression_test.cc Outdated

pitrou approved these changes

Member

pitrou left a comment

LGTM on the principle, some small suggestions

zanmato1984 and others added 2 commits

September 4, 2025 01:02


          Update cpp/src/arrow/compute/expression_test.cc

f2c4c91

Co-authored-by: Antoine Pitrou <pitrou@free.fr>


          Address comment: add cases for different precisions same scale

530b929

Contributor Author

zanmato1984 commented Sep 4, 2025

I'll merge. Thanks for reviewing @pitrou !

zanmato1984 merged commit 2987165 into apache:main

39 checks passed

zanmato1984 removed the awaiting committer review label

This was referenced Sep 4, 2025

[C++] Decimal types array compare with different scales make error results #41011

Closed

GH-41011: [C++] Add an output type resolver for decimal types in CompareFunction so can be casted correctly #41012

Closed

conbench-apache-arrow bot commented Sep 4, 2025

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 2987165.

There weren't enough matching historic benchmark results to make a call on whether there were regressions.

The full Conbench report has more details.
