GH-46739: [C++] Fix Float16 signed zero/NaN equality comparisons #46973

Merged
pitrou merged 15 commits into apache:main from benibus:GH-46739-incorrect-float16-compare on Sep 4, 2025

benibus (Contributor) commented Jul 2, 2025

Rationale for this change

Equality comparisons between half-floats (used in their scalar/array Equals methods) do not properly handle EqualOptions::nans_equal and EqualOptions::signed_zeros_equal.
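
As a rough illustration of the intended semantics, here is a minimal standalone sketch that works on raw IEEE 754 binary16 bit patterns. It is not the code from this PR: `HalfEqualOptions`, `IsHalfNaN`, `IsHalfZero`, and `HalfEquals` are hypothetical names, and the struct only mirrors the two `EqualOptions` flags mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two EqualOptions flags discussed above.
struct HalfEqualOptions {
  bool nans_equal;
  bool signed_zeros_equal;
};

// NaN in binary16: all five exponent bits set and a non-zero mantissa.
constexpr bool IsHalfNaN(std::uint16_t bits) { return (bits & 0x7FFF) > 0x7C00; }

// Both +0.0 (0x0000) and -0.0 (0x8000) have every non-sign bit clear.
constexpr bool IsHalfZero(std::uint16_t bits) { return (bits & 0x7FFF) == 0; }

// Compares two half-floats given as raw bit patterns.
inline bool HalfEquals(std::uint16_t a, std::uint16_t b, HalfEqualOptions opts) {
  if (IsHalfNaN(a) || IsHalfNaN(b)) {
    // NaNs match only each other, and only when the caller opts in.
    return opts.nans_equal && IsHalfNaN(a) && IsHalfNaN(b);
  }
  if (IsHalfZero(a) && IsHalfZero(b)) {
    // +0.0 and -0.0 differ in the sign bit only.
    return opts.signed_zeros_equal || a == b;
  }
  // Every other value has exactly one bit pattern.
  return a == b;
}

// Usage sketch: the same inputs compare differently depending on the flags.
inline void CheckHalfEquals() {
  const std::uint16_t kNaN = 0x7E00, kPosZero = 0x0000, kNegZero = 0x8000;
  assert(!HalfEquals(kNaN, kNaN, {false, true}));
  assert(HalfEquals(kNaN, kNaN, {true, true}));
  assert(HalfEquals(kPosZero, kNegZero, {false, true}));
  assert(!HalfEquals(kPosZero, kNegZero, {false, false}));
}
```

A plain `a == b` on the bit patterns would treat identical NaN patterns as equal and the two zeros as different, regardless of either flag.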

What changes are included in this PR?

  • Internal fixes to the current comparison behavior and additional tests as needed
  • Prevents Float16 NaNs from being randomly generated by test utilities by default (matching behavior for float/double)

Are these changes tested?

Yes

Are there any user-facing changes?

No

github-actions bot commented Jul 2, 2025

⚠️ GitHub issue #46739 has been automatically assigned in GitHub to PR creator.

github-actions bot commented Jul 4, 2025

⚠️ GitHub issue #46739 has been automatically assigned in GitHub to PR creator.

github-actions bot commented Jul 5, 2025

⚠️ GitHub issue #46739 has been automatically assigned in GitHub to PR creator.

benibus marked this pull request as ready for review July 5, 2025 01:27

pitrou (Member) left a comment

Thanks for doing this, @benibus. This is a useful fix; here are a couple of comments and suggestions.

github-actions bot added the awaiting committer review label and removed the awaiting review label Jul 8, 2025
benibus force-pushed the GH-46739-incorrect-float16-compare branch from 9cb6072 to f6b74b2 July 11, 2025 01:30

pitrou (Member) commented Aug 21, 2025

@benibus Is this ready for review again?

benibus (Contributor Author) commented Aug 22, 2025

@pitrou Yes, sorry. Feel free to take another look.

pitrou (Member) left a comment

LGTM in general, some additional comments below

This doesn't fix GenerateTypedData when nan_probability_ is non-zero (it will use std::numeric_limits<uint16_t>::quiet_NaN() which is 0 and translates to Float16(0.0)).
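
The standard-library behavior behind this remark can be checked in isolation. The sketch below assumes nothing about Arrow itself: `kHalfQuietNaNBits` and `IsHalfNaNBits` are hypothetical names, and 0x7E00 is just one valid binary16 quiet-NaN encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Integer types cannot represent NaN: has_quiet_NaN is false and
// quiet_NaN() returns a value-initialized integer, i.e. 0.
static_assert(!std::numeric_limits<std::uint16_t>::has_quiet_NaN,
              "uint16_t has no NaN");
static_assert(std::numeric_limits<std::uint16_t>::quiet_NaN() == 0,
              "quiet_NaN() degrades to 0 for integers");

// One valid binary16 quiet-NaN pattern: exponent all ones, top mantissa bit set.
constexpr std::uint16_t kHalfQuietNaNBits = 0x7E00;

// NaN in binary16: all five exponent bits set and a non-zero mantissa.
constexpr bool IsHalfNaNBits(std::uint16_t bits) { return (bits & 0x7FFF) > 0x7C00; }

// Usage sketch: the integer "NaN" is the bit pattern of +0.0, not of a NaN.
inline void CheckNaNBits() {
  assert(!IsHalfNaNBits(std::numeric_limits<std::uint16_t>::quiet_NaN()));
  assert(IsHalfNaNBits(kHalfQuietNaNBits));
}
```

So a generator that stores half-floats as `uint16_t` has to supply the NaN bit pattern itself rather than rely on `std::numeric_limits`.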

DistributionType will be std::uniform_int_distribution<uint16_t> which will certainly not respect the min and max values once translated to Float16?

Perhaps ValueType needs to be Float16 here to make sure we don't misuse uint16_t like this, and DistributionType could be ::arrow::random::uniform_real_distribution<float>.
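
To make the concern concrete, the sketch below enumerates every 16-bit pattern, decodes it as an IEEE 754 binary16 value, and counts how many land inside a requested range. It is only an illustration of why uniformly drawn `uint16_t` values do not respect float bounds; `HalfBitsToFloat` and `CountPatternsInRange` are hypothetical helpers, not Arrow functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decodes an IEEE 754 binary16 bit pattern into a float.
inline float HalfBitsToFloat(std::uint16_t bits) {
  const bool negative = (bits & 0x8000) != 0;
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;
  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    magnitude = std::ldexp(static_cast<float>(mantissa), -24);
  } else if (exponent == 0x1F) {
    magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 25).
    magnitude = std::ldexp(static_cast<float>(mantissa + 1024), exponent - 25);
  }
  return negative ? -magnitude : magnitude;
}

// Counts how many of the 65536 bit patterns decode to a value in [min, max].
inline int CountPatternsInRange(float min, float max) {
  int count = 0;
  for (int bits = 0; bits <= 0xFFFF; ++bits) {
    const float value = HalfBitsToFloat(static_cast<std::uint16_t>(bits));
    if (value >= min && value <= max) ++count;  // NaN fails both comparisons
  }
  return count;
}

// Usage sketch: fewer than half of all patterns fall inside [-1, 1].
inline void CheckRangeCoverage() {
  assert(CountPatternsInRange(-1.0f, 1.0f) == 30722);
}
```

Drawing a `float` from a real-valued distribution over [min, max] and then converting it to Float16, as suggested above, avoids the problem because the bounds are applied to values rather than to bit patterns.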

Comment on lines 584 to 587
I wouldn't expect these checks in a helper function, especially as we supposedly have unit tests for this already (otherwise we should add them).

Perhaps use std::decay_t instead of std::remove_reference_t?
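
The practical difference between the two traits shows up with const-qualified arguments. The code under review is not shown here, so this is a generic sketch; `IsIntAfterDecay` and `IsIntAfterRemoveReference` are hypothetical helpers.

```cpp
#include <cassert>
#include <type_traits>

// remove_reference_t strips only the reference; cv-qualifiers survive.
static_assert(std::is_same<std::remove_reference_t<const int&>, const int>::value,
              "const is kept");
// decay_t also strips top-level const/volatile (and decays arrays and
// functions to pointers), much like passing an argument by value.
static_assert(std::is_same<std::decay_t<const int&>, int>::value,
              "const is dropped");

// Hypothetical type checks built on each trait.
template <typename T>
constexpr bool IsIntAfterDecay() {
  return std::is_same<std::decay_t<T>, int>::value;
}

template <typename T>
constexpr bool IsIntAfterRemoveReference() {
  return std::is_same<std::remove_reference_t<T>, int>::value;
}

// Usage sketch: only the decay-based check accepts a const reference.
inline void CheckTraits() {
  assert(IsIntAfterDecay<const int&>());
  assert(!IsIntAfterRemoveReference<const int&>());
}
```

A type comparison written with `std::remove_reference_t` can therefore silently fail for `const` arguments, which is presumably the motivation for the suggestion.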

We should probably keep the check here by using if constexpr as below?

Comment on lines 295 to 297
Since we are changing this, perhaps ASSERT_TRUE(set.emplace(...).second) would be better?
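
The suggestion relies on the return value of `emplace`. A minimal sketch with a standard container follows; the container and element types are assumptions for illustration, since the test code being reviewed is not shown, and `InsertUnique` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// emplace() returns std::pair<iterator, bool>; .second is true only when
// a new element was actually inserted.
inline bool InsertUnique(std::unordered_set<std::string>& seen,
                         const std::string& value) {
  return seen.emplace(value).second;
}

// Usage sketch: the second insertion of the same value reports false.
inline void CheckInsertUnique() {
  std::unordered_set<std::string> seen;
  assert(InsertUnique(seen, "a"));
  assert(!InsertUnique(seen, "a"));
  assert(seen.size() == 1);
}
```

Wrapping the call as `ASSERT_TRUE(set.emplace(...).second)` in a GoogleTest test makes an unexpected duplicate fail at the insertion site instead of going unnoticed.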

benibus force-pushed the GH-46739-incorrect-float16-compare branch from f6b74b2 to 866f533 September 2, 2025 01:29
benibus force-pushed the GH-46739-incorrect-float16-compare branch from 866f533 to fb60726 September 4, 2025 01:09
benibus requested a review from pitrou September 4, 2025 02:05

pitrou (Member) left a comment

LGTM, just one last thing

Comment on lines +1160 to +1161
ARROW_LOG(INFO) << "min = " << min_value.ToFloat();
ARROW_LOG(INFO) << "max = " << max_value.ToFloat();
You probably mean to remove these :)

benibus (Contributor Author) replied:

Ah, thanks! Just removed them.

pitrou (Member) commented Sep 4, 2025

@github-actions crossbow submit -g cpp

github-actions bot commented Sep 4, 2025

Revision: 116b975

Submitted crossbow builds: ursacomputing/crossbow @ actions-868dcfc65b

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-42-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

pitrou (Member) commented Sep 4, 2025

The Valgrind failure is unrelated, see #47496

pitrou merged commit caf4f70 into apache:main Sep 4, 2025
39 checks passed
pitrou removed the awaiting committer review label Sep 4, 2025

pitrou (Member) commented Sep 4, 2025

Thanks a lot @benibus !

@conbench-apache-arrow

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit caf4f70.

There weren't enough matching historic benchmark results to make a call on whether there were regressions.

The full Conbench report has more details.

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Oct 15, 2025
apache#46973)

* GitHub Issue: apache#46739

Authored-by: Benjamin Harkins <benpharkins@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>