Fix test_hash_functions failures on big-endian architectures #1130

Open

ijatinydv wants to merge 2 commits into fortran-lang:master from ijatinydv:fix-hash-big-endian

Conversation

@ijatinydv
Contributor

Resolves Debian build failures on big-endian architectures (s390x, powerpc, sparc64) reported in issue #1128

The Issue:
The test_hash_functions suite was failing on big-endian (BE) machines. The root cause was a mismatch in design philosophies: the Fortran implementations of nmhash and waterhash explicitly normalize byte reads to little-endian so that hashes are consistent across platforms, while the vendored C reference implementations (used as test oracles) rely on the native memory layout. As a result, the C oracles produced different hashes on BE machines and the assertions failed.

pengyhash and SpookyV2 use native reads in both C and Fortran, so they were already perfectly aligned on BE.

Changes in this PR:

  • Test Infrastructure: Replaced the hard error stop in test_little_endian with skip_test on BE systems so the rest of the suite can actually run.
  • C Reference Patches (Test-only): Patched waterhash.h (added __BYTE_ORDER__ swaps) and nmhash.h (fixed the 16-bit union multiplication pairing) so the C test code accurately mirrors the Fortran LE-normalization on BE platforms.
  • Self-Consistency Tests: Added two new pure-Fortran tests (test_hash_determinism and test_hash_distribution) that run on all platforms. This ensures our Fortran algorithms aren't mathematically degenerate on BE, independent of the C reference code.
  • Docs: Added a small note to the API docs warning users that pengyhash and SpookyV2 produce endian-dependent output and shouldn't be used for cross-architecture verification.

Since all C-level changes are guarded by preprocessor macros (#if defined(__BYTE_ORDER__) and #if NMHASH_LITTLE_ENDIAN), they compile to the existing code paths on little-endian x86/ARM hardware and should introduce no regressions there.

Closes #1128

Member

@jvdp1 jvdp1 left a comment


Thank you @ijatinydv for this fix.

I wonder if it would be possible to test on BE platforms in the CI/CD?

Contributor

Copilot AI left a comment


Pull request overview

Fixes test_hash_functions failures on big-endian architectures by aligning the C reference/oracle behavior with the Fortran implementations (which normalize some reads to little-endian), allowing the suite to run and validate correctly across BE/LE platforms.

Changes:

  • Patch C reference headers (waterhash.h, nmhash.h) to produce LE-normalized results on big-endian systems.
  • Change the little_endian test to skip (instead of fail) on BE so the rest of the suite can execute.
  • Add two pure-Fortran “self-consistency” tests (determinism + basic distribution) and document endian-dependent hashes in the API docs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

  • test/hash_functions/waterhash.h: Adds BE byte-swaps for 16/32-bit reads to mirror Fortran LE normalization.
  • test/hash_functions/nmhash.h: Adjusts scalar 16-bit lane multiplies/packing to be endian-correct in the C oracle.
  • test/hash_functions/test_hash_functions.f90: Skips the little-endian-only test on BE; adds platform-independent determinism/distribution tests.
  • doc/specs/stdlib_hash_procedures.md: Documents which hashes are endian-dependent vs cross-architecture stable.


Comment on lines 215 to 228
!> Test that different inputs produce different hashes (basic avalanche)
!> This test runs on ALL platforms (LE and BE)
subroutine test_hash_distribution(error)
    !> Error handling
    type(error_type), allocatable, intent(out) :: error

    integer(int8) :: key_a(8), key_b(8)
    integer(int32) :: h32_a, h32_b
    integer(int64) :: h64_a, h64_b

    key_a = [1_int8, 2_int8, 3_int8, 4_int8, &
             5_int8, 6_int8, 7_int8, 8_int8]
    key_b = [1_int8, 2_int8, 3_int8, 4_int8, &
             5_int8, 6_int8, 7_int8, 9_int8] ! differs in last byte

Copilot AI Feb 25, 2026


The comment says this is a "basic avalanche" test, but the implementation only checks hash(a) /= hash(b) for one 1-byte difference and does not include spooky_hash. Either reword the comment to match what’s being tested (a minimal collision/sanity check), or expand the test to cover the intended avalanche/distribution behavior and include spooky_hash as well.

@codecov

codecov bot commented Feb 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.44%. Comparing base (7ca8b88) to head (fd5c09f).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1130      +/-   ##
==========================================
+ Coverage   68.31%   68.44%   +0.12%     
==========================================
  Files         399      399              
  Lines       12788    12833      +45     
  Branches     1383     1391       +8     
==========================================
+ Hits         8736     8783      +47     
+ Misses       4052     4050       -2     


@ijatinydv
Contributor Author

> Thank you @ijatinydv for this fix.
>
> I wonder if it would be possible to test on BE platforms in the CI/CD?

Yes, we can definitely do this! I looked into it, and the best approach would be to use the uraimo/run-on-arch-action to spin up an s390x (IBM Z) container using QEMU emulation.

The only catch is that QEMU is slow, typically 5 to 10 times slower than running natively. To keep it from bottlenecking our CI, we'd need to optimize it a bit. I'd suggest:

  • Scoping it down: Only compiling and running the test_hash_functions target, disabling BLAS, and lowering the maximum rank to speed up the build.
  • Running it separately: Putting it in its own workflow that only triggers nightly or on merges to main. This way, it doesn't block regular PRs or burn through GitHub Actions minutes.

Since setting up and tuning the QEMU CI might take a little trial and error, I'd be happy to tackle it in a separate follow-up PR so it doesn't delay this fix for the Debian builds. However, I'm completely open to whatever you think is best! Just let me know how you'd like to proceed.

@ijatinydv
Contributor Author

@jvdp1 Just pushed a quick commit to address the AI review catches! I made the big-endian macros more portable across different compilers, added the missing spooky_hash check, and added a quick note clarifying the math behind the collision sanity test. Everything should be fully ready to go now!



Development

Successfully merging this pull request may close these issues.

stdlib fails test_hash_functions on big-endian systems

3 participants