Fix test_hash_functions failures on big-endian architectures #1130

Open

ijatinydv wants to merge 2 commits into fortran-lang:master from ijatinydv:fix-hash-big-endian

Conversation

@ijatinydv
Contributor

Resolves Debian build failures on big-endian architectures (s390x, powerpc, sparc64) reported in issue #1128

The Issue:
The test_hash_functions suite was failing on big-endian (BE) machines. The root cause was a mismatch in design philosophies: the Fortran implementations of nmhash and waterhash explicitly normalize byte reads to little-endian so that hashes are consistent across platforms, while the vendored C reference implementations (used as test oracles) rely on the native memory layout. As a result, the C oracles produced different hashes on BE machines and the assertions failed.

pengyhash and SpookyV2 use native reads in both C and Fortran, so they were already perfectly aligned on BE.

Changes in this PR:

  • Test Infrastructure: Replaced the hard error stop in test_little_endian with skip_test on BE systems so the rest of the suite can actually run.
  • C Reference Patches (Test-only): Patched waterhash.h (added __BYTE_ORDER__ swaps) and nmhash.h (fixed the 16-bit union multiplication pairing) so the C test code accurately mirrors the Fortran LE-normalization on BE platforms.
  • Self-Consistency Tests: Added two new pure-Fortran tests (test_hash_determinism and test_hash_distribution) that run on all platforms. This ensures our Fortran algorithms aren't mathematically degenerate on BE, independent of the C reference code.
  • Docs: Added a small note to the API docs warning users that pengyhash and SpookyV2 produce endian-dependent output and shouldn't be used for cross-architecture verification.

Since all C-level changes are guarded by preprocessor macros (#if defined(__BYTE_ORDER__) and #if NMHASH_LITTLE_ENDIAN), they compile to the existing code paths on little-endian x86/ARM hardware and should introduce no regressions there.

Closes #1128

Member

@jvdp1 jvdp1 left a comment


Thank you @ijatinydv for this fix.

I wonder if it would be possible to test on BE platforms in the CI/CD?

Contributor

Copilot AI left a comment


Pull request overview

Fixes test_hash_functions failures on big-endian architectures by aligning the C reference/oracle behavior with the Fortran implementations (which normalize some reads to little-endian), allowing the suite to run and validate correctly across BE/LE platforms.

Changes:

  • Patch C reference headers (waterhash.h, nmhash.h) to produce LE-normalized results on big-endian systems.
  • Change the little_endian test to skip (instead of fail) on BE so the rest of the suite can execute.
  • Add two pure-Fortran “self-consistency” tests (determinism + basic distribution) and document endian-dependent hashes in the API docs.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

  • test/hash_functions/waterhash.h: Adds BE byte-swaps for 16/32-bit reads to mirror Fortran LE normalization.
  • test/hash_functions/nmhash.h: Adjusts scalar 16-bit lane multiplies/packing to be endian-correct in the C oracle.
  • test/hash_functions/test_hash_functions.f90: Skips the little-endian-only test on BE; adds platform-independent determinism/distribution tests.
  • doc/specs/stdlib_hash_procedures.md: Documents which hashes are endian-dependent vs cross-architecture stable.


Comment on lines 215 to 228
!> Test that different inputs produce different hashes (basic avalanche)
!> This test runs on ALL platforms (LE and BE)
subroutine test_hash_distribution(error)
    !> Error handling
    type(error_type), allocatable, intent(out) :: error

    integer(int8) :: key_a(8), key_b(8)
    integer(int32) :: h32_a, h32_b
    integer(int64) :: h64_a, h64_b

    key_a = [1_int8, 2_int8, 3_int8, 4_int8, &
             5_int8, 6_int8, 7_int8, 8_int8]
    key_b = [1_int8, 2_int8, 3_int8, 4_int8, &
             5_int8, 6_int8, 7_int8, 9_int8] ! differs in last byte

Copilot AI Feb 25, 2026


The comment says this is a "basic avalanche" test, but the implementation only checks hash(a) /= hash(b) for one 1-byte difference and does not include spooky_hash. Either reword the comment to match what’s being tested (a minimal collision/sanity check), or expand the test to cover the intended avalanche/distribution behavior and include spooky_hash as well.

@codecov

codecov bot commented Feb 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.44%. Comparing base (7ca8b88) to head (fd5c09f).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1130      +/-   ##
==========================================
+ Coverage   68.31%   68.44%   +0.12%     
==========================================
  Files         399      399              
  Lines       12788    12833      +45     
  Branches     1383     1391       +8     
==========================================
+ Hits         8736     8783      +47     
+ Misses       4052     4050       -2     


@ijatinydv
Contributor Author

> Thank you @ijatinydv for this fix.
>
> I wonder if it would be possible to test on BE platforms in the CI/CD?

Yes, we can definitely do this! I looked into it, and the best approach would be to use the uraimo/run-on-arch-action to spin up an s390x (IBM Z) container using QEMU emulation.

The only catch is that QEMU is slow, typically 5 to 10 times slower than running natively. To keep it from bottlenecking our CI, we'd need to optimize it a bit. I'd suggest:

  • Scoping it down: Only compiling and running the test_hash_functions target, disabling BLAS, and lowering the maximum rank to speed up the build.
  • Running it separately: Putting it in its own workflow that only triggers nightly or on merges to main. This way, it doesn't block regular PRs or burn through GitHub Actions minutes.

Since setting up and tuning the QEMU CI might take a little trial and error, I'd be happy to tackle it in a separate follow-up PR so it doesn't delay this fix for the Debian builds. However, I'm completely open to whatever you think is best! Just let me know how you'd like to proceed.

@ijatinydv
Contributor Author

@jvdp1 Just pushed a quick commit to address the AI review catches! I made the big-endian macros more portable across different compilers, added the missing spooky_hash check, and added a quick note clarifying the math behind the collision sanity test. Everything should be fully ready to go now!



Development

Successfully merging this pull request may close these issues.

stdlib fails test_hash_functions on big-endian systems

3 participants