Generates bit-for-bit hashes for each validation driver using EAMxx's bfbhash approach. #504
jeff-cohere merged 4 commits into main
…xx bfbhash approach.
This PR uses a new Skywalker feature to traverse all input and output data in an ensemble, creating a unique 64-bit integer hash from all the data that can be used to determine whether answers are bit-for-bit identical, just as is done in EAMxx. Currently, two hashes are generated for each ensemble: one for inputs and another for outputs. The hashes are printed to stdout. Sample driver output:
```
/home/jeff/projects/sandia/mam4xx/build/src/validation/nucleation/nucleation_driver: reading /home/jeff/projects/sandia/mam4xx/src/validation/mam_x_validation/nucleation/vehkamaki2002_fig8.yaml
/home/jeff/projects/sandia/mam4xx/build/src/validation/nucleation/nucleation_driver: writing mam4xx_vehkamaki2002_fig8.py
mam4xx hash> exe=nucleation_driver date=2026-01-25-81376 input: 18cabe63e2b9de91 output: d5b105eba0835081
```
With this in place, we can discuss how we'd like to compare these hashes against baselines so we can understand when we're making non-BFB changes.
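To make the idea concrete, here is a minimal sketch of how a 64-bit hash over floating-point data can detect non-BFB changes. It is not the code in `eamxx_bfbhash.hpp` or in this PR: the names `HashType`, `accumulate`, and `hash_values` are hypothetical, and the combining step is assumed to be a plain wrapping sum of each value's bit pattern.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names throughout; this is not the eamxx_bfbhash.hpp API.
using HashType = std::uint64_t;

// Fold the raw bit pattern of one double into the accumulator.
// Unsigned addition wraps modulo 2^64, so the result does not depend on
// the order in which the values are visited.
inline void accumulate(double value, HashType &accum) {
  static_assert(sizeof(double) == sizeof(HashType),
                "this sketch expects a 64-bit double");
  HashType bits;
  std::memcpy(&bits, &value, sizeof(bits));
  accum += bits;
}

// Hash every value in a flat array of ensemble data.
inline HashType hash_values(const std::vector<double> &values) {
  HashType accum = 0;
  for (double v : values) {
    accumulate(v, accum);
  }
  return accum;
}
```

Because the hash works on bit patterns rather than numerical values, a change in the last bit of a single value changes the hash, and `0.0` and `-0.0` hash differently. A plain sum cannot detect reordered data, which is a deliberate trade-off in this sketch: it makes the result independent of traversal order.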
I think someone more Special than me has to initiate the autotester. But this will keep while we discuss a mam4xx BFB methodology.
I approved the PR to let AT2 know that it’s okay to run the tests.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #504      +/-   ##
==========================================
- Coverage   93.31%   93.31%   -0.01%
==========================================
  Files         301      302       +1
  Lines       24486    24542      +56
  Branches     2806     2816      +10
==========================================
+ Hits        22849    22901      +52
- Misses       1637     1641       +4
odiazib left a comment
Thanks for adding this feature.
#define MAM4xx_BFBHASH_HPP

// This was copied from Andrew Bradley's original implementation at
// E3SM/components/eamxx/src/share/util/eamxx_bfbhash.hpp.
@bartgol Are there any plans to move this eamxx_bfbhash code to EKAT?
I don't know, but that might be a good idea.
It's just one header file, and we'd have to do some gymnastics to avoid the inclusion of MPI on mam4xx standalone builds (or just live with it and incorporate MPI, which is probably fine).
Here's an example of how to get the hashes for a given commit from the testing logs (and what it looks like):
Are the other tests in CTest always the same? We could just use vi -d to compare the outputs from master and our branch.
It would be great if we could produce a hash per variable in the output. I don’t know if this is already implemented, but it could be done in a follow-up PR.
I was thinking about that as an option, too. It would be simple, but more output. Maybe we should merge this one and I can open an issue where we can discuss these options (no BFB hash output, hash per test, hash per output, etc.)?
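For the hash-per-variable option, a rough sketch of what that could look like follows. Everything here is an assumption for discussion: `NamedData`, `hash_per_variable`, and `changed_variables` are hypothetical names, the data layout is a simple map from variable name to values rather than whatever Skywalker exposes, and the hash is a plain wrapping sum of bit patterns.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical types; Skywalker's real ensemble data structures differ.
using HashType = std::uint64_t;
using NamedData = std::map<std::string, std::vector<double>>;
using NamedHashes = std::map<std::string, HashType>;

// Compute one hash per named variable instead of one per ensemble.
inline NamedHashes hash_per_variable(const NamedData &data) {
  NamedHashes hashes;
  for (const auto &entry : data) {
    HashType accum = 0;
    for (double v : entry.second) {
      HashType bits;
      std::memcpy(&bits, &v, sizeof(bits));
      accum += bits;  // wraps modulo 2^64
    }
    hashes[entry.first] = accum;
  }
  return hashes;
}

// List the variables whose hashes differ, or that exist on only one side.
inline std::vector<std::string> changed_variables(const NamedHashes &baseline,
                                                  const NamedHashes &current) {
  std::vector<std::string> changed;
  for (const auto &entry : baseline) {
    const auto it = current.find(entry.first);
    if (it == current.end() || it->second != entry.second) {
      changed.push_back(entry.first);
    }
  }
  for (const auto &entry : current) {
    if (baseline.find(entry.first) == baseline.end()) {
      changed.push_back(entry.first);
    }
  }
  return changed;
}
```

The extra output buys a more useful failure message: a non-BFB change can be reported by variable name instead of as a single mismatched ensemble hash.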
@odiazib, can you re-initiate the autotester? I marked this as Ready for Review.
We could probably maintain a baseline whose content is the
I think we should merge this PR and start using this feature, as you recommended. After experimenting with it, we will have a better idea of how to compare against baselines.
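One way to experiment with baseline comparison is to pull the hashes out of the driver's stdout and compare them with stored values. The sketch below assumes only that the log contains the tokens `input:` and `output:`, each followed by a hexadecimal hash, as in the sample driver output. `EnsembleHashes`, `parse_hashes`, and `is_bfb` are hypothetical names, not part of mam4xx or Skywalker.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical container for the two hashes a driver prints.
struct EnsembleHashes {
  std::uint64_t input = 0;
  std::uint64_t output = 0;
  bool found_input = false;
  bool found_output = false;
};

// Scan whitespace-separated tokens for "input:" and "output:" labels and
// parse the token that follows each label as a hexadecimal 64-bit value.
// std::stoull throws if that token is not valid hexadecimal.
inline EnsembleHashes parse_hashes(const std::string &log_text) {
  EnsembleHashes hashes;
  std::istringstream tokens(log_text);
  std::string token;
  while (tokens >> token) {
    if (token != "input:" && token != "output:") {
      continue;
    }
    std::string hex;
    if (!(tokens >> hex)) {
      break;  // label at end of text with no value after it
    }
    const std::uint64_t value = std::stoull(hex, nullptr, 16);
    if (token == "input:") {
      hashes.input = value;
      hashes.found_input = true;
    } else {
      hashes.output = value;
      hashes.found_output = true;
    }
  }
  return hashes;
}

// Two runs count as bit-for-bit only if both hashes were found in each log
// and both pairs match.
inline bool is_bfb(const EnsembleHashes &baseline,
                   const EnsembleHashes &current) {
  return baseline.found_input && baseline.found_output &&
         current.found_input && current.found_output &&
         baseline.input == current.input &&
         baseline.output == current.output;
}
```

Checking the input hash as well as the output hash distinguishes "the code changed the answers" from "the test inputs changed", which matters when deciding whether a baseline needs to be regenerated.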
This PR uses a new Skywalker feature to traverse all input and output data in an ensemble,
creating a unique 64-bit integer hash from all the data that can be used to determine
whether answers are bit-for-bit identical, just like is done in EAMxx.
Currently, two hashes are generated for each ensemble: one for inputs and another for
outputs. The hashes are printed to stdout, as in the sample driver output above.
With this in place, we can discuss how we'd like to compare these hashes against
baselines so we can understand when we're making non-BFB changes. Maybe we can experiment
with this branch to determine (a) if things work as we expect and (b) how best to make use of this
information.
Closes #496