loosen lmeval assertions to upper or lower bound (#1477)
SUMMARY:
`lm_eval` end-to-end tests occasionally fail when the actual value is
higher than the expected value, even though we only care about whether
performance has regressed (example run
[here](https://github.com/neuralmagic/llm-compressor-testing/actions/runs/15232878962/job/42842981702)).
This PR loosens the check to assert only:
- actual value > expected value - error tolerance, if a higher score is
better (generally the case)
- actual value < expected value + error tolerance, if a lower score is
better (e.g., for PPL checks, should we add them in the future)
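The loosened check can be sketched as a small helper (a minimal illustration, not the actual test code; the function and parameter names here are hypothetical):

```python
def assert_no_regression(actual, expected, tolerance, higher_is_better=True):
    """Assert only in the regression direction, allowing better-than-expected scores.

    Hypothetical sketch of the loosened lm_eval assertion: scores that
    exceed expectations never fail; only a drop (or rise, for metrics
    like perplexity) beyond the tolerance is treated as a regression.
    """
    if higher_is_better:
        # Fail only if the score dropped below expected minus tolerance
        assert actual > expected - tolerance, (
            f"Regression: {actual} <= {expected} - {tolerance}"
        )
    else:
        # Fail only if the score rose above expected plus tolerance (e.g., PPL)
        assert actual < expected + tolerance, (
            f"Regression: {actual} >= {expected} + {tolerance}"
        )
```

With this, an actual accuracy of 0.85 against an expected 0.80 passes regardless of tolerance, whereas the old two-sided check could fail it.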
TEST PLAN:
- [x] Rerun weekly lm-eval tests before merging this in --
https://github.com/neuralmagic/llm-compressor-testing/actions/runs/15281691638
---------
Signed-off-by: Brian Dellabetta <[email protected]>