A nightly run today on a branch based on main with no changes (link) shows 3 data parallel PCC failures:
FAILED tests/models/hardnet/test_hardnet.py::test_hardnet[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.98, Got lowest pcc 0.9792845616761784
Detail:
output 0:0.9792845616761784 [req > 0.98]
FAILED tests/models/deit/test_deit.py::test_deit[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.97, Got lowest pcc 0.9563501727850022
Detail:
output 0:0.9563501727850022 [req > 0.97]
FAILED tests/models/whisper/test_whisper.py::test_whisper[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.99, Got lowest pcc 0.9663857527192973
Detail:
output 1:0.9663857527192973 [req > 0.99]
====== 3 failed, 26 passed, 1 skipped, 126 warnings in 3351.47s (0:55:51) ======
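For reference, the check behind these messages can be sketched as follows. This is a minimal illustration that assumes PCC means the Pearson correlation coefficient between the flattened reference output and the device output, and that a test passes when the value is strictly greater than the required threshold (as the `[req > 0.98]` detail suggests). The names `pcc` and `check_pcc` are hypothetical and are not taken from the test harness.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient of two equal-length sequences of floats."""
    if len(expected) != len(actual) or len(expected) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    # Covariance and variances (un-normalized; the 1/n factors cancel out).
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        raise ValueError("PCC is undefined for constant input")
    return cov / math.sqrt(var_e * var_a)


def check_pcc(expected, actual, required):
    """Return (passed, value); assumes a strict 'value > required' comparison."""
    value = pcc(expected, actual)
    return value > required, value
```

Under this assumption, a drop from 0.98+ to 0.979 is enough to flip hardnet from passing to failing, even though the numerical change is small.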
All 3 were seeing 0.98+ yesterday at 8e7327e, but today at de2fd6f, after the tt-xla uplift, they somehow dropped to 0.979 (hardnet), 0.956 (deit), and 0.966 (whisper).
I am going to lower the PCC thresholds or disable PCC checking for these tests in a PR today and tag this ticket.
For some reason I am not able to reproduce this locally on aus-wh-07 at the same commit; the following command passes:
pytest -vv tests/models/hardnet/test_hardnet.py::test_hardnet[data_parallel-full-eval] tests/models/deit/test_deit.py::test_deit[data_parallel-full-eval] tests/models/whisper/test_whisper.py::test_whisper[data_parallel-full-eval] --junit-xml=three_dp_tests.xml |& tee three_dp_tests.log
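To compare the local run against the nightly run, the JUnit XML written by `--junit-xml` can be scanned for failing cases. The sketch below assumes the standard pytest JUnit layout (`testcase` elements with an optional `failure` or `error` child); the helper name `failed_tests` is hypothetical, and whether the nightly job also publishes a JUnit XML artifact is not something this ticket confirms.

```python
import xml.etree.ElementTree as ET


def failed_tests(junit_xml_text):
    """Return (classname, name, message) for each failed or errored testcase."""
    root = ET.fromstring(junit_xml_text)
    failures = []
    # iter() finds testcase elements whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        for child in case:
            if child.tag in ("failure", "error"):
                failures.append(
                    (case.get("classname", ""), case.get("name", ""), child.get("message", ""))
                )
                break  # record each testcase at most once
    return failures
```

For the command above, this would be called as `failed_tests(open("three_dp_tests.xml").read())`. An empty list matches the passing local run.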