A few data_parallel tests have a slight PCC drop in tt-torch on Aug 23 #1195

@kmabeeTT

From today's nightly run on a branch based on main with no changes (link), 3 data_parallel tests fail the PCC check:

FAILED tests/models/hardnet/test_hardnet.py::test_hardnet[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.98, Got lowest pcc 0.9792845616761784
Detail:
	output 0:0.9792845616761784 [req > 0.98]
FAILED tests/models/deit/test_deit.py::test_deit[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.97, Got lowest pcc 0.9563501727850022
Detail:
	output 0:0.9563501727850022 [req > 0.97]
FAILED tests/models/whisper/test_whisper.py::test_whisper[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.99, Got lowest pcc 0.9663857527192973
Detail:
	output 1:0.9663857527192973 [req > 0.99]
====== 3 failed, 26 passed, 1 skipped, 126 warnings in 3351.47s (0:55:51) ======
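For context on what the check above measures: PCC here is presumably the Pearson correlation coefficient between the golden (CPU) output and the device output. The following is a minimal sketch under that assumption, not tt-torch's actual implementation; `pcc` and `check_pcc` are hypothetical names, and the strict `>` comparison is inferred from the `[req > 0.98]` wording in the log.

```python
import math


def pcc(golden, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(golden) != len(actual) or len(golden) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(golden)
    mean_g = sum(golden) / n
    mean_a = sum(actual) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((g - mean_g) * (a - mean_a) for g, a in zip(golden, actual))
    var_g = sum((g - mean_g) ** 2 for g in golden)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_g == 0 or var_a == 0:
        raise ValueError("PCC is undefined for constant input")
    return cov / math.sqrt(var_g * var_a)


def check_pcc(golden, actual, required):
    """Return (passed, value); passes only if value is strictly above `required`."""
    value = pcc(golden, actual)
    return value > required, value
```

Because the comparison is against a fixed threshold, a result such as 0.9793 against a required 0.98 fails even though the absolute gap is under 0.001.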

All 3 were seeing 0.98+ yesterday at 8e7327e, but today, after the tt-xla uplift at de2fd6f, they are at 0.979 (hardnet), 0.956 (deit), and 0.966 (whisper).

I'm going to lower the PCC thresholds or disable PCC checking for these tests in a PR today, and tag this ticket.

For some reason I'm not able to repro locally on aus-wh-07 at the same commit; the following command passes:

pytest -vv  tests/models/hardnet/test_hardnet.py::test_hardnet[data_parallel-full-eval] tests/models/deit/test_deit.py::test_deit[data_parallel-full-eval] tests/models/whisper/test_whisper.py::test_whisper[data_parallel-full-eval] --junit-xml=three_dp_tests.xml |& tee three_dp_tests.log
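When comparing the CI log against a local log such as `three_dp_tests.log`, it can help to pull out the required and observed PCC per test. The sketch below assumes failure lines follow the exact `FAILED ... PCC check failed. Required: X, Got lowest pcc Y` format shown in the nightly output above; `parse_pcc_failures` and `PccFailure` are hypothetical helpers, not part of tt-torch.

```python
import re
from typing import NamedTuple


class PccFailure(NamedTuple):
    test: str
    required: float
    got: float

    @property
    def gap(self) -> float:
        """How far the observed PCC is below the required threshold."""
        return self.required - self.got


# Matches pytest summary lines in the format seen in the nightly log.
_FAILED_RE = re.compile(
    r"^FAILED (?P<test>\S+) - AssertionError: PCC check failed\. "
    r"Required: (?P<required>[0-9.]+), Got lowest pcc (?P<got>[0-9.eE+-]+)",
    re.MULTILINE,
)


def parse_pcc_failures(log_text: str) -> list:
    """Extract every PCC failure from a pytest log as PccFailure records."""
    return [
        PccFailure(m["test"], float(m["required"]), float(m["got"]))
        for m in _FAILED_RE.finditer(log_text)
    ]


# Failure lines copied from the nightly run in this issue.
SAMPLE_LOG = """\
FAILED tests/models/hardnet/test_hardnet.py::test_hardnet[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.98, Got lowest pcc 0.9792845616761784
FAILED tests/models/deit/test_deit.py::test_deit[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.97, Got lowest pcc 0.9563501727850022
FAILED tests/models/whisper/test_whisper.py::test_whisper[data_parallel-full-eval] - AssertionError: PCC check failed. Required: 0.99, Got lowest pcc 0.9663857527192973
"""

if __name__ == "__main__":
    # Print failures sorted by gap, largest first.
    for failure in sorted(parse_pcc_failures(SAMPLE_LOG), key=lambda f: -f.gap):
        print(f"{failure.test}: required {failure.required}, got {failure.got:.4f}")
```

To use it on a local run, replace `SAMPLE_LOG` with the contents of the log file written by the `tee` command above; a passing local run yields an empty list.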
