Expand CWAI to Keep the Weight scales as Constants (#32232)
### Details:
A performance improvement of ~15 ms per chunk (16 chunks per inference)
is seen, netting an E2E inference runtime reduction of ~240 ms.
This patch expands CWAI3 to include additional generalized pattern
matching for keeping weight scales as constants. The performance benefit
is outlined above. This patch introduces a regression for
gaussian_topk_sub, and some ops appeared less efficient in a FW trace
comparison. Additional savings can be brought in once this is resolved;
the regression is tracked in the tickets below.
### Tickets:
- [EISW-183592](https://jira.devtools.intel.com/browse/EISW-183592) -
Bug this PR is related to.
- [EISW-185933](https://jira.devtools.intel.com/browse/EISW-185933) -
Bug this PR introduces. A performance benefit is still seen, but a larger
benefit is expected once this issue is resolved.