@whitneywhtsang whitneywhtsang commented Nov 4, 2024

The IGCVectorizer in the agama 1032 driver has improved. This PR disables the LLVM post-processing that Triton performs by default, which includes the SLPVectorizer.

Disabling LLVM post-processing improves performance on all three key workloads (FA, GEMM, Softmax).
For example, out-of-the-box GEMM performance improves by 16%.
[Image: Screenshot 2024-11-04 184048]

CI: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11664443045, https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11674882693

Signed-off-by: Whitney Tsang <[email protected]>
@whitneywhtsang whitneywhtsang self-assigned this Nov 4, 2024
@chengjunlu

It is good news that we can remove the SLPVectorizer. We have observed that the SLPVectorizer on the Triton side may reorder operations, which makes the IR more complicated for IGC to optimize.

Changes are LGTM.
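
To illustrate the reordering concern, here is a minimal toy sketch of SLP-style packing. It is not LLVM's actual SLPVectorizer algorithm: the `Inst` tuple, the `slp_pack` helper, and the value names are all hypothetical, and the toy IR assumes SSA values with no memory effects. It only shows how grouping same-opcode, independent instructions changes the instruction order a downstream compiler sees.

```python
from collections import namedtuple

# Hypothetical toy IR: each instruction computes dest = lhs <op> rhs.
Inst = namedtuple("Inst", "dest op lhs rhs")


def slp_pack(insts, width=2):
    """Greedily pack same-opcode, independent instructions into groups."""
    groups = []
    remaining = list(insts)
    while remaining:
        seed = remaining.pop(0)
        group = [seed]
        # Values a later instruction must not read if it is to be hoisted
        # next to the seed: results of the group and of skipped instructions.
        blocked = {seed.dest}
        leftover = []
        for cand in remaining:
            reads_blocked = cand.lhs in blocked or cand.rhs in blocked
            if len(group) < width and cand.op == seed.op and not reads_blocked:
                group.append(cand)
            else:
                leftover.append(cand)
            blocked.add(cand.dest)
        remaining = leftover
        groups.append(group)
    return groups


# Original order interleaves add and mul: a0, m0, a1, m1.
example = [
    Inst("a0", "add", "x0", "y0"),
    Inst("m0", "mul", "a0", "z0"),
    Inst("a1", "add", "x1", "y1"),
    Inst("m1", "mul", "a1", "z1"),
]
packed = [[inst.dest for inst in group] for group in slp_pack(example)]
```

In this toy, the two adds are grouped ahead of the first multiply, so the emitted order differs from the original interleaving. A later vectorizer such as the IGCVectorizer then starts from a different instruction sequence than the one the frontend produced.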

@whitneywhtsang whitneywhtsang merged commit eff7b30 into main Nov 5, 2024
5 checks passed
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/postprocess branch November 5, 2024 02:58
@etiotto etiotto left a comment

LGTM

alexbaden added a commit that referenced this pull request Nov 5, 2024
5 participants