Reenable some mxfp tests for XPU #5379
Conversation
if (BLOCK_M, BLOCK_N, BLOCK_K) == (128, 256, 256) and CONST_SCALE and "GPU Max 1100" in torch.xpu.get_device_name():
    pytest.skip("Large block size for CONST_SCALE mxfp matmul does not fit in XPU Max 1100 memory")
To avoid being tied to the GPU name, check the available shared memory instead:
# check maximum shared memory
if triton.runtime.driver.active.utils.get_device_properties(
        triton.runtime.driver.active.get_current_device())["max_shared_mem"] <= 196608:
    pytest.xfail("XPU: Not enough shared memory")
done.
This PR lowers the pass rate from 93.9% to 85.36%, IMO the …
@whitneywhtsang xfail is not an option at this point, as it would increase the total execution time to many hours. Where is this pass rate drawn from? Perhaps we can filter out those newly skipped matmul tests until the execution time is fixed?
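A minimal sketch of one way to deselect those tests at collection time via a `conftest.py` hook; the test name `test_mxfp_matmul` and the parametrized-ID substring are hypothetical placeholders for whatever the newly skipped cases actually look like:

```python
# conftest.py -- a hedged sketch, not the repository's actual configuration.
# Assumes the newly skipped cases are CONST_SCALE mxfp matmul tests with the
# (128, 256, 256) block shape, encoded in the parametrized pytest IDs.
def pytest_collection_modifyitems(config, items):
    deselected = []
    kept = []
    for item in items:
        # Parametrized IDs look like "test_mxfp_matmul[128-256-256-...]",
        # so match on the node ID substring.
        if "test_mxfp_matmul" in item.nodeid and "128-256-256" in item.nodeid:
            deselected.append(item)
        else:
            kept.append(item)
    if deselected:
        config.hook.pytest_deselected(items=deselected)
        items[:] = kept
```

Deselecting at collection time keeps the cases out of the run entirely, so they add no execution time, unlike xfail, which still runs each test to completion.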
Addresses #4062