
Conversation

@HBN-MichalSzy (Contributor) commented Oct 23, 2025

Addresses #4062

  • skip the same cases as CUDA
  • skip large block tests for PVC
  • add num_warps=8 for the empty kernels case (see the sketch after this list)
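
As a rough illustration only (not the actual diff), the num_warps=8 addition for the empty-kernel case could look like the sketch below; the test name and parameter list are assumptions.

    # Hypothetical sketch: extend the num_warps sweep so the empty-kernel case
    # also covers num_warps=8. Names are illustrative, not taken from the PR.
    import pytest
    import triton

    @pytest.mark.parametrize("num_warps", [1, 2, 4, 8])
    def test_empty_kernel(num_warps):

        @triton.jit
        def empty_kernel():
            pass

        # Launch a kernel with an empty body at the requested warp count.
        empty_kernel[(1, )](num_warps=num_warps)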

@HBN-MichalSzy HBN-MichalSzy marked this pull request as ready for review October 24, 2025 07:19
Comment on lines 1224 to 1226
    if (BLOCK_M, BLOCK_N, BLOCK_K) == (128, 256,
                                       256) and CONST_SCALE and "GPU Max 1100" in torch.xpu.get_device_name():
        pytest.skip("XPU Max 1100 does not fit in memory large block size for CONST_SCALE mxfp matmul")
Contributor:
To avoid being tied to the GPU name, gate on the device's shared memory instead:

    # check maximum shared memory
    if triton.runtime.driver.active.utils.get_device_properties(
            triton.runtime.driver.active.get_current_device())["max_shared_mem"] <= 196608:
        pytest.xfail("XPU: Not enough shared memory")

HBN-MichalSzy (Contributor Author):

done.

@HBN-MichalSzy HBN-MichalSzy merged commit eff6a02 into main Oct 27, 2025
23 checks passed
@HBN-MichalSzy HBN-MichalSzy deleted the dev/mxfp_matmul_tests branch October 27, 2025 09:58
@whitneywhtsang (Contributor) commented:

This PR lowers the pass rate from 93.9% to 85.36%; IMO the pytest.skip added in this PR can be changed to pytest.xfail.

@HBN-MichalSzy (Contributor Author) replied:

@whitneywhtsang xfail is not an option at this point, as it would increase the total execution time to many hours. Where is this pass rate drawn from? Perhaps we can filter out those newly skipped matmul tests until the execution time is fixed?
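
For readers unfamiliar with the distinction being debated: pytest.skip and an imperative pytest.xfail both abort a test at the call site, whereas a @pytest.mark.xfail marker lets the test body run to completion by default, which is where extra execution time would come from if the heavy matmul cases were left to run and fail. A generic illustration, not code from this PR:

    import pytest

    def run_expensive_matmul():
        # Stand-in for a heavy mxfp matmul case (hypothetical helper).
        pass

    def test_guarded_skip():
        # pytest.skip stops the test here; the expensive work below never runs.
        pytest.skip("not enough shared memory")
        run_expensive_matmul()

    def test_guarded_xfail():
        # Imperative pytest.xfail also stops at the call site and reports XFAIL.
        pytest.xfail("not enough shared memory")
        run_expensive_matmul()

    @pytest.mark.xfail(reason="not enough shared memory")
    def test_marked_xfail():
        # A marker-based xfail still executes the whole body by default,
        # so the expensive case actually runs and its failure is recorded as XFAIL.
        run_expensive_matmul()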
