@Rayfxl Rayfxl commented Dec 7, 2025

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these
    rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because it fixes a test bug.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

Description:

This commit fixes a test bug: the assertion on the generated PTX expected an instruction string that is no longer emitted. The assertion has been updated to reflect the changes introduced by PR #7409 in the Triton repository.

Test environment:
Hardware: AMD 8945HX CPU, NVIDIA RTX 5070Ti Laptop GPU.
Operating System: Ubuntu 20.04, running under WSL2.
Triton Version: latest commit on the main branch.

Change details:

The original assertion checked for the PTX string 'mma.sync.aligned.m16n8k16.row.col.f32.f16.f16.f32'.

After the changes in PR #7409, the PTX generated by the fp8e5 dot test should instead contain 'mma.sync.aligned.m16n8k32.row.col.f32.e5m2.e5m2.f32'.

The test was failing because it still expected the old PTX string, while the PTX generated for this operation changed with PR #7409.

With the updated assertion, the test checks for the new instruction and passes in the environment listed above. No other parts of the code are affected.

Why this change was necessary:

PR #7409 updated the lowering logic for the fp8 dot operation in Triton. The original assertion expected the old PTX instruction, which no longer matches what is generated. This commit updates the assertion to match the PTX produced after PR #7409.
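To illustrate the failure mode, here is a minimal sketch of the substring check the test performs. Only the two instruction strings come from this PR; the constant names and the register operands in `sample_ptx` are made up for illustration and are not real compiler output.

```python
# Instruction strings taken from the old and new assertions in this PR.
OLD_MMA = 'mma.sync.aligned.m16n8k16.row.col.f32.f16.f16.f32'
NEW_MMA = 'mma.sync.aligned.m16n8k32.row.col.f32.e5m2.e5m2.f32'

# Hypothetical PTX fragment: the operands are invented, only the opcode matters.
sample_ptx = NEW_MMA + " {%f0, %f1, %f2, %f3}, {%r0, %r1, %r2, %r3}, {%r4, %r5}, {%f0, %f1, %f2, %f3};"

# The test's check is a plain substring match against the PTX text.
old_matches = OLD_MMA in sample_ptx  # the old assertion would fail here
new_matches = NEW_MMA in sample_ptx  # the updated assertion would pass
```

Because the two strings differ in both the tile shape (k16 vs. k32) and the operand types (f16 vs. e5m2), neither is a substring of the other, so the old assertion cannot pass on PTX that only contains the new instruction.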

Additional notes:

This change only affects the test and does not impact other functionality.

@Rayfxl Rayfxl requested a review from ptillet as a code owner December 7, 2025 16:01
         assert 'wgmma.mma_async.sync.aligned.m64n128k32.f32.e5m2.e5m2' in ptx
     elif capability[0] >= 8 and M < 64:
-        assert 'mma.sync.aligned.m16n8k16.row.col.f32.f16.f16.f32' in ptx
+        assert 'mma.sync.aligned.m16n8k32.row.col.f32.e5m2.e5m2.f32' in ptx
Collaborator

how is that not going to break A100?

Author

I don't currently have access to an A100 for testing, but I believe this change would indeed break the test on A100: since A100 lacks native support for FP8 operands, the operands are promoted to FP16 there, so the generated PTX would still contain the older f16 instruction.
Perhaps a more compatible change would be to combine both strings in the assertion with an or, so that either the updated or the older PTX instruction is accepted, keeping the test compatible across platforms.
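
The fallback suggested above could be sketched roughly as follows. This is an illustration under assumptions, not the change made in this PR: the constant names and the helper `has_expected_fp8_dot_mma` are hypothetical, and only the two instruction strings come from the diff.

```python
# Instruction strings from the diff: FP16-promoted path and native FP8 path.
F16_MMA = 'mma.sync.aligned.m16n8k16.row.col.f32.f16.f16.f32'
E5M2_MMA = 'mma.sync.aligned.m16n8k32.row.col.f32.e5m2.e5m2.f32'


def has_expected_fp8_dot_mma(ptx: str) -> bool:
    """Return True if the PTX contains either accepted mma instruction.

    Hypothetical helper: accepts the native e5m2 instruction as well as the
    f16 instruction emitted when FP8 operands are promoted to FP16.
    """
    return E5M2_MMA in ptx or F16_MMA in ptx
```

In the test, the branch under `elif capability[0] >= 8 and M < 64:` would then read `assert has_expected_fp8_dot_mma(ptx)`. Accepting both strings is looser than asserting one string per architecture, so a regression on either platform could go unnoticed; selecting the expected string from the device capability would be stricter.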
