@whitneywhtsang whitneywhtsang commented Nov 4, 2025

This PR changes the Triton base from c172d53 to 00cf53f (Oct 23).
Pass rate: 94.55% -> 94.54%

matthias-springer and others added 10 commits October 22, 2025 13:37
This commit adds tests for `unrealized_conversion_cast` and makes the
visitor in the axis analysis more robust.

The input AxisInfo should not be propagated if that would inject an
AxisInfo with an incorrect rank. That could cause a crash.

`unrealized_conversion_cast` ops typically appear during a dialect
conversion, but they are also useful for debugging and rapid prototyping.
They allow programmers to hand-write the expected low-level IR and
connect it with high-level IR that will be lowered as usual. In such
a scenario, programmers write IR such as "Case 2" in the added test
case. Such IR currently crashes the axis analysis.

Also fix a crash when a function call with a multi-dimensional function
argument is analyzed. (This is now triggered due to the improved
`unrealized_conversion_cast` handling.)
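
The rank guard described in this commit message can be illustrated with a minimal Python sketch. This is not the actual Triton C++ implementation: the `AxisInfo` class, `pessimistic`, and `propagate_through_cast` below are hypothetical simplifications that only show the idea of refusing to forward axis information whose rank does not match the result.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AxisInfo:
    # Hypothetical simplification: one entry per tensor dimension.
    contiguity: Tuple[int, ...]
    divisibility: Tuple[int, ...]
    constancy: Tuple[int, ...]

    @property
    def rank(self) -> int:
        return len(self.contiguity)


def pessimistic(rank: int) -> AxisInfo:
    """Most conservative info for a value of the given rank (all ones)."""
    ones = (1,) * rank
    return AxisInfo(ones, ones, ones)


def propagate_through_cast(operand_info: AxisInfo, result_rank: int) -> AxisInfo:
    """Forward the operand's info only when the ranks agree.

    Forwarding info of a different rank would attach per-dimension data
    to a value with a different number of dimensions, which is the kind
    of inconsistency that can crash later analysis steps.
    """
    if operand_info.rank == result_rank:
        return operand_info
    return pessimistic(result_rank)
```

With this guard, a cast between values of equal rank keeps the operand's information, while a rank-changing cast falls back to the conservative default instead of injecting mismatched data.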

This PR adds device-side TMA support for Gluon. cc @ThomasRaoux

Use the correct condition value when other value exists.

We fix a number of cases where the constancy analysis could be improved.

The code is quite messy, and the whole pass could do with a full
rewrite, but we are not doing so ATM.

This PR was mostly vibecoded, with a cleaning pass afterwards from me.
…s. (#8512)

A few tensor descriptor tests use "cuda" as the device rather than the
'device' fixture in the test arguments. This PR changes those tests to
use the 'device' fixture instead so that third-party users without a
CUDA runtime can run these tests.
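
The change described above can be sketched as follows. This is a minimal illustration and not the actual test code: `allocate` is a hypothetical stand-in for a real tensor allocation, and the test function names are invented. In a pytest suite, the `device` argument would be injected by a fixture rather than passed by hand.

```python
def allocate(shape, device):
    """Hypothetical stand-in for allocating a tensor on `device`."""
    return {"shape": tuple(shape), "device": device}


def descriptor_test_hardcoded():
    # Before: the device literal is baked into the test body, so the
    # test cannot run on a machine without a CUDA runtime.
    return allocate((128, 64), "cuda")


def descriptor_test_with_fixture(device):
    # After: the test receives `device` as an argument, so the same
    # body runs unchanged on whichever backend the caller selects.
    return allocate((128, 64), device)


# Simulate the fixture supplying different backends.
for backend in ("cuda", "xpu", "cpu"):
    assert descriptor_test_with_fixture(backend)["device"] == backend
```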


# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a comment.

- [x] I have written a PR description following these
  [rules](https://cbea.ms/git-commit/#why-not-how).

- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.

- Select one of the following.
  - [ ] I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - [x] This PR does not need a test because it is editing the test file only.

- Select one of the following.
  - [x] I have not added any `lit` tests.
  - [ ] The `lit` tests I have added follow these [best
    practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
    including the "tests should be minimal" section. (Usually running Python
    code and using the instructions it generates is not minimal.)

Co-authored-by: Micah Weston <[email protected]>

Add shared memory capacity for `gfx1250`, which is 320 KB.

This PR fixes the layout and lowering for wmma scaled with a small k dim,
where the tensor's k dimension is smaller than a single wmma scaled
instruction's k dimension. Add corresponding lit tests for common cases.

Prevent a crash when lowering a memdesc of pointer.
…n (#8493)

TP > 1 is not supported in this mode
@whitneywhtsang whitneywhtsang self-assigned this Nov 4, 2025
@whitneywhtsang whitneywhtsang marked this pull request as ready for review November 4, 2025 04:19