@whitneywhtsang whitneywhtsang commented Dec 5, 2024

This PR changes the Triton base from a4f1854 to 390e27f (Dec 5).
Pass rate: 93.27% -> 93.31%

Please do not squash and merge this PR.

bingyizh233 and others added 9 commits December 4, 2024 13:50
…ision gemm (bf16 x s8) (#5337)

In the closed PR triton-lang/triton#4768, I wrote the Python and lit
test cases for Ampere small-tile-size mixed-precision GEMM (bf16 x s8).
Now that the compilation crash has been fixed by another PR, the test
cases can be added.

<!---
The core Triton is a small number of people, and we receive many PRs
(thank
you!).  To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the
following
tasks and include the filled-out checklist in your PR description.**

Complete the following tasks before sending your PR, and replace `[ ]`
with
`[x]` to indicate you have done them.
-->

# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a
comment.

- [x] I have written a PR description following these
  [rules](https://cbea.ms/git-commit/#why-not-how).

- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.

- Select one of the following.
  - [x] I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - [ ] This PR does not need a test because `FILL THIS IN`.

- Select one of the following.
  - [ ] I have not added any `lit` tests.
  - [ ] The `lit` tests I have added follow these [best
    practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices),
    including the "tests should be minimal" section. (Usually running Python
    code and using the instructions it generates is not minimal.)

---------

Co-authored-by: Christian Sigg <[email protected]>
Reverts triton-lang/triton#5308

This is causing functional regressions in some internal tests
Reverts triton-lang/triton#5281

Reverting this as well since the LLVM merge has been reverted.
Introduce `emitHardwareTuple` helper that emits the code to compute the
blockId, warpId, and laneId for a thread and returns them. This PR uses
this helper in a few places.
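
As a rough illustration of the arithmetic that usually sits behind such ids, here is a minimal Python sketch. It is not the helper itself: the real `emitHardwareTuple` emits compiler IR, while this only computes the values directly. The function name `hardware_tuple`, the linear thread id input, and the warp size of 32 are all assumptions made for the example.

```python
# Hypothetical sketch; names and the warp size are assumptions, not Triton's API.
WARP_SIZE = 32  # assumed warp width; other targets may use a different value


def hardware_tuple(block_id: int, thread_id: int, warp_size: int = WARP_SIZE):
    """Return (block_id, warp_id, lane_id) for a linear thread id within a block."""
    warp_id = thread_id // warp_size  # which warp the thread belongs to
    lane_id = thread_id % warp_size   # position of the thread inside its warp
    return block_id, warp_id, lane_id


print(hardware_tuple(3, 70))  # (3, 2, 6)
```

Returning all three values from one place keeps call sites from repeating the same division and remainder logic, which appears to be the motivation for the helper.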
Integrate code sequence made by @rawnhenry for efficient fp4 upcasting

Co-authored-by: Rawn Henry <[email protected]>
For `uint64_t`, the format unit `K` is used, which means `unsigned long long`
according to https://docs.python.org/3/c-api/arg.html#numbers. It seems
logical and correct to use the format unit `L` for `int64_t`, which means
the `long long` C type.
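
To make the signed versus unsigned distinction concrete, the following sketch uses `ctypes` to store the same Python integer in both C types. It does not call the argument-parsing API from the commit; it only shows how the two underlying C types treat a negative value, which is why the choice of type for `int64_t` matters.

```python
import ctypes

value = -1

# `long long` is the C type the commit associates with `L`.
as_signed = ctypes.c_longlong(value).value

# `unsigned long long` is the C type the commit associates with `K`;
# ctypes wraps the negative value around without raising an error.
as_unsigned = ctypes.c_ulonglong(value).value

print(as_signed)    # -1
print(as_unsigned)  # 18446744073709551615
```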

Signed-off-by: Anatoly Myachev <[email protected]>
@whitneywhtsang whitneywhtsang self-assigned this Dec 5, 2024
@whitneywhtsang whitneywhtsang marked this pull request as ready for review December 5, 2024 20:50
@whitneywhtsang whitneywhtsang merged commit 963ba2b into main Dec 5, 2024
5 checks passed
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/merge branch December 5, 2024 20:51
7 participants