Merge OpenAI Triton commit 78c8054
#2595
Conversation
…ly (#4958)

Change to improve platform independence. How it works:

On Windows:

```python
>>> import sysconfig
>>> sysconfig.get_config_var("EXT_SUFFIX")
'.cp310-win_amd64.pyd'
>>> sysconfig.get_config_var("EXT_SUFFIX").split(".")[-1]
'pyd'
```

On Linux:

```python
>>> import sysconfig
>>> sysconfig.get_config_var("EXT_SUFFIX")
'.cpython-310-x86_64-linux-gnu.so'
>>> sysconfig.get_config_var("EXT_SUFFIX").split(".")[-1]
'so'
```

Signed-off-by: Anatoly Myachev <[email protected]>
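For context, a minimal sketch of how a build script could use this (the helper name is hypothetical; the actual code changed by the PR may differ):

```python
import sysconfig

def native_ext_suffix() -> str:
    """Return the native extension suffix ('pyd' on Windows, 'so' on Linux)
    instead of hard-coding one of them."""
    return sysconfig.get_config_var("EXT_SUFFIX").split(".")[-1]

# e.g. building an output filename platform-independently
module_name = "libtriton"
output = f"{module_name}.{native_ext_suffix()}"
```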
Specifically, it fixes problems where `srcLayout` and `dstLayout` have a different number of registers but the same number of non-free registers. We solved the problem by padding free registers onto either `srcLayout` or `dstLayout`, but this could be improved by fixing the `invertAndCompose` function. A rough sketch of the padding idea follows.
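A minimal Python sketch of the padding workaround (not the Triton C++ code; names and data shapes are illustrative):

```python
def pad_free_registers(src_regs, dst_regs):
    """Append zero basis vectors ("free" registers, which map to nothing)
    to the shorter register basis so both layouts describe the same
    number of registers before invertAndCompose is applied."""
    zero = (0, 0)
    diff = len(dst_regs) - len(src_regs)
    if diff > 0:
        src_regs = src_regs + [zero] * diff
    elif diff < 0:
        dst_regs = dst_regs + [zero] * (-diff)
    return src_regs, dst_regs

# 3 src register bases vs. 4 dst register bases -> pad src with one free reg
src = [(0, 1), (0, 2), (0, 4)]
dst = [(0, 1), (0, 2), (0, 4), (0, 8)]
print(pad_free_registers(src, dst))
```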
This adds float16 to the list of dtypes tested in test_tensor_atomic_rmw. Note that the numerics were previously bad for this test when run in float16; this PR "fixes" the numerics by internally doing the sum in float32 (upcast, sum, downcast). Since the purpose is to test the atomic_rmw, and the numerical issues of doing sums in low-precision dtypes are generally known, I think this strategy should be fine for this test.
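The upcast-sum-downcast idea in isolation (a NumPy sketch with assumed values, separate from the actual test code):

```python
import numpy as np

x = np.full(4096, 0.1, dtype=np.float16)

# Accumulating directly in float16: once the running sum grows large,
# adding 0.1 rounds to no change, so the result stalls far below the truth.
naive = x.sum(dtype=np.float16)

# Upcast, sum in float32, downcast: the strategy described above.
stable = x.astype(np.float32).sum().astype(np.float16)

print(naive, stable)  # the float16 accumulator stalls; the float32 path stays accurate
```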
In the case of 16-bit float operands for tt::AtomicRMWOp, construct only one LLVM::AtomicRMWOp but use a vector of elements. This approach allows generating packed intrinsics and processing 2 elements at once. Added a lit test for the f16 vectorized case.
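To see what "packed" means here, a small Python illustration of fitting two f16 values in one 32-bit word, the granularity a packed f16x2 atomic updates at once (conceptual only, not the lowering code):

```python
import struct

def pack_two_f16(a: float, b: float) -> int:
    """Pack two IEEE half-precision values into a single 32-bit word."""
    return int.from_bytes(struct.pack("<ee", a, b), "little")

print(hex(pack_two_f16(1.0, 2.0)))  # 0x40003c00: 0x3c00 = 1.0, 0x4000 = 2.0
```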
lib/Analysis/Utility.cpp
Outdated
```diff
  // comp describes the layout function to create dst from src.
- LinearLayout comp = dstLayout->invertAndCompose(*srcLayout);
+ LinearLayout comp =
+     dstLayoutWithFreeRegs.invertAndCompose(srcLayoutWithFreeRegs);
```
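For intuition about what `comp` represents, here is a toy model of invert-and-compose (assumed semantics, sketched in plain Python; the real `LinearLayout` works over GF(2) bases):

```python
def invert_and_compose(dst, src):
    """Given dst: hw index -> element and src: hw index -> element,
    return comp: src index -> dst index, i.e. where each element that
    src places at index i lives under dst."""
    inv_dst = {elem: idx for idx, elem in dst.items()}
    return {i: inv_dst[elem] for i, elem in src.items()}

src = {0: "a", 1: "b", 2: "c", 3: "d"}
dst = {0: "b", 1: "a", 2: "d", 3: "c"}
print(invert_and_compose(dst, src))  # {0: 1, 1: 0, 2: 3, 3: 2}
```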
Hi @victor-eds! As far as I understand, our code is not ready to work with layouts of the same size. Could you suggest anything?
Some examples of layout representations before and after `resize`:

```
minimalCvtLayout
numSrcRegs: 8
numDstRegs: 16

srcLayout:
 - register=1 -> (0, 1)
   register=2 -> (0, 2)
   register=4 -> (0, 4)
 - lane=1 -> (0, 8)
   lane=2 -> (0, 16)
   lane=4 -> (0, 32)
   lane=8 -> (0, 0)
   lane=16 -> (0, 0)
 - warp=1 -> (0, 0)
   warp=2 -> (0, 0)
 - block is a size 1 dimension
where out dims are: [dim0 (size 1), dim1 (size 64)]

srcLayoutWithFreeRegs:
 - register=1 -> (0, 1)
   register=2 -> (0, 2)
   register=4 -> (0, 4)
   register=8 -> (0, 0)
 - lane=1 -> (0, 8)
   lane=2 -> (0, 16)
   lane=4 -> (0, 32)
   lane=8 -> (0, 0)
   lane=16 -> (0, 0)
 - warp=1 -> (0, 0)
   warp=2 -> (0, 0)
 - block is a size 1 dimension
where out dims are: [dim0 (size 1), dim1 (size 64)]

dstLayout:
 - register=1 -> (0, 1)
   register=2 -> (0, 2)
   register=4 -> (0, 4)
   register=8 -> (0, 8)
 - lane=1 -> (0, 16)
   lane=2 -> (0, 32)
   lane=4 -> (0, 0)
   lane=8 -> (0, 0)
   lane=16 -> (0, 0)
 - warp=1 -> (0, 0)
   warp=2 -> (0, 0)
 - block is a size 1 dimension
where out dims are: [dim0 (size 1), dim1 (size 64)]

dstLayoutWithFreeRegs:
 - register=1 -> (0, 1)
   register=2 -> (0, 2)
   register=4 -> (0, 4)
   register=8 -> (0, 8)
 - lane=1 -> (0, 16)
   lane=2 -> (0, 32)
   lane=4 -> (0, 0)
   lane=8 -> (0, 0)
   lane=16 -> (0, 0)
 - warp=1 -> (0, 0)
   warp=2 -> (0, 0)
 - block is a size 1 dimension
where out dims are: [dim0 (size 1), dim1 (size 64)]
```
I decided to revert it for now.
I'll create a new issue after this pull request is merged.
Force-pushed from 86db591 to c8d44f1
Force-pushed from c8d44f1 to 77f98f0
This PR changes the Triton base from 152ef2d to 78c8054 (Oct 27).
Pass rate: 99.84%

Please do not squash and merge this PR.