Commit 36fece6
[Windows] Fix '__builtin_clz' on windows (#3312)
Closes #3273
A recent upstream PR (triton-lang/triton#5621) introduced a new, faster
implementation of `pext_i32`, which is used in the `ReduceOpToLLVM` pattern.
The new implementation relies on the `__builtin_clz` intrinsic, which is
natively available in GCC and Clang but missing in MSVC. It seems
that the Windows version of this intrinsic was copied incorrectly from
[the given source](https://gist.github.com/pps83/3210a2f980fd02bb2ba2e5a1fc4a2ef0#file-ctz_clz-cpp-L44-L55):
it omits the final `r ^ 31`, which causes the `tt.reduce(...)`
lowering to produce incorrect LLVM IR in some scenarios.
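
For context, here is a minimal sketch of why the trailing `r ^ 31` matters. It is not the code from this commit: the helper names `highest_set_bit_index` and `clz32` are hypothetical, and the non-MSVC branch is a plain loop added only so the sketch compiles everywhere. MSVC's `_BitScanReverse` reports the index of the highest set bit, whereas `__builtin_clz` returns the number of leading zeros, so the index has to be converted.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper: index (0..31) of the highest set bit of x.
// On MSVC this is what _BitScanReverse reports; elsewhere a simple loop
// stands in for it. The result is meaningless for x == 0.
inline uint32_t highest_set_bit_index(uint32_t x) {
#if defined(_MSC_VER)
  unsigned long r = 0;
  _BitScanReverse(&r, x);
  return static_cast<uint32_t>(r);
#else
  uint32_t r = 0;
  while (x >>= 1)
    ++r;
  return r;
#endif
}

// Hypothetical clz replacement: converts the bit index into a count of
// leading zeros. For r in 0..31, r ^ 31 equals 31 - r.
inline uint32_t clz32(uint32_t x) {
  assert(x != 0 && "clz is undefined for zero, as with __builtin_clz");
  return highest_set_bit_index(x) ^ 31u;
}
```

Without the final XOR, such a replacement would return the bit index itself (for example 0 instead of 31 for an input of 1), which is the kind of mismatch described above.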
Signed-off-by: dchigarev <[email protected]>

1 parent 4a99671, commit 36fece6
1 file changed: 1 addition, 1 deletion (line 18 replaced)