Commit 6e86e11
[libc] Use rint builtin for rounding on the GPU (#98345)

Summary:
Previously, rounding on the GPU went through the generic bit-twiddling
implementation instead of using the dedicated GPU instruction. This patch
adds the builtin to the utility, mirroring the special-casing of the x64
and aarch64 targets. This results in much nicer code. The following
example shows the OpenCL device libs implementation on the left and the
LLVM libc on the right, https://godbolt.org/z/3ch48ccf5. The libc version
is "branchier", but the results seem similar.

1 parent 7e10ad9, commit 6e86e11
1 file changed, +12 -0 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 17-19 | 17-19 | unchanged context |
| | 20-31 | + (12 added lines) |
| 20-22 | 32-34 | unchanged context |
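The special-casing described in the summary can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the names `SKETCH_TARGET_IS_GPU`, `generic_round` and `round_current_mode` are hypothetical, and the generic fallback here simply calls `std::nearbyint` as a stand-in for the bit-twiddling implementation. It assumes Clang's predefined `__AMDGPU__` and `__NVPTX__` macros identify the GPU targets and that `__builtin_rint` is available.

```cpp
#include <cmath>

// Hypothetical target-detection macro for this sketch; the real libc
// utility uses its own configuration macros.
#if defined(__AMDGPU__) || defined(__NVPTX__)
#define SKETCH_TARGET_IS_GPU 1
#endif

// Stand-in for the generic software path (assumption: the real fallback
// manipulates the floating-point bits directly instead of calling libm).
static inline double generic_round(double x) {
  return std::nearbyint(x);
}

// Round to an integral value using the current rounding mode.
static inline double round_current_mode(double x) {
#if defined(SKETCH_TARGET_IS_GPU)
  // On GPU targets, let the compiler select the dedicated instruction.
  return __builtin_rint(x);
#else
  // Elsewhere, fall back to the generic path.
  return generic_round(x);
#endif
}
```

On a host build the fallback branch is compiled, so under the default round-to-nearest-even mode a call such as `round_current_mode(2.5)` rounds to the even neighbour.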