Commit 06789cc
authored
[libclc] Optimize ceil/fabs/floor/rint/trunc (#119596)
These functions all map to the corresponding LLVM intrinsics, but the
vector intrinsics weren't being generated. The intrinsic mapping from
CLC vector function to vector intrinsic was working correctly, but the
mapping from OpenCL builtin to CLC function was suboptimally recursively
splitting vectors in halves.
For example, with this change, `ceil(float16)` calls `llvm.ceil.v16f32`
directly once optimizations are applied.
Now also, instead of generating LLVM intrinsics through `__asm` we now
call clang elementwise builtins for each CLC builtin. This should be a
more standard way of achieving the same result
The CLC versions of each of these builtins are also now built and
enabled for SPIR-V targets. The LLVM -> SPIR-V translator maps the
intrinsics to the appropriate OpExtInst, so there should be no
difference in semantics, despite the newly introduced indirection from
OpenCL builtin through the CLC builtin to the intrinsic.
The AMDGPU targets make use of the same `_CLC_DEFINE_UNARY_BUILTIN`
macro to override `sqrt`, so those functions also appear more optimal
with this change, calling the vector `llvm.sqrt.vXf32` intrinsics
directly.1 parent 3d6b2d4 commit 06789cc
File tree
24 files changed
+92
-66
lines changed- libclc
- clc
- include/clc
- math
- lib
- clspv
- generic
- math
- spirv64
- spirv
- generic/lib/math
24 files changed
+92
-66
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
191 | 191 | | |
192 | 192 | | |
193 | 193 | | |
194 | | - | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
195 | 209 | | |
196 | 210 | | |
197 | 211 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
6 | | - | |
7 | | - | |
8 | | - | |
9 | | - | |
| 4 | + | |
10 | 5 | | |
11 | | - | |
12 | | - | |
13 | 6 | | |
14 | | - | |
15 | | - | |
| 7 | + | |
16 | 8 | | |
17 | | - | |
| 9 | + | |
| 10 | + | |
18 | 11 | | |
19 | 12 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
6 | | - | |
7 | | - | |
8 | | - | |
9 | | - | |
| 4 | + | |
10 | 5 | | |
11 | | - | |
12 | | - | |
13 | 6 | | |
14 | | - | |
15 | | - | |
| 7 | + | |
16 | 8 | | |
17 | | - | |
| 9 | + | |
| 10 | + | |
18 | 11 | | |
19 | 12 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
6 | | - | |
7 | | - | |
8 | | - | |
9 | | - | |
| 4 | + | |
10 | 5 | | |
11 | | - | |
12 | | - | |
13 | 6 | | |
14 | | - | |
15 | | - | |
| 7 | + | |
16 | 8 | | |
17 | | - | |
| 9 | + | |
| 10 | + | |
18 | 11 | | |
19 | 12 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
6 | | - | |
7 | | - | |
8 | | - | |
9 | | - | |
| 4 | + | |
10 | 5 | | |
11 | | - | |
12 | | - | |
13 | 6 | | |
14 | | - | |
15 | | - | |
| 7 | + | |
16 | 8 | | |
17 | | - | |
| 9 | + | |
| 10 | + | |
18 | 11 | | |
19 | 12 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
5 | | - | |
6 | | - | |
7 | | - | |
8 | | - | |
9 | | - | |
| 4 | + | |
10 | 5 | | |
11 | | - | |
12 | | - | |
13 | 6 | | |
14 | | - | |
15 | | - | |
| 7 | + | |
16 | 8 | | |
17 | | - | |
| 9 | + | |
| 10 | + | |
18 | 11 | | |
19 | 12 | | |
File renamed without changes.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
This file was deleted.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
4 | 9 | | |
5 | 10 | | |
6 | 11 | | |
| |||
0 commit comments