Commit ec548a5

jhuber6 authored and aaryanshukla committed
[libc] Use rint builtin for rounding on the GPU (llvm#98345)
Summary: Previously, rounding on the GPU went through the generic bit-twiddling implementation instead of using the dedicated GPU instruction. This patch adds a GPU special case to the utility, mirroring the existing special-casing of the x86_64 and AArch64 targets, which results in much nicer code. The following example shows the OpenCL device libs implementation on the left and the LLVM libc on the right: https://godbolt.org/z/3ch48ccf5. The libc version is "branchier", but the results seem similar.
1 parent 063a102 commit ec548a5

1 file changed: +12 -0 lines changed


libc/src/__support/FPUtil/nearest_integer.h

Lines changed: 12 additions & 0 deletions
@@ -17,6 +17,18 @@
 #include "x86_64/nearest_integer.h"
 #elif defined(LIBC_TARGET_ARCH_IS_AARCH64)
 #include "aarch64/nearest_integer.h"
+#elif defined(LIBC_TARGET_ARCH_IS_GPU)
+
+namespace LIBC_NAMESPACE {
+namespace fputil {
+
+LIBC_INLINE float nearest_integer(float x) { return __builtin_rintf(x); }
+
+LIBC_INLINE double nearest_integer(double x) { return __builtin_rint(x); }
+
+} // namespace fputil
+} // namespace LIBC_NAMESPACE
+
 #else
 
 namespace LIBC_NAMESPACE {
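
To illustrate the trade-off the summary describes, here is a minimal standalone sketch that contrasts a software rounding routine with the builtin path. The namespace `sketch` and the function `nearest_integer_soft` are hypothetical names introduced for illustration. The software routine uses the common "add and subtract 2^52" technique and is not claimed to be the generic implementation in LLVM libc. It assumes the default round-to-nearest-even mode and a build without `-ffast-math`. The two builtin overloads mirror the ones added in the diff, minus the `LIBC_INLINE` macro.

```cpp
#include <cmath>

namespace sketch { // hypothetical namespace, not part of LLVM libc

constexpr double kTwo52 = 4503599627370496.0; // 2^52

// Software fallback (illustrative only): at magnitude 2^52 a double has no
// fractional bits, so adding and then subtracting 2^52 rounds x to an
// integer in the current rounding mode.
// Assumes round-to-nearest-even and no -ffast-math.
inline double nearest_integer_soft(double x) {
  if (!(std::fabs(x) < kTwo52))
    return x; // already integral, infinite, or NaN
  double r = x < 0.0 ? (x - kTwo52) + kTwo52 : (x + kTwo52) - kTwo52;
  return std::copysign(r, x); // preserve the sign of zero, e.g. -0.25 -> -0.0
}

// Builtin path, mirroring the two overloads added in the diff. The compiler
// is free to lower these to a single rounding instruction where one exists.
inline float nearest_integer(float x) { return __builtin_rintf(x); }
inline double nearest_integer(double x) { return __builtin_rint(x); }

} // namespace sketch
```

Under the default rounding mode both paths round halfway cases to the even neighbour, so `sketch::nearest_integer(2.5)` and `sketch::nearest_integer_soft(2.5)` both give `2.0`, while `3.5` gives `4.0`. The software version needs a range check and a sign-dependent branch, which is the kind of overhead a dedicated instruction avoids.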
