Commit 6e86e11

[libc] Use rint builtin for rounding on the GPU (#98345)
Summary: Previously, rounding to the nearest integer on the GPU went through the generic bit-twiddling implementation instead of using the dedicated GPU instruction. This patch adds a GPU case to the nearest_integer utility that calls the rint builtins, mirroring the existing special-casing of the x86_64 and AArch64 targets. This results in much nicer code. The following example shows the OpenCL device libs implementation on the left and the LLVM libc implementation on the right: https://godbolt.org/z/3ch48ccf5. The libc version is "branchier", but the results seem similar.
1 parent 7e10ad9 commit 6e86e11

1 file changed, +12 -0 lines changed

libc/src/__support/FPUtil/nearest_integer.h

Lines changed: 12 additions & 0 deletions
@@ -17,6 +17,18 @@
 #include "x86_64/nearest_integer.h"
 #elif defined(LIBC_TARGET_ARCH_IS_AARCH64)
 #include "aarch64/nearest_integer.h"
+#elif defined(LIBC_TARGET_ARCH_IS_GPU)
+
+namespace LIBC_NAMESPACE {
+namespace fputil {
+
+LIBC_INLINE float nearest_integer(float x) { return __builtin_rintf(x); }
+
+LIBC_INLINE double nearest_integer(double x) { return __builtin_rint(x); }
+
+} // namespace fputil
+} // namespace LIBC_NAMESPACE
+
 #else
 
 namespace LIBC_NAMESPACE {
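
For readers who want to see what the two added overloads compute, here is a minimal host-side sketch. It is not part of the patch. The namespace name sketch and the helper self_check are hypothetical and stand in for LIBC_NAMESPACE::fputil, and the checks assume a GCC or Clang host compiler with the floating-point environment left in its default round-to-nearest mode, where the rint builtins resolve ties to the even integer.

```cpp
#include <cassert>

namespace sketch { // hypothetical; stands in for LIBC_NAMESPACE::fputil

// Same bodies as the overloads added in the GPU branch of the diff.
inline float nearest_integer(float x) { return __builtin_rintf(x); }
inline double nearest_integer(double x) { return __builtin_rint(x); }

// Spot checks that hold in the default round-to-nearest mode.
inline void self_check() {
  assert(nearest_integer(0.5) == 0.0);   // tie, rounds to even (0)
  assert(nearest_integer(1.5) == 2.0);   // tie, rounds to even (2)
  assert(nearest_integer(2.5) == 2.0);   // tie, rounds to even (2)
  assert(nearest_integer(-2.5) == -2.0); // ties are symmetric in sign
  assert(nearest_integer(2.6f) == 3.0f); // float overload, no tie
}

} // namespace sketch
```

Calling sketch::self_check() from a test harness exercises both overloads. On a GPU target the expectation stated in the commit message is that the builtins lower to the dedicated rounding instruction rather than to the generic bit-manipulation path; the sketch only checks the numerical results, not the generated code.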
