Commit 5223317

[libclc] Add generic native half implementation of __clc_normalize (#150165)
This is ported from https://github.com/intel/llvm/blob/sycl/libclc/libspirv/lib/generic/geometric/normalize.cl and passes the closed-source OpenCL CTS test "test_geometrics geom_normalize --half CL_DEVICE_TYPE_GPU" on an Intel GPU. Running llvm-diff on amdgcn--amdhsa.bc shows that the fpext/fptrunc instructions are now removed from the normalize function.
1 parent bcd0d97 commit 5223317

1 file changed, +8 -9 lines changed

libclc/clc/lib/generic/geometric/clc_normalize.inc

Lines changed: 8 additions & 9 deletions
@@ -10,15 +10,8 @@
 #if (__CLC_VECSIZE_OR_1 == 1 || __CLC_VECSIZE_OR_1 == 2 || \
      __CLC_VECSIZE_OR_1 == 3 || __CLC_VECSIZE_OR_1 == 4)
 
-// Until we have a native FP16 implementation, go via FP32
-#if __CLC_FPSIZE == 16
-
-_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_normalize(__CLC_GENTYPE p) {
-  return __CLC_CONVERT_GENTYPE(__clc_normalize(__CLC_CONVERT_FLOATN(p)));
-}
-
 // Scalar normalize
-#elif defined(__CLC_SCALAR)
+#if defined(__CLC_SCALAR)
 
 _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_normalize(__CLC_GENTYPE p) {
   return __clc_sign(p);
@@ -27,7 +20,13 @@ _CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_normalize(__CLC_GENTYPE p) {
 // Vector normalize
 #else
 
-#if __CLC_FPSIZE == 32
+#if __CLC_FPSIZE == 16
+
+#define MIN_VAL HALF_MIN
+#define MAX_SQRT 0x1.0p+8h
+#define MIN_SQRT 0x1.0p-8h
+
+#elif __CLC_FPSIZE == 32
 
 #define MIN_VAL FLT_MIN
 #define MAX_SQRT 0x1.0p+86F
