Commit a271d07

wenju-he and Copilot authored
[libclc] Implement erf/erfc vector function with loop since scalar function is large (#157055)
This PR reduces amdgcn--amdhsa.bc size by 1.8% and nvptx64--nvidiacl.bc size by 4%. The loop trip count is constant, so the backend can decide whether to unroll.

---------

Co-authored-by: Copilot <[email protected]>
1 parent 28d9255 commit a271d07

3 files changed: +30 -2 lines changed

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <clc/utils.h>
+
+#if __CLC_VECSIZE_OR_1 >= 2
+
+#ifndef __CLC_IMPL_FUNCTION
+#define __CLC_IMPL_FUNCTION __CLC_FUNCTION
+#endif
+
+_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __CLC_FUNCTION(__CLC_GENTYPE x) {
+  union {
+    __CLC_GENTYPE vec;
+    __CLC_SCALAR_GENTYPE arr[__CLC_VECSIZE_OR_1];
+  } u_x, u_result;
+  u_x.vec = x;
+  for (int i = 0; i < __CLC_VECSIZE_OR_1; ++i)
+    u_result.arr[i] = __CLC_IMPL_FUNCTION(u_x.arr[i]);
+  return u_result.vec;
+}
+
+#endif // __CLC_VECSIZE_OR_1 >= 2
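
For readers unfamiliar with the libclc macro layer, the loop-based wrapper in the new file can be approximated in plain C. This is only a rough sketch, not the libclc code itself: `float4_t`, `VECSIZE`, `scalar_impl`, and `vector_impl_loop` are hypothetical names, the vector type relies on the GCC/Clang `vector_size` extension rather than OpenCL C vector types, and a trivial polynomial stands in for the large scalar `__clc_erf` / `__clc_erfc` bodies.

```c
/* Hypothetical stand-in for an OpenCL float4, built with the GCC/Clang
 * vector_size extension (16 bytes = 4 floats). */
typedef float float4_t __attribute__((vector_size(16)));

/* Assumed lane count; plays the role of __CLC_VECSIZE_OR_1. */
enum { VECSIZE = 4 };

/* Hypothetical scalar routine; plays the role of __CLC_IMPL_FUNCTION. */
static float scalar_impl(float x) { return x * x + 1.0f; }

/* Vector overload written as a loop over lanes. The source contains a
 * single call to the scalar routine, and the trip count is a compile-time
 * constant, so the compiler remains free to unroll the loop or not. */
float4_t vector_impl_loop(float4_t x) {
  union {
    float4_t vec;
    float arr[VECSIZE];
  } u_x, u_result;
  u_x.vec = x; /* reinterpret the vector as an array of lanes */
  for (int i = 0; i < VECSIZE; ++i)
    u_result.arr[i] = scalar_impl(u_x.arr[i]);
  return u_result.vec;
}
```

Calling `vector_impl_loop` on the lanes `{0, 1, 2, 3}` applies `scalar_impl` to each lane in turn. The union is what lets the loop index individual lanes without spelling out one call per component.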

libclc/clc/lib/generic/math/clc_erf.cl

Lines changed: 1 addition & 1 deletion
@@ -507,5 +507,5 @@ _CLC_OVERLOAD _CLC_DEF half __clc_erf(half x) {
 #endif
 
 #define __CLC_FUNCTION __clc_erf
-#define __CLC_BODY <clc/shared/unary_def_scalarize.inc>
+#define __CLC_BODY <clc/shared/unary_def_scalarize_loop.inc>
 #include <clc/math/gentype.inc>

libclc/clc/lib/generic/math/clc_erfc.cl

Lines changed: 1 addition & 1 deletion
@@ -518,5 +518,5 @@ _CLC_OVERLOAD _CLC_DEF half __clc_erfc(half x) {
 #endif
 
 #define __CLC_FUNCTION __clc_erfc
-#define __CLC_BODY <clc/shared/unary_def_scalarize.inc>
+#define __CLC_BODY <clc/shared/unary_def_scalarize_loop.inc>
 #include <clc/math/gentype.inc>
