Skip to content

Conversation

@hvdijk
Copy link
Contributor

@hvdijk hvdijk commented Jun 25, 2025

For a kernel such as

kernel void foo(__global double3 *z) {
  double3 x = {0.6631661088,0.6612268107,0.1513627528};
  int3 y = {-1980459213,-660855407,615708204};
  *z = pown(x, y);
}

we were not storing anything to z, because the implementation of pown relied on an floating-point-to-integer conversion where the floating-point value was outside of the integer's range. Although in LLVM IR we permit that operation so long as we end up ignoring its result -- that is the general rule for poison -- one thing we are not permitted to do is have conditional branches that depend on it, and through the call to __clc_ldexp, we did have that.

To fix this, rather than changing expv at the end to INFINITY/0, we can change v at the start to values that we know will produce INFINITY/0 without performing such out-of-range conversions.

Tested with

clang --target=nvptx64 -S -O3 -o - test.cl \
  -Xclang -mlink-builtin-bitcode \
  -Xclang runtimes/runtimes-bins/libclc/nvptx64--.bc

A grep showed that this exact same code existed in three more places, so I changed it there too, though I did not do a broader search for other similar code that potentially has the same problem.

For a kernel such as

  kernel void foo(__global double3 *z) {
    double3 x = {0.6631661088,0.6612268107,0.1513627528};
    int3 y = {-1980459213,-660855407,615708204};
    *z = pown(x, y);
  }

we were not storing anything to z, because the implementation of pown
relied on an floating-point-to-integer conversion where the
floating-point value was outside of the integer's range. Although in
LLVM IR we permit that operation so long as we end up ignoring its
result -- that is the general rule for poison -- one thing we are not
permitted to do is have conditional branches that depend on it, and
through the call to __clc_ldexp, we did have that.

To fix this, rather than changing expv at the end to INFINITY/0, we can
change v at the start to values that we know will produce INFINITY/0
without performing such out-of-range conversions.

Tested with

  clang --target=nvptx64 -S -O3 -o - test.cl \
    -Xclang -mlink-builtin-bitcode \
    -Xclang runtimes/runtimes-bins/libclc/nvptx64--.bc

A grep showed that this exact same code existed in three more places, so
I changed it there too, though I did not do a broader search for other
similar code that potentially has the same problem.
@hvdijk hvdijk requested a review from frasercrmck June 25, 2025 13:22
Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hvdijk hvdijk merged commit 46ee7f1 into llvm:main Jun 25, 2025
8 checks passed
@hvdijk hvdijk deleted the libclc-no-out-of-range-conv branch June 25, 2025 15:37
anthonyhatran pushed a commit to anthonyhatran/llvm-project that referenced this pull request Jun 26, 2025
For a kernel such as

    kernel void foo(__global double3 *z) {
      double3 x = {0.6631661088,0.6612268107,0.1513627528};
      int3 y = {-1980459213,-660855407,615708204};
      *z = pown(x, y);
    }

we were not storing anything to z, because the implementation of pown
relied on an floating-point-to-integer conversion where the
floating-point value was outside of the integer's range. Although in
LLVM IR we permit that operation so long as we end up ignoring its
result -- that is the general rule for poison -- one thing we are not
permitted to do is have conditional branches that depend on it, and
through the call to __clc_ldexp, we did have that.

To fix this, rather than changing expv at the end to INFINITY/0, we can
change v at the start to values that we know will produce INFINITY/0
without performing such out-of-range conversions.

Tested with

    clang --target=nvptx64 -S -O3 -o - test.cl \
      -Xclang -mlink-builtin-bitcode \
      -Xclang runtimes/runtimes-bins/libclc/nvptx64--.bc

A grep showed that this exact same code existed in three more places, so
I changed it there too, though I did not do a broader search for other
similar code that potentially has the same problem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants