Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix
__localargument pointer generation in pocl Vortex deviceWhen I tried to run OpenCL kernels with
__localmemory arguments, I found that the generated argument pointers were incorrect. This Pull Request fixes two bugs in the LLVM IR generation for__localkernel arguments.Bug 1:
ArgOffsetnot updated before reading per-argument offsetWhen processing the first
__localargument,ArgOffsetwas captured before reading__local_sizefrom the argument buffer. After__local_sizeis read,arg_offsetadvances by 4 bytes, but the cachedArgOffsetconstant is never updated. As a result, the per-argument offset field is read from the same position as__local_size, returning the total local memory size instead of the actual offset (which should be 0 for the first argument).Bug 2:
I8PtrTyused instead ofI8Tyin GEPThe GEP instruction that applies the per-argument offset to the local memory base pointer used
I8PtrTy(pointer type, 8 bytes on 64-bit) as the element type instead ofI8Ty(1 byte). This caused the offset to be scaled by 8, placing the pointer far past the correct location in the allocated local memory block.Effect of each bug (local_size = 4,
B - Aexpected = 4)B - AresultI also added a test which verifies the correct pointer layout of two
__localarguments by checking thatB - A == local_work_size. The test successfully passes with this fix and fails under all three buggy configurations.