Skip to content
This repository was archived by the owner on Jan 26, 2024. It is now read-only.

Conversation

@ldrumm
Copy link

@ldrumm ldrumm commented Oct 2, 2023

kernels_[blitType] yields a null function pointer in KernelBlitManager::initHeap due to KernelBlitManager::createProgram not initializing all the kernels because of broken layout invariants.

KernBlitManager::NumBlitKernels has two possible return values: BlitTotal, and BlitLinearTotal which are sentinels. These sentinels are used in KernelBlitManager::createProgram to initialize the kernels_ array. It correctly initializes [0, NumBlitKernels()), but InitHeap is > NumBlitKernels, so the InitHeap kernel is not loaded.

Thus, when images are disabled, and a kernel has an hidden_heap_v1 entry, the InitHeap blitkernel is not loaded, and kernelBlitManager::initHeap subsequently gets a null pointer. The kernel always has such hidden_heap_v1 descriptor entry at -O0, but I believe it's possible (unconfirmed) for this situation to occur in other circumstances.

The fix is simply to ensure the InitHeap enumeration has a numeric value less than BlitLinearTotal.

n.b. This bug does not exist on the 5.7 release, but it looks like that's by chance.

`kernels_[blitType]` yields a null function pointer in
`KernelBlitManager::initHeap` due to `KernelBlitManager::createProgram`
not initializing all the kernels because of broken layout invariants.

`KernBlitManager::NumBlitKernels` has two possible return values:
`BlitTotal, and BlitLinearTotal` which are sentinels. These sentinels
are used in `KernelBlitManager::createProgram` to initialize the
`kernels_` array. It correctly initializes `[0, NumBlitKernels())`, but
`InitHeap` is > `NumBlitKernels`, so the `InitHeap` kernel is not loaded.

Thus, when images are disabled, and a kernel has an `hidden_heap_v1`
entry, the `InitHeap` blitkernel is not loaded, and
`kernelBlitManager::initHeap` subsequently gets a null pointer. The
kernel always has such `hidden_heap_v1` descriptor entry at -O0, but I
believe it's possible (unconfirmed) for this situation to occur in other
circumstances.

The fix is simply to ensure the `InitHeap` enumeration has a numeric
value *less than* `BlitLinearTotal`.

n.b. This bug does not exist on the 5.7 release, but it looks like
that's by chance.
@ldrumm
Copy link
Author

ldrumm commented Oct 3, 2023

Wrong repo.
New patch: ROCm/clr#13

@ldrumm ldrumm closed this Oct 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant