[AMDGPU] Restrict promote alloca on pointers across address spaces #119762
Changes from all commits
770364b
02e37c6
edf48ab
d52e0a7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -198,3 +198,33 @@ entry: | |
| %tmp = load ptr addrspace(3), ptr addrspace(5) %alloca, align 8 | ||
| ret ptr addrspace(3) %tmp | ||
| } | ||
|
|
||
| ; Will not vectorize because we're saving a 64-bit pointer from addrspace 0 | ||
| ; into two 32-bit pointers of addrspace 5. | ||
| ; CHECK-LABEL: define void @alloca_load_store_ptr_mixed_addrspace_ptrvec | ||
| ; CHECK-NEXT: entry: | ||
| ; CHECK-NEXT: [[ALLOCA:%.*]] = alloca <2 x ptr addrspace(5)>, align 8, addrspace(5) | ||
| ; CHECK-NEXT: store ptr undef, ptr addrspace(5) [[ALLOCA]], align 8 | ||
| ; CHECK-NEXT: ret void | ||
| ; | ||
| define void @alloca_load_store_ptr_mixed_addrspace_ptrvec() { | ||
| entry: | ||
| %A2 = alloca <2 x ptr addrspace(5)>, align 8, addrspace(5) | ||
| store ptr undef, ptr addrspace(5) %A2, align 8 | ||
|
Contributor
Should also test with a real value instead of an undef. A test with a constant leaf would be useful too.
||
| ret void | ||
| } | ||
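The variants requested in the comment above might look like the following sketch. The function names are hypothetical, and `ptr null` is only one reading of "constant leaf". CHECK lines are omitted because the expected output depends on the final form of the patch.

```llvm
; Hypothetical variant: store a real (non-undef) flat pointer argument
; into an alloca of two private pointers.
define void @alloca_store_real_ptr_mixed_addrspace(ptr %p) {
entry:
  %alloca = alloca <2 x ptr addrspace(5)>, align 8, addrspace(5)
  store ptr %p, ptr addrspace(5) %alloca, align 8
  ret void
}

; Hypothetical variant: store a constant flat pointer instead of undef.
define void @alloca_store_const_ptr_mixed_addrspace() {
entry:
  %alloca = alloca <2 x ptr addrspace(5)>, align 8, addrspace(5)
  store ptr null, ptr addrspace(5) %alloca, align 8
  ret void
}
```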
|
|
||
| ; Will not vectorize because we're saving four 32-bit pointers from addrspace 5 | ||
| ; into two 64-bit pointers of addrspace 0, even though the size in memory | ||
| ; is the same. | ||
| ; CHECK-LABEL: define void @alloca_load_store_ptr_mixed_addrspace_ptrvec2 | ||
| ; CHECK-NEXT: entry: | ||
| ; CHECK-NEXT: [[ALLOCA:%.*]] = alloca <2 x ptr>, align 8 | ||
| ; CHECK-NEXT: store <4 x ptr addrspace(5)> undef, ptr [[ALLOCA]], align 8 | ||
| ; CHECK-NEXT: ret void | ||
| define void @alloca_load_store_ptr_mixed_addrspace_ptrvec2() { | ||
| entry: | ||
| %A2 = alloca <2 x ptr>, align 8 | ||
| store <4 x ptr addrspace(5)> undef, ptr %A2, align 8 | ||
| ret void | ||
| } | ||
|
Contributor
Should reproduce the same situation with one as a scalar, and some with int/fp types.
Contributor
Author
Sure
Contributor
Author
Can you please be specific here on what you mean by scalar?
Contributor
i32 and double
||
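A minimal sketch of the scalar and int/fp cases discussed in this thread, assuming "one as a scalar" means that either the alloca type or the stored type is a plain `i32`-based vector or a `double`. The function names are hypothetical, and CHECK lines are left out because the expected result depends on the final patch.

```llvm
; Hypothetical: a scalar double stored into a vector of two i32
; (both are 64 bits wide).
define void @alloca_store_double_into_v2i32(double %d) {
entry:
  %alloca = alloca <2 x i32>, align 8, addrspace(5)
  store double %d, ptr addrspace(5) %alloca, align 8
  ret void
}

; Hypothetical: the reverse, a vector of two i32 stored into a
; scalar double alloca.
define void @alloca_store_v2i32_into_double(<2 x i32> %v) {
entry:
  %alloca = alloca double, align 8, addrspace(5)
  store <2 x i32> %v, ptr addrspace(5) %alloca, align 8
  ret void
}
```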
There is no copy across address spaces here, this check is conceptually wrong. You only need to verify the size is compatible. For the final code emission, you'll need to insert no-op casts to get the types to match
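One possible reading of "no-op casts" is a chain of same-width reinterpretations that moves the bytes of a value into the promoted vector type without changing them. The sketch below is an illustration only, not the code the pass emits. It assumes a flat `ptr` is 64 bits and a `ptr addrspace(5)` is 32 bits, as the test comments in this diff state, and the function name is hypothetical.

```llvm
; Reinterpret the 64 bits of a flat pointer as two 32-bit private pointers.
; Each cast preserves the bit pattern under the size assumptions above.
define <2 x ptr addrspace(5)> @flat_ptr_bits_as_private_vec(ptr %p) {
entry:
  %int = ptrtoint ptr %p to i64                                 ; pointer bits as an integer
  %vec = bitcast i64 %int to <2 x i32>                          ; split into two 32-bit lanes
  %ptrs = inttoptr <2 x i32> %vec to <2 x ptr addrspace(5)>     ; retype each lane
  ret <2 x ptr addrspace(5)> %ptrs
}
```

The resulting lanes only carry the original bytes. They are not meaningful private addresses, which is the point made later in this thread about reloading a value as the wrong type.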
I think the existing code here is correct, and you would only need adjustment at the code transformation later
Let me explore the no-op casts.
So <2 x ptr addrspace(5)> can be stored into ptr addrspace(0)? Isn't this undefined behavior?
It may be illegal type punning but you don't have the context. It's just bytes
Thanks Matt. I will try a few ways to avoid this casting assert.
Another quick question: at which point in the optimization pipeline do we decide this is illegal type punning leading to undefined behavior?
It's not a property of the pass pipeline, it's a property of the system as a whole. The address space cast from addrspace(5) to 0 is not a no-op cast. If you reload it and use it as the wrong type, it will be an invalid pointer
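The distinction can be illustrated with two small functions. This is a sketch with hypothetical function names: the first converts a private pointer to a flat pointer with an explicit `addrspacecast`, and the second reloads the raw bytes of two private pointers as a single flat pointer.

```llvm
; Explicit conversion: the target is free to change the pointer's bits
; so that the flat pointer refers to the same private memory.
define ptr @private_to_flat(ptr addrspace(5) %p) {
entry:
  %flat = addrspacecast ptr addrspace(5) %p to ptr
  ret ptr %flat
}

; Type punning through memory: the IR is well formed and the sizes match,
; but the loaded value is just the stored bytes viewed as a flat pointer.
define ptr @reload_private_vec_as_flat(<2 x ptr addrspace(5)> %v) {
entry:
  %slot = alloca <2 x ptr addrspace(5)>, align 8, addrspace(5)
  store <2 x ptr addrspace(5)> %v, ptr addrspace(5) %slot, align 8
  %flat = load ptr, ptr addrspace(5) %slot, align 8
  ret ptr %flat
}
```

Both functions are valid IR. The second only becomes a problem if the reloaded value is actually dereferenced as a flat pointer, which matches the comment above that this is a property of how the value is used rather than of any particular pass.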