[AMDGPU] Add target feature for waits before system scope stores. NFC. #164993
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,22 +1,50 @@ | ||
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 | ||
|
Contributor
Author
@petar-avramovic this test, added in #82996, is strange because it never actually showed any waits being inserted.
Collaborator
I think the waits are optimized out in the final output of the .ll test; the proper test where waits are inserted is the MIR version of the same test, llvm/test/CodeGen/AMDGPU/wait-before-stores-with-scope_sys.mir |
||
| ; RUN: llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1200 < %s | FileCheck -check-prefix=GFX12 %s | ||
| ; RUN: llc -global-isel=1 -new-reg-bank-select -mtriple=amdgcn -mcpu=gfx1200 < %s | FileCheck -check-prefix=GFX12 %s | ||
| ; RUN: llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1200 < %s | FileCheck -check-prefix=GFX1200 %s | ||
| ; RUN: llc -global-isel=1 -new-reg-bank-select -mtriple=amdgcn -mcpu=gfx1200 < %s | FileCheck -check-prefix=GFX1200 %s | ||
| ; RUN: llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1250 < %s | FileCheck -check-prefix=GFX1250-SDAG %s | ||
| ; RUN: llc -global-isel=1 -new-reg-bank-select -mtriple=amdgcn -mcpu=gfx1250 < %s | FileCheck -check-prefix=GFX1250-GISEL %s | ||
|
|
||
| define amdgpu_ps void @intrinsic_store_system_scope(i32 %val, <4 x i32> inreg %rsrc, i32 %vindex, i32 %voffset, i32 inreg %soffset) { | ||
| ; GFX12-LABEL: intrinsic_store_system_scope: | ||
| ; GFX12: ; %bb.0: | ||
| ; GFX12-NEXT: buffer_store_b32 v0, v[1:2], s[0:3], s4 idxen offen scope:SCOPE_SYS | ||
| ; GFX12-NEXT: s_endpgm | ||
| ; GFX1200-LABEL: intrinsic_store_system_scope: | ||
| ; GFX1200: ; %bb.0: | ||
| ; GFX1200-NEXT: buffer_store_b32 v0, v[1:2], s[0:3], s4 idxen offen scope:SCOPE_SYS | ||
| ; GFX1200-NEXT: s_endpgm | ||
| ; | ||
| ; GFX1250-SDAG-LABEL: intrinsic_store_system_scope: | ||
| ; GFX1250-SDAG: ; %bb.0: | ||
| ; GFX1250-SDAG-NEXT: v_dual_mov_b32 v3, v2 :: v_dual_mov_b32 v2, v1 | ||
| ; GFX1250-SDAG-NEXT: buffer_store_b32 v0, v[2:3], s[0:3], s4 idxen offen scope:SCOPE_SYS | ||
| ; GFX1250-SDAG-NEXT: s_endpgm | ||
| ; | ||
| ; GFX1250-GISEL-LABEL: intrinsic_store_system_scope: | ||
| ; GFX1250-GISEL: ; %bb.0: | ||
| ; GFX1250-GISEL-NEXT: v_dual_mov_b32 v4, v1 :: v_dual_mov_b32 v5, v2 | ||
| ; GFX1250-GISEL-NEXT: buffer_store_b32 v0, v[4:5], s[0:3], s4 idxen offen scope:SCOPE_SYS | ||
| ; GFX1250-GISEL-NEXT: s_endpgm | ||
| call void @llvm.amdgcn.struct.buffer.store.i32(i32 %val, <4 x i32> %rsrc, i32 %vindex, i32 %voffset, i32 %soffset, i32 24) | ||
| ret void | ||
| } | ||
|
|
||
| define amdgpu_ps void @generic_store_volatile(i32 %val, ptr addrspace(1) %out) { | ||
| ; GFX12-LABEL: generic_store_volatile: | ||
| ; GFX12: ; %bb.0: | ||
| ; GFX12-NEXT: global_store_b32 v[1:2], v0, off scope:SCOPE_SYS | ||
| ; GFX12-NEXT: s_wait_storecnt 0x0 | ||
| ; GFX12-NEXT: s_endpgm | ||
| ; GFX1200-LABEL: generic_store_volatile: | ||
| ; GFX1200: ; %bb.0: | ||
| ; GFX1200-NEXT: global_store_b32 v[1:2], v0, off scope:SCOPE_SYS | ||
| ; GFX1200-NEXT: s_wait_storecnt 0x0 | ||
| ; GFX1200-NEXT: s_endpgm | ||
| ; | ||
| ; GFX1250-SDAG-LABEL: generic_store_volatile: | ||
| ; GFX1250-SDAG: ; %bb.0: | ||
| ; GFX1250-SDAG-NEXT: v_dual_mov_b32 v3, v2 :: v_dual_mov_b32 v2, v1 | ||
| ; GFX1250-SDAG-NEXT: global_store_b32 v[2:3], v0, off scope:SCOPE_SYS | ||
| ; GFX1250-SDAG-NEXT: s_wait_storecnt 0x0 | ||
| ; GFX1250-SDAG-NEXT: s_endpgm | ||
| ; | ||
| ; GFX1250-GISEL-LABEL: generic_store_volatile: | ||
| ; GFX1250-GISEL: ; %bb.0: | ||
| ; GFX1250-GISEL-NEXT: v_dual_mov_b32 v4, v1 :: v_dual_mov_b32 v5, v2 | ||
| ; GFX1250-GISEL-NEXT: global_store_b32 v[4:5], v0, off scope:SCOPE_SYS | ||
| ; GFX1250-GISEL-NEXT: s_wait_storecnt 0x0 | ||
| ; GFX1250-GISEL-NEXT: s_endpgm | ||
| store volatile i32 %val, ptr addrspace(1) %out | ||
| ret void | ||
| } | ||
nit: this would be easier to read with one argument per line.
This was clang-format's doing.
Honestly, I don't think clang-format works well with TableGen files. Most of the time I feel the result is very ugly after formatting. Lol.
Fair enough. I reformatted it manually more in the style of the other SubtargetFeatures.
I think we need to turn off clang-format for TableGen. I don't think I've ever seen it make a good suggestion.