-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[CodeGen] Register-coalescer remat fix subreg liveness #165662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
64bbd4b
54060b3
74bdb76
4df78bb
8e64e71
6ef4f65
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 6 | ||
| # RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=register-coalescer -verify-coalescing -o - %s | FileCheck %s | ||
|
|
||
| # This test is to check fix for failure with "Bad machine code: Defining instruction does not modify register" due to corrupt lane mask. | ||
|
|
||
| --- | ||
| name: reg_coalescer_subreg_liveness | ||
| tracksRegLiveness: true | ||
| liveins: | ||
| body: | | ||
| ; CHECK-LABEL: name: reg_coalescer_subreg_liveness | ||
| ; CHECK: bb.0: | ||
| ; CHECK-NEXT: successors: %bb.1(0x80000000) | ||
| ; CHECK-NEXT: liveins: $sgpr4_sgpr5 | ||
| ; CHECK-NEXT: {{ $}} | ||
| ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr_64(p4) = COPY $sgpr4_sgpr5 | ||
| ; CHECK-NEXT: undef [[S_LOAD_DWORD_IMM:%[0-9]+]].sub2:sgpr_128 = S_LOAD_DWORD_IMM [[COPY]](p4), 0, 0 :: (dereferenceable invariant load (s32), align 16, addrspace 4) | ||
| ; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]].sub1:sgpr_128 = S_LOAD_DWORD_IMM [[COPY]](p4), 24, 0 :: (dereferenceable invariant load (s32), align 8, addrspace 4) | ||
| ; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]].sub0:sgpr_128 = S_MOV_B32 1 | ||
| ; CHECK-NEXT: undef [[S_MOV_B32_:%[0-9]+]].sub0:sgpr_256 = S_MOV_B32 0 | ||
| ; CHECK-NEXT: {{ $}} | ||
| ; CHECK-NEXT: bb.1: | ||
| ; CHECK-NEXT: successors: %bb.2(0x80000000) | ||
| ; CHECK-NEXT: {{ $}} | ||
| ; CHECK-NEXT: S_NOP 0, implicit [[S_LOAD_DWORD_IMM]], implicit [[S_MOV_B32_]] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This looks a little suspicious: Previously, this operand was %0 which had the following ultimate definitions: Now, the [[S_LOAD_DWORD_IMM]] has the following ultimate defs: The .sub2 def has switched from undef to S_LOAD_DWORD_IMM
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey Jeff, this behavior looks right. The modification to [[S_LOAD_DWORD_IMM]] (the COPY into .sub2) only affected a sub-register that corresponded to an undefined sub-register in the original %0. The compiler still correctly preserves the necessary live values, as sub2 was an undef. The original issue had all sublanes defined. The main issue, this PR is trying to fix is when the copy got rematerialized with the move, the lane masks were not getting updated correctly. I can modify the mir with all sublanes with some define if this mir seems confusing.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey -- I agree that impact that it has had on the behavior of this test is fine -- as you say, it is just changing the bits in a (previously) undef subreg. My concern was that it was suggestive of a deeper issue -- maybe some earlier logic was using the stale lanemasks. I looked a bit at this and it doesn't seem to be related to this remat handling. That said, it still seems a bit fishy to me -- we are unnecessaily doing that S_LOAD and are changing the liveness state of the subregs at the use, presumably, though, the subregs won't change behavior as the were originally undef. Maybe something to look into separately.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be correct, but we still do not want to introduce new definitions for values that were previously undef. It's de-optimizing to introduce a new dependency
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I agree we don't want to introduce new defs in previously undefs. Yes, this is a deeper issue that this patch is not addressing. I will modify the mir to show the issue it is addressing.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, I updated the test mir with the actual case where I was seeing the bug.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's the difference? Can you include both versions?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The first commit had S_NOP (with implicit to just verify liveness), which was a reduced case. I thought that was not a clear test case. In the actual bug, there was a tensor_load instruction that took a sgpr128 with all four sublanes.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should I include that as another test function?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have added the other test as well, Both are similar in terms of the bug,the second is just a synthetic test case I wrote. The first one is a reduced test case from the bug. This commit in both cases doesn't change the outcome of the IR but just the liveness info which was corrupted prior to this commit. |
||
| ; CHECK-NEXT: S_BRANCH %bb.2 | ||
| ; CHECK-NEXT: {{ $}} | ||
| ; CHECK-NEXT: bb.2: | ||
| ; CHECK-NEXT: S_ENDPGM 0 | ||
| bb.0: | ||
| liveins: $sgpr4_sgpr5 | ||
| %5:sgpr_64(p4) = COPY killed $sgpr4_sgpr5 | ||
VigneshwarJ marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| %8:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %5(p4), 0, 0 :: (dereferenceable invariant load (s32) , align 16, addrspace 4) | ||
| %10:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM killed %5(p4), 24, 0 :: (dereferenceable invariant load (s32) , align 8, addrspace 4) | ||
| %7:sreg_32 = S_MOV_B32 1 | ||
| undef %20.sub0:sgpr_128 = COPY %7 | ||
| %0:sgpr_128 = COPY %20 | ||
| %0.sub1:sgpr_128 = COPY killed %10 | ||
| %27:sgpr_128 = COPY %0 | ||
VigneshwarJ marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| %27.sub2:sgpr_128 = COPY killed %8 | ||
| %29:sreg_32 = S_MOV_B32 0 | ||
| undef %30.sub0:sgpr_256 = COPY %29 | ||
| %37:sreg_32 = COPY %7 | ||
| bb.1: | ||
| %1:sreg_32 = COPY killed %37 | ||
| undef %33.sub0:sgpr_128 = COPY %7 | ||
| %33.sub1:sgpr_128 = COPY killed %1 | ||
| S_NOP 0, implicit %0, implicit %30 | ||
| S_BRANCH %bb.2 | ||
| bb.2: | ||
| S_ENDPGM 0 | ||
| ... | ||
Uh oh!
There was an error while loading. Please reload this page.