Description
When the register coalescer merges several small registers into one larger register, some subregisters of the result end up undefined on some paths.
short.mir.txt
short.ll.txt
llc -march=amdgcn -mcpu=gfx1010 -verify-machineinstrs short.ll -o -
llc -march=amdgcn -mcpu=gfx1010 -run-pass=register-coalescer -start-before=register-coalescer -verify-machineinstrs short.mir -o -
llc -march=amdgcn -mcpu=gfx1010 -run-pass=register-coalescer -debug short.mir -o -
bb.0:
successors: %bb.2(0x50000000), %bb.1(0x30000000)
liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4
%0:sgpr_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
%1:sreg_32 = COPY $sgpr4
S_CMP_EQ_U32 killed %1, 0, implicit-def $scc
S_CBRANCH_SCC0 %bb.2, implicit killed $scc
bb.1:
successors: %bb.3(0x80000000)
%2:vgpr_32 = IMPLICIT_DEF
%3:vgpr_32 = IMPLICIT_DEF
%4:vgpr_32 = IMPLICIT_DEF
%5:vgpr_32 = IMPLICIT_DEF
S_BRANCH %bb.3
bb.2:
successors: %bb.3(0x80000000)
%6:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
%7:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN killed %6, killed %0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
%2:vgpr_32 = COPY %7.sub0
%3:vgpr_32 = COPY %7.sub1
%4:vgpr_32 = COPY %7.sub2
%5:vgpr_32 = COPY killed %7.sub3
bb.3:
EXP_DONE 0, killed %2, killed %3, killed %4, killed %5, -1, 0, 15, implicit $exec
S_ENDPGM 0
$ llc -march=amdgcn -mcpu=gfx1010 -o - short.mir -run-pass=register-coalescer -debug
In short.mir, the Register Coalescer tries to combine the registers %2:vgpr_32, %3:vgpr_32, %4:vgpr_32 and %5:vgpr_32 with %7:vreg_128.
In bb.1, the instructions:
%2:vgpr_32 = IMPLICIT_DEF
%3:vgpr_32 = IMPLICIT_DEF
%4:vgpr_32 = IMPLICIT_DEF
%5:vgpr_32 = IMPLICIT_DEF
are reduced to a single instruction. %2 is merged into %7.sub0, and the first instruction is updated to:
undef %7.sub0:vreg_128 = IMPLICIT_DEF
but when the other registers (%3, %4, %5) are merged, their IMPLICIT_DEFs are deleted, leaving:
bb.0:
successors: %bb.2(0x50000000), %bb.1(0x30000000)
liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4
%0:sgpr_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
%1:sreg_32 = COPY $sgpr4
S_CMP_EQ_U32 %1, 0, implicit-def $scc
S_CBRANCH_SCC0 %bb.2, implicit killed $scc
bb.1:
successors: %bb.3(0x80000000)
undef %7.sub0:vreg_128 = IMPLICIT_DEF
S_BRANCH %bb.3
bb.2:
successors: %bb.3(0x80000000)
%6:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
%7:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN %6, %0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
bb.3:
EXP_DONE 0, %7.sub0, %7.sub1, %7.sub2, %7.sub3, -1, 0, 15, implicit $exec
S_ENDPGM 0
On the path bb.0 -> bb.1 -> bb.3, %7.sub1, %7.sub2 and %7.sub3 are undefined. This does not cause a Machine Verifier failure (yet).
Tail Duplication then duplicates EXP_DONE from bb.3 into bb.2 and, more importantly, into bb.1:
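To make the per-path claim concrete, here is a minimal Python sketch. It is not LLVM code: the `BLOCKS` table and the helper names are hypothetical, and it only records which lanes of %7 each block writes in the coalesced MIR above, then reports the lanes still unwritten when each path reaches bb.3.

```python
# Hypothetical model of the coalesced function: for each block, the lanes of
# %7 it defines and its successors. Not derived from any LLVM data structure.
LANES = ("sub0", "sub1", "sub2", "sub3")

BLOCKS = {
    "bb.0": {"defs": set(), "succs": ["bb.2", "bb.1"]},
    "bb.1": {"defs": {"sub0"}, "succs": ["bb.3"]},     # undef %7.sub0 = IMPLICIT_DEF
    "bb.2": {"defs": set(LANES), "succs": ["bb.3"]},   # BUFFER_LOAD writes all of %7
    "bb.3": {"defs": set(), "succs": []},              # EXP_DONE reads all four lanes
}


def paths(block, target, prefix=()):
    """Yield every path from `block` to `target` (the CFG here is acyclic)."""
    prefix = prefix + (block,)
    if block == target:
        yield prefix
        return
    for succ in BLOCKS[block]["succs"]:
        yield from paths(succ, target, prefix)


def undefined_lanes_at(target, used=LANES):
    """Map each path from bb.0 to `target` to the used lanes it never defines."""
    result = {}
    for path in paths("bb.0", target):
        defined = set()
        for block in path[:-1]:  # definitions made before reaching the target
            defined |= BLOCKS[block]["defs"]
        result[" -> ".join(path)] = sorted(set(used) - defined)
    return result


if __name__ == "__main__":
    for path, missing in undefined_lanes_at("bb.3").items():
        print(path, "undefined:", missing or "none")
```

In this model only the path through bb.1 reaches EXP_DONE with lanes missing, which matches the observation that the problem is path-dependent.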
# After Tail Duplication
# Machine code for function test: NoPHIs, TracksLiveness, NoVRegs, TracksDebugUserValues
bb.0:
successors: %bb.2(0x50000000), %bb.1(0x30000000); %bb.2(62.50%), %bb.1(37.50%)
liveins: $sgpr4, $sgpr0_sgpr1_sgpr2_sgpr3
S_CMP_EQ_U32 killed renamable $sgpr4, 0, implicit-def $scc
S_CBRANCH_SCC0 %bb.2, implicit killed $scc
bb.1:
; predecessors: %bb.0
renamable $vgpr0 = IMPLICIT_DEF
EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
S_ENDPGM 0
bb.2:
; predecessors: %bb.0
liveins: $sgpr0_sgpr1_sgpr2_sgpr3
renamable $vgpr0 = V_MOV_B32_e32 0, implicit $exec
renamable $vgpr0_vgpr1_vgpr2_vgpr3 = BUFFER_LOAD_FORMAT_XYZW_IDXEN killed renamable $vgpr0, killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
S_ENDPGM 0
# End machine code for function test.
This causes the machine verifier to fail for $vgpr1, $vgpr2 and $vgpr3:
*** Bad machine code: Using an undefined physical register ***
- function: test
- basic block: %bb.1 (0x569b4a93c240)
- instruction: EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
- operand 2: renamable $vgpr1
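The failure can be approximated with a much simpler check than the one the real Machine Verifier performs. The Python sketch below is only an illustration under stated assumptions: the `UNITS` table, the instruction tuples and `undefined_uses` are hypothetical, kill flags are ignored, and the implicit $exec operand is left out. It walks a block in order and flags any physical register that is read before it is live-in or defined.

```python
# Hypothetical expansion of register tuples into their 32-bit parts.
UNITS = {
    "$vgpr0_vgpr1_vgpr2_vgpr3": ["$vgpr0", "$vgpr1", "$vgpr2", "$vgpr3"],
    "$sgpr0_sgpr1_sgpr2_sgpr3": ["$sgpr0", "$sgpr1", "$sgpr2", "$sgpr3"],
}


def units(reg):
    """Return the 32-bit parts of `reg`, or `reg` itself if it is not a tuple."""
    return UNITS.get(reg, [reg])


def undefined_uses(liveins, instructions):
    """Return (opcode, register) pairs for reads of registers that are not live."""
    live = {u for reg in liveins for u in units(reg)}
    problems = []
    for opcode, defs, uses in instructions:
        for reg in uses:  # reads happen before this instruction's writes
            for u in units(reg):
                if u not in live:
                    problems.append((opcode, u))
        for reg in defs:
            live.update(units(reg))
    return problems


# Blocks after Tail Duplication, written as (opcode, defs, uses).
BB1 = ([], [
    ("IMPLICIT_DEF", ["$vgpr0"], []),
    ("EXP_DONE", [], ["$vgpr0", "$vgpr1", "$vgpr2", "$vgpr3"]),
])
BB2 = (["$sgpr0_sgpr1_sgpr2_sgpr3"], [
    ("V_MOV_B32_e32", ["$vgpr0"], []),
    ("BUFFER_LOAD_FORMAT_XYZW_IDXEN", ["$vgpr0_vgpr1_vgpr2_vgpr3"],
     ["$vgpr0", "$sgpr0_sgpr1_sgpr2_sgpr3"]),
    ("EXP_DONE", [], ["$vgpr0", "$vgpr1", "$vgpr2", "$vgpr3"]),
])

if __name__ == "__main__":
    print("bb.1:", undefined_uses(*BB1))
    print("bb.2:", undefined_uses(*BB2))
```

Under these assumptions bb.2 is clean, while bb.1 reads $vgpr1, $vgpr2 and $vgpr3 in EXP_DONE without any prior definition, the same three registers named above.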