Machine verifier failure because of register-coalescer - involving sub-registers #98474

@petar-avramovic


When the register coalescer merges several small registers into one wide register, some sub-registers of the wide register end up undefined on some paths.
short.mir.txt
short.ll.txt

llc -march=amdgcn -mcpu=gfx1010 -verify-machineinstrs short.ll -o -
llc -march=amdgcn -mcpu=gfx1010 -run-pass=register-coalescer -start-before=register-coalescer -verify-machineinstrs short.mir -o -

llc -march=amdgcn -mcpu=gfx1010 -run-pass=register-coalescer -debug short.mir -o -

      bb.0:
        successors: %bb.2(0x50000000), %bb.1(0x30000000)
        liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4

        %0:sgpr_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
        %1:sreg_32 = COPY $sgpr4
        S_CMP_EQ_U32 killed %1, 0, implicit-def $scc
        S_CBRANCH_SCC0 %bb.2, implicit killed $scc

      bb.1:
        successors: %bb.3(0x80000000)

        %2:vgpr_32 = IMPLICIT_DEF
        %3:vgpr_32 = IMPLICIT_DEF
        %4:vgpr_32 = IMPLICIT_DEF
        %5:vgpr_32 = IMPLICIT_DEF
        S_BRANCH %bb.3

      bb.2:
        successors: %bb.3(0x80000000)

        %6:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
        %7:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN killed %6, killed %0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
        %2:vgpr_32 = COPY %7.sub0
        %3:vgpr_32 = COPY %7.sub1
        %4:vgpr_32 = COPY %7.sub2
        %5:vgpr_32 = COPY killed %7.sub3

      bb.3:
        EXP_DONE 0, killed %2, killed %3, killed %4, killed %5, -1, 0, 15, implicit $exec
        S_ENDPGM 0

$ llc -march=amdgcn -mcpu=gfx1010 -o - short.mir -run-pass=register-coalescer -debug

In short.mir, the register coalescer tries to coalesce %2:vgpr_32, %3:vgpr_32, %4:vgpr_32 and %5:vgpr_32 into %7:vreg_128.
In bb.1, the instructions

%2:vgpr_32 = IMPLICIT_DEF
%3:vgpr_32 = IMPLICIT_DEF
%4:vgpr_32 = IMPLICIT_DEF
%5:vgpr_32 = IMPLICIT_DEF

are reduced to a single instruction. %2 is merged into %7.sub0 first, and its IMPLICIT_DEF is updated to:

undef %7.sub0:vreg_128 = IMPLICIT_DEF

but when %3, %4 and %5 are merged afterwards, their IMPLICIT_DEFs are deleted. The function after coalescing:

      bb.0:
        successors: %bb.2(0x50000000), %bb.1(0x30000000)
        liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4
      
        %0:sgpr_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
        %1:sreg_32 = COPY $sgpr4
        S_CMP_EQ_U32 %1, 0, implicit-def $scc
        S_CBRANCH_SCC0 %bb.2, implicit killed $scc
      
      bb.1:
        successors: %bb.3(0x80000000)
      
        undef %7.sub0:vreg_128 = IMPLICIT_DEF
        S_BRANCH %bb.3
      
      bb.2:
        successors: %bb.3(0x80000000)
      
        %6:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
        %7:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN %6, %0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
      
      bb.3:
        EXP_DONE 0, %7.sub0, %7.sub1, %7.sub2, %7.sub3, -1, 0, 15, implicit $exec
        S_ENDPGM 0

On the path bb.0 -> bb.1 -> bb.3, %7.sub1, %7.sub2 and %7.sub3 are undefined. This does not cause a machine verifier failure (yet).
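
To make the per-path reasoning concrete, here is a minimal Python sketch of the coalesced CFG. It is only a toy model: the block names and lane names are taken from the MIR above, while the dictionaries and helper functions are hypothetical and do not correspond to any LLVM API. It enumerates the paths from bb.0 to bb.3 and lists which lanes of %7 read by EXP_DONE have no definition on each path.

```python
# Toy model of the function after register-coalescer (not LLVM code).
# Successor lists and lane definitions are transcribed from the MIR above.
SUCCESSORS = {
    "bb.0": ["bb.2", "bb.1"],
    "bb.1": ["bb.3"],
    "bb.2": ["bb.3"],
    "bb.3": [],
}

# Lanes of %7 that each block defines.
DEFS = {
    "bb.0": set(),
    "bb.1": {"sub0"},                          # undef %7.sub0 = IMPLICIT_DEF
    "bb.2": {"sub0", "sub1", "sub2", "sub3"},  # full def by BUFFER_LOAD_FORMAT_XYZW_IDXEN
    "bb.3": set(),
}

# Lanes of %7 read by EXP_DONE in bb.3.
USES = {"bb.3": ["sub0", "sub1", "sub2", "sub3"]}


def paths(block, target, prefix=()):
    """Yield every path from block to target (the CFG here is acyclic)."""
    prefix = prefix + (block,)
    if block == target:
        yield prefix
        return
    for succ in SUCCESSORS[block]:
        yield from paths(succ, target, prefix)


def undefined_lanes(path):
    """Return the lanes used in the last block that no earlier block on the path defines."""
    defined = set()
    for block in path[:-1]:
        defined |= DEFS[block]
    return [lane for lane in USES[path[-1]] if lane not in defined]


report = {" -> ".join(p): undefined_lanes(p) for p in paths("bb.0", "bb.3")}
for path, lanes in report.items():
    print(path, "undefined lanes:", lanes)
```

In this model the path through bb.2 has no undefined lanes, while the path through bb.1 reports sub1, sub2 and sub3, matching the observation above.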
Tail Duplication later moves EXP_DONE from bb.3 into bb.2 and, more importantly, into bb.1:

    # After Tail Duplication
    # Machine code for function test: NoPHIs, TracksLiveness, NoVRegs, TracksDebugUserValues

    bb.0:
      successors: %bb.2(0x50000000), %bb.1(0x30000000); %bb.2(62.50%), %bb.1(37.50%)
      liveins: $sgpr4, $sgpr0_sgpr1_sgpr2_sgpr3
      S_CMP_EQ_U32 killed renamable $sgpr4, 0, implicit-def $scc
      S_CBRANCH_SCC0 %bb.2, implicit killed $scc

    bb.1:
    ; predecessors: %bb.0

      renamable $vgpr0 = IMPLICIT_DEF
      EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
      S_ENDPGM 0

    bb.2:
    ; predecessors: %bb.0
      liveins: $sgpr0_sgpr1_sgpr2_sgpr3
      renamable $vgpr0 = V_MOV_B32_e32 0, implicit $exec
      renamable $vgpr0_vgpr1_vgpr2_vgpr3 = BUFFER_LOAD_FORMAT_XYZW_IDXEN killed renamable $vgpr0, killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (<4 x s32>), align 1, addrspace 8)
      EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
      S_ENDPGM 0

    # End machine code for function test.

This causes the machine verifier to fail for $vgpr1, $vgpr2 and $vgpr3:

*** Bad machine code: Using an undefined physical register ***

  • function: test
  • basic block: %bb.1 (0x569b4a93c240)
  • instruction: EXP_DONE 0, killed renamable $vgpr0, renamable $vgpr1, renamable $vgpr2, renamable $vgpr3, -1, 0, 15, implicit $exec
  • operand 2: renamable $vgpr1
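
The failing check can be approximated with a short Python sketch. This is a simplified model and not the MachineVerifier implementation: it walks bb.1 as it looks after Tail Duplication, tracks which physical registers have been defined so far, and flags any use of a register that is not yet live. The instruction tuples are transcribed from the dump above; the function name and the `RESERVED` set are illustrative, and the sketch assumes that `$exec` is a reserved register that is exempt from the check.

```python
# Simplified liveness walk over one basic block (not the real MachineVerifier).
# Assumption: reserved registers such as $exec may be read without a prior def.
RESERVED = {"$exec"}

# (opcode, defs, uses) for bb.1 after Tail Duplication, in program order.
BB1 = [
    ("IMPLICIT_DEF", ["$vgpr0"], []),
    ("EXP_DONE", [], ["$vgpr0", "$vgpr1", "$vgpr2", "$vgpr3", "$exec"]),
    ("S_ENDPGM", [], []),
]


def undefined_uses(instructions, live_ins=()):
    """Return (opcode, register) pairs for reads of registers that are not live."""
    live = set(live_ins)
    errors = []
    for opcode, defs, uses in instructions:
        for reg in uses:
            if reg not in live and reg not in RESERVED:
                errors.append((opcode, reg))
        live.update(defs)
    return errors


errors = undefined_uses(BB1)  # bb.1 lists no live-ins in the dump
for opcode, reg in errors:
    print(f"Using an undefined physical register: {reg} in {opcode}")
```

Under these assumptions the walk flags `$vgpr1`, `$vgpr2` and `$vgpr3` in EXP_DONE, the same three registers named above. Only `$vgpr0` has a definition in bb.1, because its IMPLICIT_DEF was the only one the coalescer kept.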
