[TableGen][GISel] Learn to import patterns with physreg defs #120343

s-barannikov · 2024-12-18T01:27:49Z

This change reduces the number of skipped patterns as follows:

AArch64:   8548 ->  8547  (-1)
AMDGPU:   11245 -> 11204  (-41)
BPF:        197 ->   194  (-3)
M68k:       840 ->   838  (-2)
X86:      25857 -> 25856  (-1)

It appears that some patterns were already importable, but the generated code was invalid (the result was discarded).
This used to happen with "standalone" patterns in which the source DAG has more results that the destination DAG, and the destination DAG's root instruction has implicit defs. The most common case is that the source DAG produces one result and the root instruction has no explicit defs (e.g., setcc patterns).
This doesn't error out because we assume that PatternToMatch::getDstRegs() returns non-empty set if any physical register defs are involved (and issue an error if this is the case), but the set is always empty for standalone patterns.

While this patch correctly imports patterns with implicit defs, backends don't seem to be prepared to deal with them.
If, for example, the result of a generic instruction is copied into a physical register, we may end up with something like this:

DEF_MI implicit-def $some_reg1
%vreg = COPY $some_reg1
$some_reg2 = COPY %vreg

The issue with this code is summarized here. In short, we are not constraining %vreg to a proper register class, which causes verification failure "VReg has no regclass after selection".
This can be observed on X86/GlobalISel/mul-scalar.ll test, which I "fixed" by disabling importing the pattern.

github-actions · 2024-12-21T05:31:02Z

✅ With the latest revision this PR passed the C/C++ code formatter.

s-barannikov · 2024-12-21T12:37:20Z

I'd like some help with "VReg has no regclass after selection" issue, I'm not sure how to fix it.

Insert ConstrainOperandToRegClass to an arbitrary register class? Probably wrong.
Insert ConstrainSelectedInstOperands for the virtreg <-- physreg COPY in the hope that backends will constrain the virtual register? They currently don't.
Insert some new ConstrainXXX designed specifically for virtreg <-- physreg copies (and add a new virtual method that target will need to implement)?
Fix the linked FIXME somehow? Probably needs a target hook one way or another, and I don't see a way of how to plug it in.
Let it be for now? Hopefully :)

arsenm · 2024-12-23T07:20:42Z

I'd expect an arbitrary copy from a physical register to be copied to/from a virtual register with class getMinimalPhysRegClass (or whatever the equivalent is in CodeGenRegisters. getRegClassForRegister?).

ConstrainSelectedInstOperands won't work because that just uses the classes from the instruction definition, where generic pseudos like COPY do not have any available register class information. These need manual selection

s-barannikov · 2024-12-23T07:50:04Z

I'd expect an arbitrary copy from a physical register to be copied to/from a virtual register with class getMinimalPhysRegClass (or whatever the equivalent is in CodeGenRegisters. getRegClassForRegister?).

That's what I thought, too. It appears that getMinimalPhysRegClass may return an unallocatable register class and we aren't allowed to create virtual registers with unallocatable classes. This is the case on my target and I believe ARM would suffer from this too as its "condition codes" register class is unallocatable as well. getRegClassForRegister looks better, but it returns a register class that falls under category "some arbitrary register class".

In SelectionDAG, the choice of the class is done by InstrEmitter, and the logic is rather complicated. In the simplest case it just calls TLI->getRegClassFor(), which returns the class registered by a call to addRegisterClass() in target's TargetLowering constructor. At TableGen level, we don't know what register classes are going to be legal for a value type, so we probably need to emit something to do a runtime check. Maybe something like GIR_Constrain[PhysReg]Copy that would call a target hook?

ConstrainSelectedInstOperands won't work because that just uses the classes from the instruction definition, where generic pseudos like COPY do not have any available register class information. These need manual selection

I had to figure it out the hard way :(

arsenm · 2024-12-23T07:54:14Z

That's what I thought, too. It appears that getMinimalPhysRegClass may return an unallocatable register class and we aren't allowed to create virtual registers with unallocatable classes.

You can find the smallest allocatable class with with getAllocatableClass(getMinimalPhysRegClass()), and there's presumably a CodeGenRegisters equivalent too

we don't know what register classes are going to be legal for a value type, so we probably need to emit something to do a runtime check.

No, there should be no need for a runtime or type check. The type based class logic in InstrEmitter is one of the issues with SelectionDAG we're trying to avoid.

arsenm · 2024-12-23T07:56:48Z

You can find the smallest allocatable class with with getAllocatableClass(getMinimalPhysRegClass()), and there's presumably a CodeGenRegisters equivalent too

I think this is iterate through getRegisterClass(RegRecord)->getSuperClasses() until you find an allocatable one

s-barannikov · 2024-12-23T07:57:52Z

I'll give it a try, thanks.

s-barannikov · 2024-12-23T10:47:33Z

So I tried.

The register class containing the register is the only register class containing this register, and it is not allocatable. That is, there are no sub- or super-classes containing this register to iterate over.

I seem to need a "cross copy" register class. There is a TargetRegisterInfo::getCrossCopyRegClass, which I thought may be lifted to TableGen level, but some targets that override it (AMDGPU in particular) choose a register class based on dynamic values such as subtarget features.

This may not be a big issue, we could still have a notion of "cross copy" register class at TableGen level, which could be used in cases like this.

The current idea is to first try getMinimalPhysRegClass and resort to cross-copy register class if it fails to find an allocatable class. Does this sound like the way to go or should I look for another solution?

arsenm · 2024-12-26T13:48:14Z

The register class containing the register is the only register class containing this register, and it is not allocatable. That is, there are no sub- or super-classes containing this register to iterate over.

I seem to need a "cross copy" register class. There is a TargetRegisterInfo::getCrossCopyRegClass, which I thought may be lifted to TableGen level, but some targets that override it (AMDGPU in particular) choose a register class based on dynamic values such as subtarget features.

Both of the AMDGPU cases look like hacks to me. In particular the SCC_CLASS handling does not do what we would want to do in globalisel. (I should probably try removing this but will likely never get around to it)

The current idea is to first try getMinimalPhysRegClass and resort to cross-copy register class if it fails to find an allocatable class. Does this sound like the way to go or should I look for another solution?

Some kind of fallback process makes sense to me. In tablegen getRegClassToRegister, find an allocatable superclass, and if one isn't available just find something you can copy to. The allocation processes later can adjust the exact classes anyway,=

s-barannikov · 2024-12-28T04:17:10Z

CC @e-kud

s-barannikov · 2024-12-28T05:37:25Z

llvm/lib/Target/AMDGPU/SIRegisterInfo.td

+def SCC_CLASS : SIRegisterClass<"AMDGPU", [i1], 1, (add SCC)> {
+  let CopyCost = -1;
+  let isAllocatable = 0;
+  let CrossCopyRegClass = SReg_32_XM0_XEXEC;


I don't know if this is correct. The class is the one returned by getCrossCopyRegClass in Wave32 case.

Yes, this should use one of the flavors of SReg_32. SReg_32 itself should be fine

s-barannikov · 2024-12-28T05:40:57Z

@arsenm Could you take a look at the failing test?
The change allows importing S_CMP_* patterns, and there is one test where this causes machine verifier errors on vgpr->sgpr copy (the argument of S_CMP_*). IIUC this should be fixed by SIFixSGPRCopies pass, but adding it to GCNPassConfig::addGlobalInstructionSelect (and disabling machine verifier in between) didn't help.

  %43:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V1_gfx12 %205:vgpr_32, %39.sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15:sgpr_512, %39.sub0_sub1_sub2_sub3:sgpr_512, 1, 1, 0, 0, 0, -1, 0, 0, 0, implicit $exec :: (dereferenceable load (s32), addrspace 8)
  %224:sreg_32 = COPY killed %43:vgpr_32
  nofpexcept S_CMP_EQ_F32 killed %224:sreg_32, 0, implicit-def $scc, implicit $mode
  %46:sreg_32_xm0_xexec = COPY killed $scc

arsenm · 2025-01-06T04:39:16Z

@arsenm Could you take a look at the failing test? The change allows importing S_CMP_* patterns, and there is one test where this causes machine verifier errors on vgpr->sgpr copy (the argument of S_CMP_*).

IIUC this should be fixed by pass,

Not in GlobalISel. SIFixSGPRCopies is a giant hack for the DAG, half the reason we want globalisel is to delete it. It's probably a regbankselect bug

  %43:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V1_gfx12 %205:vgpr_32, %39.sub8_sub9_sub10_sub11_sub12_sub13_sub14_sub15:sgpr_512, %39.sub0_sub1_sub2_sub3:sgpr_512, 1, 1, 0, 0, 0, -1, 0, 0, 0, implicit $exec :: (dereferenceable load (s32), addrspace 8)
  %224:sreg_32 = COPY killed %43:vgpr_32
  nofpexcept S_CMP_EQ_F32 killed %224:sreg_32, 0, implicit-def $scc, implicit $mode
  %46:sreg_32_xm0_xexec = COPY killed $scc

What was the pre-select MIR?

arsenm · 2025-01-06T04:37:09Z

llvm/lib/Target/AMDGPU/SIRegisterInfo.td

+def SCC_CLASS : SIRegisterClass<"AMDGPU", [i1], 1, (add SCC)> {
+  let CopyCost = -1;
+  let isAllocatable = 0;
+  let CrossCopyRegClass = SReg_32_XM0_XEXEC;


Yes, this should use one of the flavors of SReg_32. SReg_32 itself should be fine

arsenm · 2025-01-06T04:38:41Z

llvm/include/llvm/Target/Target.td

  // value means copying is extremely expensive or impossible.
  int CopyCost = 1;

+  RegisterClass CrossCopyRegClass = ?;


Needs documentation. Also this doesn't seem like the right terminology. It's more of a virtual register class to use for copies with an unallocatable physical register, not a general cross copy. Maybe AllocatableRegClass?

s-barannikov · 2025-01-10T16:43:49Z

What was the pre-select MIR?

Before regbank-select:

  %52:_(s32) = G_AMDGPU_INTRIN_IMAGE_LOAD intrinsic(@llvm.amdgcn.image.sample.lz.2d), 1, %71:_(<2 x s16>), $noreg, %37:_(<8 x s32>), %192:_(<4 x s32>), 0, 0, 0, 1 :: (dereferenceable load (s32), addrspace 8)
  %53:_(s32) = G_FCONSTANT float 1.000000e+00
  %54:_(s1) = G_FCMP floatpred(oeq), %52:_(s32), %53:_

Before instruction-select:

  %52:vgpr(s32) = G_AMDGPU_INTRIN_IMAGE_LOAD intrinsic(@llvm.amdgcn.image.sample.lz.2d), 1, %213:vgpr(<2 x s16>), $noreg, %37:sgpr(<8 x s32>), %192:sgpr(<4 x s32>), 0, 0, 0, 1 :: (dereferenceabl
e load (s32), addrspace 8)
  %53:sgpr(s32) = G_FCONSTANT float 1.000000e+00
  %214:vgpr(s32) = COPY %53:sgpr(s32)
  %54:vcc(s1) = G_FCMP floatpred(oeq), %52:vgpr(s32), %214:vgpr

s-barannikov force-pushed the tablegen/gisel/physreg-patterns branch 3 times, most recently from 16c07cc to c565ea4 Compare December 21, 2024 05:27

s-barannikov changed the title ~~[WIP][TableGen][GISel] Learn to import patterns with optional/physreg defs~~ [WIP][TableGen][GISel] Learn to import patterns with physreg defs Dec 21, 2024

s-barannikov force-pushed the tablegen/gisel/physreg-patterns branch 3 times, most recently from 8f8b945 to 1ca4d60 Compare December 21, 2024 06:59

s-barannikov requested review from RKSimon, aemerson and arsenm December 21, 2024 12:27

s-barannikov changed the title ~~[WIP][TableGen][GISel] Learn to import patterns with physreg defs~~ [TableGen][GISel] Learn to import patterns with physreg defs Dec 21, 2024

s-barannikov force-pushed the tablegen/gisel/physreg-patterns branch from 1ca4d60 to 54a0395 Compare December 22, 2024 15:14

[TableGen][GISel] Learn to import patterns with physreg defs

5d3e267

s-barannikov force-pushed the tablegen/gisel/physreg-patterns branch from 54a0395 to 5d3e267 Compare December 26, 2024 18:35

Try to infer register class for physical register defs

5ac2b21

s-barannikov commented Dec 28, 2024

View reviewed changes

Merge branch 'main' into tablegen/gisel/physreg-patterns

f99334d

s-barannikov force-pushed the tablegen/gisel/physreg-patterns branch from 0f169db to f99334d Compare January 5, 2025 11:42

arsenm reviewed Jan 6, 2025

View reviewed changes

[TableGen][GISel] Learn to import patterns with physreg defs #120343

Are you sure you want to change the base?

[TableGen][GISel] Learn to import patterns with physreg defs #120343

Uh oh!

Conversation

s-barannikov commented Dec 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Dec 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

s-barannikov commented Dec 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm commented Dec 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

s-barannikov commented Dec 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm commented Dec 23, 2024

Uh oh!

arsenm commented Dec 23, 2024

Uh oh!

s-barannikov commented Dec 23, 2024

Uh oh!

s-barannikov commented Dec 23, 2024

Uh oh!

arsenm commented Dec 26, 2024

Uh oh!

s-barannikov commented Dec 28, 2024

Uh oh!

s-barannikov Dec 28, 2024

Choose a reason for hiding this comment

Uh oh!

arsenm Jan 6, 2025

Choose a reason for hiding this comment

Uh oh!

s-barannikov commented Dec 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm commented Jan 6, 2025

Uh oh!

arsenm Jan 6, 2025

Choose a reason for hiding this comment

Uh oh!

arsenm Jan 6, 2025

Choose a reason for hiding this comment

Uh oh!

s-barannikov commented Jan 10, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

s-barannikov commented Dec 18, 2024 •

edited

Loading

github-actions bot commented Dec 21, 2024 •

edited

Loading

s-barannikov commented Dec 21, 2024 •

edited

Loading

arsenm commented Dec 23, 2024 •

edited

Loading

s-barannikov commented Dec 23, 2024 •

edited

Loading

s-barannikov commented Dec 28, 2024 •

edited

Loading