Commit d08e445

Greedy: Make eviction broken hint cost use CopyCost units (#160084)
Change the eviction advisor heuristic cost based on number of broken
hints to work in units of copy cost, rather than the magic number 1.
The intent is to allow breaking hints for cheap subregisters in favor
of more expensive register tuples.

The llvm.amdgcn.image.dim.gfx90a.ll change shows a simple example of
the case I am attempting to solve. Use of tuples in ABI contexts ends up
looking like this:

  %argN = COPY $vgprN
  %tuple = inst %argN
  $vgpr0 = COPY %tuple.sub0
  $vgpr1 = COPY %tuple.sub1
  $vgpr2 = COPY %tuple.sub2
  $vgpr3 = COPY %tuple.sub3

Since there are physreg copies in the input and output sequence,
both have hints to a physreg. The wider tuple hint on the output
should win though, since this satisfies 4 hints instead of 1.

This is the obvious part of a larger change to better handle
subregister interference with register tuples, and is not sufficient
to handle the original case I am looking at. There are several bugs here
that are proving tricky to untangle. In particular, there is a double
counting bug for all registers with multiple regunits; the cost of breaking
the interfering hint is added for each interfering virtual register,
and those registers are visited repeatedly across regunits. Fixing the
double counting badly regresses a number of RISCV tests, which seem to rely
on overestimating the cost in tryFindEvictionCandidate to avoid early-exiting
the eviction candidate loop (RISCV is possibly underestimating the copy costs
for vector registers).
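The double counting mentioned in the message can be illustrated with a toy model. This is a sketch and not LLVM code: `ToyVReg`, `RegUnitInterference`, `costPerRegUnit`, and `costDeduplicated` are hypothetical names, and the copy cost of 4 used in the note below is a made-up value. It assumes a candidate physical register spanning several regunits, with one interfering virtual register overlapping more than one of them.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for an interfering virtual register.
struct ToyVReg {
  unsigned Id;
  unsigned CopyCost;     // assumed copy cost of its register class
  bool HasSatisfiedHint; // evicting it would break a satisfied hint
};

// One inner vector per regunit of the candidate physical register,
// listing the virtual registers that interfere on that regunit.
using RegUnitInterference = std::vector<std::vector<ToyVReg>>;

// Charges the broken-hint cost on every visit, so a virtual register
// that overlaps several regunits is charged once per regunit.
unsigned costPerRegUnit(const RegUnitInterference &Units) {
  assert(!Units.empty() && "toy model assumes at least one regunit");
  unsigned BrokenHints = 0;
  for (const auto &Unit : Units)
    for (const ToyVReg &Intf : Unit)
      if (Intf.HasSatisfiedHint)
        BrokenHints += Intf.CopyCost;
  return BrokenHints;
}

// Charges each interfering virtual register at most once.
unsigned costDeduplicated(const RegUnitInterference &Units) {
  assert(!Units.empty() && "toy model assumes at least one regunit");
  unsigned BrokenHints = 0;
  std::set<unsigned> Seen;
  for (const auto &Unit : Units)
    for (const ToyVReg &Intf : Unit)
      if (Intf.HasSatisfiedHint && Seen.insert(Intf.Id).second)
        BrokenHints += Intf.CopyCost;
  return BrokenHints;
}
```

With a single hinted register of assumed copy cost 4 interfering on all four regunits, the per-regunit sum is 16 while the deduplicated sum is 4. Per the message, the larger estimate is what some RISCV tests currently appear to depend on.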
1 parent fa19a57 commit d08e445

7 files changed, 6926 additions and 6928 deletions

llvm/lib/CodeGen/RegAllocEvictionAdvisor.cpp

Lines changed: 5 additions & 3 deletions
@@ -224,7 +224,7 @@ bool DefaultEvictionAdvisor::canEvictHintInterference(
     const LiveInterval &VirtReg, MCRegister PhysReg,
     const SmallVirtRegSet &FixedRegisters) const {
   EvictionCost MaxCost;
-  MaxCost.setBrokenHints(1);
+  MaxCost.setBrokenHints(MRI->getRegClass(VirtReg.reg())->getCopyCost());
   return canEvictInterferenceBasedOnCost(VirtReg, PhysReg, true, MaxCost,
                                          FixedRegisters);
 }
@@ -300,12 +300,14 @@ bool DefaultEvictionAdvisor::canEvictInterferenceBasedOnCost(
           return false;
         // We permit breaking cascades for urgent evictions. It should be the
         // last resort, though, so make it really expensive.
-        Cost.BrokenHints += 10;
+        Cost.BrokenHints += 10 * MRI->getRegClass(Intf->reg())->getCopyCost();
       }
       // Would this break a satisfied hint?
       bool BreaksHint = VRM->hasPreferredPhys(Intf->reg());
       // Update eviction cost.
-      Cost.BrokenHints += BreaksHint;
+      if (BreaksHint)
+        Cost.BrokenHints += MRI->getRegClass(Intf->reg())->getCopyCost();
+
       Cost.MaxWeight = std::max(Cost.MaxWeight, Intf->weight());
       // Abort if this would be too expensive.
       if (Cost >= MaxCost)
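To see how the change in units affects the hint-eviction decision, here is a simplified sketch of the comparison in the hunks above. It is a toy model, not the advisor's implementation: `ToyEvictionCost`, `ToyInterference`, and `canEvictForHint` are hypothetical names, the lexicographic ordering (broken hints first, then weight) is an assumption about how `EvictionCost` compares, and cascades, urgency, and fixed registers are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Simplified cost pair, assumed to compare lexicographically with
// broken hints taking priority over weight.
struct ToyEvictionCost {
  unsigned BrokenHints = 0;
  float MaxWeight = 0.0f;

  bool operator<(const ToyEvictionCost &O) const {
    return std::tie(BrokenHints, MaxWeight) <
           std::tie(O.BrokenHints, O.MaxWeight);
  }
  bool operator>=(const ToyEvictionCost &O) const { return !(*this < O); }
};

// Hypothetical interfering virtual register.
struct ToyInterference {
  unsigned CopyCost;     // assumed copy cost of its register class
  bool HasSatisfiedHint; // evicting it would break a satisfied hint
  float Weight;
};

// Returns true if the interferences may be evicted so that a register
// with copy cost HintedCopyCost can take its hinted physical register.
// UseCopyCostUnits selects the new accounting; false mimics the old
// fixed unit of 1 per broken hint.
bool canEvictForHint(unsigned HintedCopyCost,
                     const std::vector<ToyInterference> &Intfs,
                     bool UseCopyCostUnits) {
  assert(HintedCopyCost > 0 && "toy model assumes positive copy costs");
  ToyEvictionCost MaxCost;
  MaxCost.BrokenHints = UseCopyCostUnits ? HintedCopyCost : 1;

  ToyEvictionCost Cost;
  for (const ToyInterference &Intf : Intfs) {
    if (Intf.HasSatisfiedHint)
      Cost.BrokenHints += UseCopyCostUnits ? Intf.CopyCost : 1;
    Cost.MaxWeight = std::max(Cost.MaxWeight, Intf.Weight);
    // Abort if this would be too expensive.
    if (Cost >= MaxCost)
      return false;
  }
  return true;
}
```

Under these assumptions, a tuple with an assumed copy cost of 4 can displace one hinted register of copy cost 1 with the new units, whereas with the old unit of 1 any single broken hint already reaches the limit. Four hinted cost-1 registers still reach the limit of 4 and block the eviction.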

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.dim.gfx90a.ll

Lines changed: 8 additions & 11 deletions
@@ -18,22 +18,19 @@ define amdgpu_ps <4 x float> @load_1d_lwe(<8 x i32> inreg %rsrc, ptr addrspace(1
 ; GCN-LABEL: load_1d_lwe:
 ; GCN: ; %bb.0: ; %main_body
 ; GCN-NEXT: v_mov_b32_e32 v8, 0
+; GCN-NEXT: v_mov_b32_e32 v6, v0
 ; GCN-NEXT: v_mov_b32_e32 v9, v8
 ; GCN-NEXT: v_mov_b32_e32 v10, v8
 ; GCN-NEXT: v_mov_b32_e32 v11, v8
 ; GCN-NEXT: v_mov_b32_e32 v12, v8
-; GCN-NEXT: v_mov_b32_e32 v2, v8
-; GCN-NEXT: v_mov_b32_e32 v3, v9
-; GCN-NEXT: v_mov_b32_e32 v4, v10
-; GCN-NEXT: v_mov_b32_e32 v5, v11
-; GCN-NEXT: v_mov_b32_e32 v6, v12
-; GCN-NEXT: image_load v[2:6], v0, s[0:7] dmask:0xf unorm lwe
+; GCN-NEXT: v_mov_b32_e32 v0, v8
+; GCN-NEXT: v_mov_b32_e32 v1, v9
+; GCN-NEXT: v_mov_b32_e32 v2, v10
+; GCN-NEXT: v_mov_b32_e32 v3, v11
+; GCN-NEXT: v_mov_b32_e32 v4, v12
+; GCN-NEXT: image_load v[0:4], v6, s[0:7] dmask:0xf unorm lwe
 ; GCN-NEXT: s_waitcnt vmcnt(0)
-; GCN-NEXT: v_mov_b32_e32 v0, v2
-; GCN-NEXT: v_mov_b32_e32 v1, v3
-; GCN-NEXT: v_mov_b32_e32 v2, v4
-; GCN-NEXT: v_mov_b32_e32 v3, v5
-; GCN-NEXT: global_store_dword v8, v6, s[8:9]
+; GCN-NEXT: global_store_dword v8, v4, s[8:9]
 ; GCN-NEXT: s_waitcnt vmcnt(0)
 ; GCN-NEXT: ; return to shader part epilog
 main_body:

llvm/test/CodeGen/RISCV/rvv/vloxseg-rv32.ll

Lines changed: 1556 additions & 1556 deletions
Large diffs are not rendered by default.

llvm/test/CodeGen/RISCV/rvv/vloxseg-rv64.ll

Lines changed: 1892 additions & 1892 deletions
Large diffs are not rendered by default.

llvm/test/CodeGen/RISCV/rvv/vluxseg-rv32.ll

Lines changed: 1556 additions & 1556 deletions
Large diffs are not rendered by default.

llvm/test/CodeGen/RISCV/rvv/vluxseg-rv64.ll

Lines changed: 1906 additions & 1906 deletions
Large diffs are not rendered by default.

llvm/test/CodeGen/RISCV/zilsd.ll

Lines changed: 3 additions & 4 deletions
@@ -7,10 +7,9 @@
 define i64 @load(ptr %a) nounwind {
 ; CHECK-LABEL: load:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: ld a2, 80(a0)
-; CHECK-NEXT: ld zero, 0(a0)
-; CHECK-NEXT: mv a0, a2
-; CHECK-NEXT: mv a1, a3
+; CHECK-NEXT: mv a2, a0
+; CHECK-NEXT: ld a0, 80(a0)
+; CHECK-NEXT: ld zero, 0(a2)
 ; CHECK-NEXT: ret
   %1 = getelementptr i64, ptr %a, i32 10
   %2 = load i64, ptr %1
