Commit d28bb1f
[AMDGPU] Ensure non-reserved CSR spilled regs are live-in (#146427)
Fixes:
```
*** Bad machine code: Using an undefined physical register ***
- function: widget
- basic block: %bb.0 bb (0x564092cbe140)
- instruction: $vgpr63 = V_ACCVGPR_READ_B32_e64 killed $agpr13, implicit $exec
- operand 1: killed $agpr13
LLVM ERROR: Found 1 machine code errors.
```
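As background for this diagnostic, here is a minimal, self-contained sketch of the kind of liveness check that produces it. It is a toy model rather than MachineVerifier's actual code: `ToyInstr`, `ToyBlock`, `findUndefinedUses`, and `demoUndefinedUse` are hypothetical names, registers are plain strings, and sub-registers and kill flags are ignored. It assumes a register counts as live if it is a block live-in or has been defined earlier in the block, and that reserved registers such as `$exec` are exempt from the check.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for MachineInstr and MachineBasicBlock.
struct ToyInstr {
  std::vector<std::string> Defs; // physical registers written
  std::vector<std::string> Uses; // physical registers read
};

struct ToyBlock {
  std::set<std::string> LiveIns;
  std::vector<ToyInstr> Instrs;
};

// Walks the block in program order and collects every register that is read
// while it is not live. Reserved registers are skipped (assumption: they are
// treated as always usable).
std::vector<std::string>
findUndefinedUses(const ToyBlock &BB, const std::set<std::string> &Reserved) {
  std::set<std::string> Live = BB.LiveIns;
  std::vector<std::string> Bad;
  for (const ToyInstr &MI : BB.Instrs) {
    for (const std::string &Reg : MI.Uses)
      if (Reserved.count(Reg) == 0 && Live.count(Reg) == 0)
        Bad.push_back(Reg);
    for (const std::string &Reg : MI.Defs)
      Live.insert(Reg);
  }
  return Bad;
}

// Miniature of the failing pattern: $agpr13 is read before anything in the
// block defines it, and it is missing from the live-in list.
void demoUndefinedUse() {
  std::set<std::string> Reserved;
  Reserved.insert("$exec");

  ToyInstr Read; // models the V_ACCVGPR_READ_B32_e64 of $agpr13
  Read.Defs = {"$vgpr63"};
  Read.Uses = {"$agpr13", "$exec"};
  ToyInstr Def; // models a later definition of $agpr13
  Def.Defs = {"$agpr13"};
  Def.Uses = {"$exec"};

  ToyBlock BB;
  BB.LiveIns = {"$agpr14", "$agpr15"};
  BB.Instrs.push_back(Read);
  BB.Instrs.push_back(Def);

  std::vector<std::string> Bad = findUndefinedUses(BB, Reserved);
  assert(Bad.size() == 1 && Bad[0] == "$agpr13");

  // Once $agpr13 is a live-in, the same block passes the check.
  BB.LiveIns.insert("$agpr13");
  assert(findUndefinedUses(BB, Reserved).empty());
}
```

The later definition of `$agpr13` does not help, because the check runs in program order: only live-ins and earlier definitions count at the point of the read.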
The detailed sequence of events that led to this error:
1. MachineVerifier fails because `$agpr13` is read on line 19 below without
being a live-in of the block or defined earlier in it:
```
1: bb.0.bb:
2: successors: %bb.1(0x80000000); %bb.1(100.00%)
3: liveins: $agpr14, $agpr15, $sgpr12, $sgpr13, $sgpr14, \
4: $sgpr15, $sgpr30, $sgpr31, $sgpr34, $sgpr35, \
5: $sgpr36, $sgpr37, $sgpr38, $sgpr39, $sgpr48, \
6: $sgpr49, $sgpr50, $sgpr51, $sgpr52, $sgpr53, \
7: $sgpr54, $sgpr55, $sgpr64, $sgpr65, $sgpr66, \
8: $sgpr67, $sgpr68, $sgpr69, $sgpr70, $sgpr71, \
9: $sgpr80, $sgpr81, $sgpr82, $sgpr83, $sgpr84, \
10: $sgpr85, $sgpr86, $sgpr87, $sgpr96, $sgpr97, \
11: $sgpr98, $sgpr99, $vgpr0, $vgpr31, $vgpr40, $vgpr41, \
12: $sgpr4_sgpr5, $sgpr6_sgpr7, $sgpr8_sgpr9, \
13: $sgpr10_sgpr11
14: $sgpr16 = COPY $sgpr33
15: $sgpr33 = frame-setup COPY $sgpr32
16: $sgpr18_sgpr19 = S_XOR_SAVEEXEC_B64 -1, \
17: implicit-def $exec, implicit-def dead $scc, \
18: implicit $exec
19: $vgpr63 = V_ACCVGPR_READ_B32_e64 killed $agpr13, \
20: implicit $exec
21: BUFFER_STORE_DWORD_OFFSET killed $vgpr63, \
22: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr33, 0, 0, 0, \
23: implicit $exec :: (store (s32) into %stack.38, \
24: addrspace 5)
25: ...
26: $vgpr43 = IMPLICIT_DEF
27: $vgpr43 = SI_SPILL_S32_TO_VGPR $sgpr15, 0, \
28: killed $vgpr43(tied-def 0)
29: $vgpr43 = SI_SPILL_S32_TO_VGPR $sgpr14, 1, \
30: killed $vgpr43(tied-def 0)
31: $sgpr100_sgpr101 = S_OR_SAVEEXEC_B64 -1, \
32: implicit-def $exec, implicit-def dead $scc, \
33: implicit $exec
34: renamable $agpr13 = COPY killed $vgpr43, implicit $exec
```
2. That instruction is created by
[`emitCSRSpillStores`](https://github.com/llvm/llvm-project/blob/d599bdeaa49d7a2b1246328630328d23ddda5a47/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp#L977)
(called by
[`SIFrameLowering::emitPrologue`](https://github.com/llvm/llvm-project/blob/d599bdeaa49d7a2b1246328630328d23ddda5a47/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp#L1122))
because `$agpr13` is in `WWMSpills`.
See lines 982, 998, and 993 below.
```
977: // Spill Whole-Wave Mode VGPRs. Save only the inactive lanes of the scratch
978: // registers. However, save all lanes of callee-saved VGPRs. Due to this, we
979: // might end up flipping the EXEC bits twice.
980: Register ScratchExecCopy;
981: SmallVector<std::pair<Register, int>, 2> WWMCalleeSavedRegs, WWMScratchRegs;
982: FuncInfo->splitWWMSpillRegisters(MF, WWMCalleeSavedRegs, WWMScratchRegs);
983: if (!WWMScratchRegs.empty())
984: ScratchExecCopy =
985: buildScratchExecCopy(LiveUnits, MF, MBB, MBBI, DL,
986: /*IsProlog*/ true, /*EnableInactiveLanes*/ true);
987:
988: auto StoreWWMRegisters =
989: [&](SmallVectorImpl<std::pair<Register, int>> &WWMRegs) {
990: for (const auto &Reg : WWMRegs) {
991: Register VGPR = Reg.first;
992: int FI = Reg.second;
993: buildPrologSpill(ST, TRI, *FuncInfo, LiveUnits, MF, MBB, MBBI, DL,
994: VGPR, FI, FrameReg);
995: }
996: };
997:
998: StoreWWMRegisters(WWMScratchRegs);
```
3. `$agpr13` got added to `WWMSpills` by
[`SILowerWWMCopies::run`](https://github.com/llvm/llvm-project/blob/59a7185dd9d69cbf737a98f5c2d1cf3d456bee03/llvm/lib/Target/AMDGPU/SILowerWWMCopies.cpp#L137)
as it processed the `WWM_COPY` on line 3 below (which corresponds to line 34
above in point 1):
```
1: %45:vgpr_32 = SI_SPILL_S32_TO_VGPR $sgpr15, 0, %45:vgpr_32(tied-def 0)
2: %45:vgpr_32 = SI_SPILL_S32_TO_VGPR $sgpr14, 1, %45:vgpr_32(tied-def 0)
3: %44:av_32 = WWM_COPY %45:vgpr_32
```
1 parent a7425f9 commit d28bb1f
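The commit title points at the remedy: registers that the prologue spills should be live-in to the block unless they are reserved. The patch body is not reproduced on this page, so the following is only a sketch of that idea under stated assumptions, not the actual change. `ToySpillInfo`, `addSpilledRegsAsLiveIns`, `demoLiveInFix`, and the register `$toy_reserved` are hypothetical, registers are modeled as strings, and the frame indices are illustrative values.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-function spill bookkeeping.
struct ToySpillInfo {
  // (register, frame index) pairs that the prologue is about to store.
  std::vector<std::pair<std::string, int>> WWMScratchRegs;
  // Registers that are never subject to the liveness check (assumption).
  std::set<std::string> Reserved;
};

// Adds every non-reserved spilled register to the entry block's live-in set,
// so that the prologue store reads a register the block treats as live.
void addSpilledRegsAsLiveIns(const ToySpillInfo &Info,
                             std::set<std::string> &EntryLiveIns) {
  for (const auto &Spill : Info.WWMScratchRegs) {
    const std::string &Reg = Spill.first;
    if (Info.Reserved.count(Reg) == 0)
      EntryLiveIns.insert(Reg);
  }
}

// Usage: $agpr13 becomes a live-in, the reserved register does not.
void demoLiveInFix() {
  ToySpillInfo Info;
  Info.WWMScratchRegs.push_back({"$agpr13", 38});
  Info.WWMScratchRegs.push_back({"$toy_reserved", 39});
  Info.Reserved.insert("$toy_reserved");

  std::set<std::string> LiveIns = {"$agpr14", "$agpr15"};
  addSpilledRegsAsLiveIns(Info, LiveIns);

  assert(LiveIns.count("$agpr13") == 1);
  assert(LiveIns.count("$toy_reserved") == 0);
  assert(LiveIns.size() == 3);
}
```

Using a set for the live-ins keeps the operation idempotent, so a register that is already live-in is not listed twice.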
2 files changed, +104 -0 lines changed:
- llvm
  - lib/Target/AMDGPU
  - test/CodeGen/AMDGPU