Commit 24e5c9d

Remove some final references to stack maps (#197)
Stack maps are not provided by regalloc2 anymore. This removes the final references to stack maps in the codebase.
1 parent e684ee5 commit 24e5c9d

2 files changed

+21
-31
lines changed

doc/DESIGN.md

Lines changed: 21 additions & 22 deletions
@@ -80,21 +80,21 @@ consists of the following fields:
 - VReg, or virtual register. *Every* operand mentions a virtual
 register, even if it is constrained to a single physical register in
 practice. This is because we track liveranges uniformly by vreg.
-
+
 - Policy, or "constraint". Every reference to a vreg can apply some
 constraint to the vreg at that point in the program. Valid policies are:
-
+
 - Any location;
 - Any register of the vreg's class;
 - Any stack slot;
 - A particular fixed physical register; or
 - For a def (output), a *reuse* of an input register.
-
+
 - The "kind" of reference to this vreg: Def, Use, Mod. A def
 (definition) writes to the vreg, and disregards any possible earlier
 value. A mod (modify) reads the current value then writes a new
 one. A use simply reads the vreg's value.
-
+
 - The position: before or after the instruction.
 - Note that to have a def (output) register available in a way that
 does not conflict with inputs, the def should be placed at the
@@ -159,7 +159,7 @@ block parameters must provide values for those parameters via
 operands. When a branch has more than one successor, it provides
 separate operands for each possible successor. These block parameters
 are equivalent to phi-nodes; we chose this representation because they
-are in many ways a more consistent representation of SSA.
+are in many ways a more consistent representation of SSA.

 To see why we believe block parameters are a slightly nicer design
 choice than use of phi nodes, consider: phis are special
@@ -176,8 +176,8 @@ reasonable to handle.
 ## Output

 The allocator produces two main data structures as output: an array of
-`Allocation`s and a sequence of edits. Some other data, such as
-stackmap slot info, is also provided.
+`Allocation`s and a sequence of edits. Some other miscellaneous data is also
+provided.

 ### Allocations

@@ -229,8 +229,7 @@ The livein and liveout bitsets (`liveins` and `liveouts` on the `Env`)
 are allocated one per basic block and record, per block, which vregs
 are live entering and leaving that block. They are computed using a
 standard backward iterative dataflow analysis and are exact; they do
-not over-approximate (this turns out to be important for performance,
-and is also necessary for correctness in the case of stackmaps).
+not over-approximate (this turns out to be important for performance).

 ### Blockparam Vectors: Source-Side and Dest-Side

@@ -631,7 +630,7 @@ them all here.
 across its entire range. This has the effect of causing bundles to
 be more important (more likely to evict others) the more they are
 split.
-
+
 - Requirement: a bundle's requirement is a value in a lattice that we
 have defined, where top is "Unknown" and bottom is
 "Conflict". Between these two, we have: any register (of a class);
@@ -640,7 +639,7 @@ them all here.
 different requirements meets to Conflict. Requirements are derived
 from the operand constraints for all uses in all liveranges in a
 bundle, and then merged with the lattice meet-function.
-
+
 The lattice is as follows (diagram simplified to remove multiple
 classes and multiple fixed registers which parameterize nodes; any two
 differently-parameterized values are unordered with respect to each
@@ -1176,13 +1175,13 @@ similarities than the differences.

 * The core abstractions of "liverange", "bundle", "vreg", "preg", and
 "operand" (with policies/constraints) are the same.
-
+
 * The overall allocator pipeline is the same, and the top-level
 structure of each stage should look similar. Both allocators begin
 by computing liveranges, then merging bundles, then handling bundles
 and splitting/evicting as necessary, then doing second-chance
 allocation, then reifying the decisions.
-
+
 * The cost functions are very similar, though the heuristics that make
 decisions based on them are not.

@@ -1204,33 +1203,33 @@ Several notable high-level differences are:
 and does not depend on scanning the code at all. In general, we
 should be able to state simple invariants and see by inspection (as
 well as fuzzing -- see above) that they hold.
-
+
 * The data structures themselves are simplified. Where IonMonkey uses
 linked lists in many places, this allocator stores simple inline
 smallvecs of liveranges on bundles and vregs, and smallvecs of uses
 on liveranges. We also (i) find a way to construct liveranges
 in-order immediately, without any need for splicing, unlike
 IonMonkey, and (ii) relax sorting invariants where possible to allow
 for cheap append operations in many cases.
-
+
 * The splitting heuristics are significantly reworked. Whereas
 IonMonkey has an all-at-once approach to splitting an entire bundle,
 and has a list of complex heuristics to choose where to split, this
 allocator does conflict-based splitting, and tries to decide whether
 to split or evict and which split to take based on cost heuristics.
-
+
 * The liverange computation is exact, whereas IonMonkey approximates
 using a single-pass algorithm that makes vregs live across entire
 loop bodies. We have found that precise liveness improves allocation
 performance and generated code quality, even though the liveness
 itself is slightly more expensive to compute.
-
+
 * Many of the algorithms in the IonMonkey allocator are built with
 helper functions that do linear scans. These "small quadratic" loops
 are likely not a huge issue in practice, but nevertheless have the
 potential to be in corner cases. As much as possible, all work in
 this allocator is done in linear scans.
-
+
 * There are novel schemes for solving certain interesting design
 challenges. One example: in IonMonkey, liveranges are connected
 across blocks by, when reaching one end of a control-flow edge in a
@@ -1246,7 +1245,7 @@ Several notable high-level differences are:
 for the core regalloc. Ion instead has to tweak its definition of
 minimal bundles and create two liveranges that overlap (!) to
 represent the two uses.
-
+
 * Using block parameters rather than phi-nodes significantly
 simplifies handling of inter-block data movement. IonMonkey had to
 special-case phis in many ways because they are actually quite
@@ -1257,7 +1256,7 @@ Several notable high-level differences are:
 * The allocator supports irreducible control flow and arbitrary block
 ordering (its only CFG requirement is that critical edges are
 split).
-
+
 * The allocator supports non-SSA code, and has native support for
 handling program moves specially.

@@ -1278,7 +1277,7 @@ number of general principles:
 an allocation map for each PReg. This turned out to be significantly
 (!) less efficient than Rust's built-in BTree data structures, for
 the usual cache-efficiency vs. pointer-chasing reasons.
-
+
 * We initially used dense bitvecs, as IonMonkey does, for
 livein/liveout bits. It turned out that a chunked sparse design (see
 below) was much more efficient.
@@ -1302,7 +1301,7 @@ number of general principles:
 append liveranges to in-progress vreg liverange vectors and then
 reverse at the end. The expensive part is a single pass; only the
 bitset computation is a fixpoint loop.
-
+
 * Sorts are better than always-sorted data structures (like btrees):
 they amortize all the comparison and update cost to one phase, and
 this phase is much more cache-friendly than a bunch of spread-out

src/checker.rs

Lines changed: 0 additions & 9 deletions
@@ -162,15 +162,6 @@ pub enum CheckerError {
 op: Operand,
 alloc: Allocation,
 },
-ConflictedValueInStackmap {
-inst: Inst,
-alloc: Allocation,
-},
-NonRefValuesInStackmap {
-inst: Inst,
-alloc: Allocation,
-vregs: FxHashSet<VReg>,
-},
 StackToStackMove {
 into: Allocation,
 from: Allocation,
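
For code that matches on `CheckerError`, the practical effect of this hunk is that any arm naming `ConflictedValueInStackmap` or `NonRefValuesInStackmap` stops compiling. The following is a minimal sketch under stated assumptions, not the real checker: the `Allocation` alias, its `u32` payload, and the `describe` helper are hypothetical stand-ins, and only the one variant that follows the removed ones in the diff is reproduced.

```rust
// Hypothetical stand-in for regalloc2's `Allocation`; the real type is richer.
type Allocation = u32;

// Cut-down stand-in for `CheckerError` after this commit. Only the variant
// that follows the removed ones in the diff is reproduced here.
#[derive(Debug)]
enum CheckerError {
    StackToStackMove { into: Allocation, from: Allocation },
}

// Illustrative helper (not a regalloc2 API) that renders an error as text.
fn describe(err: &CheckerError) -> String {
    match err {
        CheckerError::StackToStackMove { into, from } => {
            format!("stack-to-stack move from {} into {}", from, into)
        }
        // Arms for `ConflictedValueInStackmap` or `NonRefValuesInStackmap`
        // would be rejected by the compiler now that the variants are gone.
    }
}

fn main() {
    let err = CheckerError::StackToStackMove { into: 3, from: 7 };
    println!("{}", describe(&err));
}
```

Downstream `match` expressions that listed the two stackmap variants explicitly need those arms deleted; matches that end in a wildcard arm are unaffected.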
