| 1 | +# Fastalloc Design Overview |
| 2 | + |
| 3 | +Fastalloc is a register allocator made specifically for fast |
| 4 | +compile times. It's based on the reverse linear scan register |
| 5 | +allocation/SSRA algorithm. |
| 6 | +This document describes the data structures used and the allocation steps. |
| 7 | + |
| 8 | +# Data Structures |
| 9 | + |
| 10 | +The main data structures that Fastalloc uses to track its state are |
| 11 | +described below. |
| 12 | + |
| 13 | +## Current VReg Allocations (`vreg_allocs`) |
| 14 | + |
| 15 | +This is a vector that is used to hold the current allocation for every |
| 16 | +VReg during execution. |
| 17 | + |
| 18 | +## VReg Spillslots (`vreg_spillslots`) |
| 19 | + |
| 20 | +Whenever a VReg needs a spillslot, a dedicated slot is allocated for it. |
| 21 | +This vector is where all VRegs' spillslots are stored. |
| 22 | + |
| 23 | +## Live VRegs (`live_vregs`) |
| 24 | + |
| 25 | +Live VReg information is kept in a `VRegSet`, a doubly linked list |
| 26 | +based on a vector. This is used for quick insertion, removal, and |
| 27 | +iteration. |
| 28 | + |
| 29 | +## Least Recently Used Caches (`lrus`) |
| 30 | + |
| 31 | +Every register class (int, float, and vector) has its own LRU and they |
| 32 | +are stored together in an array: `lrus`. An LRU is represented similarly |
| 33 | +to a `VRegSet`: it's a circular, doubly-linked list based on a vector. |
| 34 | + |
| 35 | +The last PReg in an LRU is the least-recently allocated PReg: |
| 36 | + |
| 37 | +most recently used PReg (head) -> 2nd MRU PReg -> ... -> LRU PReg |
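A vector-backed, circular, doubly linked LRU of this kind might look roughly like the sketch below. The names `Lru`, `Node`, `poke`, and `least_recent` are hypothetical and are not taken from the Fastalloc source; PRegs are reduced to plain `usize` indices for illustration.

```rust
/// Hypothetical link record: one per PReg, stored in a vector.
#[derive(Clone, Copy)]
struct Node {
    prev: usize,
    next: usize,
}

/// Hypothetical LRU over PReg indices `0..n`.
struct Lru {
    nodes: Vec<Node>, // indexed by PReg index
    head: usize,      // most recently used PReg
}

impl Lru {
    /// Builds the list 0 -> 1 -> ... -> n-1, with 0 as the head.
    fn new(n: usize) -> Self {
        assert!(n > 0);
        let nodes = (0..n)
            .map(|i| Node { prev: (i + n - 1) % n, next: (i + 1) % n })
            .collect();
        Lru { nodes, head: 0 }
    }

    /// Because the list is circular, the LRU entry sits just before the head.
    fn least_recent(&self) -> usize {
        self.nodes[self.head].prev
    }

    /// Marks `preg` as most recently used by moving it to the head.
    fn poke(&mut self, preg: usize) {
        if preg == self.head {
            return;
        }
        // Unlink `preg` from its current position.
        let Node { prev, next } = self.nodes[preg];
        self.nodes[prev].next = next;
        self.nodes[next].prev = prev;
        // Relink it between the current tail and the current head.
        let tail = self.nodes[self.head].prev;
        self.nodes[tail].next = preg;
        self.nodes[preg] = Node { prev: tail, next: self.head };
        self.nodes[self.head].prev = preg;
        self.head = preg;
    }
}
```

Both operations are constant time and need no allocation after construction, which is presumably why a vector-backed list suits this use.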
| 38 | + |
| 39 | +## Current VReg In PReg Info (`vreg_in_preg`) |
| 40 | + |
| 41 | +During allocation, it's necessary to determine which VReg is in a PReg |
| 42 | +to generate the right move(s) for eviction. |
| 43 | +`vreg_in_preg` is a vector that stores this information. |
| 44 | + |
| 45 | +## Available PRegs For Use In Instruction (`available_pregs`) |
| 46 | + |
| 47 | +This is a 2-tuple of `PRegSet`s (bitsets of physical registers): one for |
| 48 | +the instruction's early phase and one for the late phase. |
| 49 | +They are used to determine which registers are available for use in the |
| 50 | +early/late phases of an instruction. |
| 51 | + |
| 52 | +Prior to the beginning of any instruction's allocation, this set is reset |
| 53 | +to include all allocatable physical registers, some of which may already |
| 54 | +contain a VReg. |
| 55 | + |
| 56 | +## VReg Liverange Location Info (`vreg_to_live_inst_range`) |
| 57 | + |
| 58 | +This is a vector of 3-tuples containing the beginning and the end |
| 59 | +of each VReg's liverange, along with an allocation the VReg is guaranteed |
| 60 | +to be in throughout that liverange. |
| 61 | +This is used to build the debug locations vector after allocation |
| 62 | +is complete. |
| 63 | + |
| 64 | +# Allocation Process Breakdown |
| 65 | + |
| 66 | +Allocation proceeds in reverse: from the last block to the first block, |
| 67 | +and in each block: from the last instruction to the first instruction. |
| 68 | + |
| 69 | +The allocation for each operand in an instruction can be viewed as happening |
| 70 | +in four phases: selection, assignment, eviction, and edit insertion. |
| 71 | + |
| 72 | +## Allocation Phase: Selection |
| 73 | + |
| 74 | +In this phase, a PReg is selected from `available_pregs` for the |
| 75 | +operand based on the operand constraints. Depending on the operand's |
| 76 | +position the selected PReg is removed from either the early or late |
| 77 | +phase or both, indicating that the PReg is no longer available for |
| 78 | +allocation by other operands in that phase. |
| 79 | + |
| 80 | +## Allocation Phase: Assignment |
| 81 | + |
| 82 | +In this phase, the selected PReg is set as the allocation for |
| 83 | +the operand in the final output. |
| 84 | + |
| 85 | +## Allocation Phase: Eviction |
| 86 | + |
| 87 | +In this phase, the previous VReg in the allocation assigned to |
| 88 | +an operand is evicted, if any. |
| 89 | + |
| 90 | +During eviction, a dedicated spillslot is allocated for the evicted |
| 91 | +VReg and an edit is inserted after the instruction to move from the |
| 92 | +slot to the allocation it's expected to be in after the instruction. |
| 93 | + |
| 94 | +## Allocation Phase: Edit Insertion |
| 95 | + |
| 96 | +In this phase, edits are inserted to ensure that the dataflow from |
| 97 | +before the instruction to the selected allocation to after |
| 98 | +the instruction remains correct. |
| 99 | + |
| 100 | +# Invariants |
| 101 | + |
| 102 | +Some invariants that remain true throughout execution: |
| 103 | + |
| 104 | +1. During processing, the allocation of a VReg at any point in time |
| 105 | +as indicated in `vreg_allocs` changes exactly twice or thrice. |
| 106 | +Initially it is set to none. When it's allocated, it is |
| 107 | +changed to that allocation. After this, it doesn't change unless |
| 108 | +it's evicted or spilled across a block boundary; |
| 109 | +if it is, then its current allocation will change to its dedicated |
| 110 | +spillslot. After this, it doesn't change again until its definition |
| 111 | +is reached and it's deallocated, during which its `vreg_allocs` |
| 112 | +entry is set to none. The only exception is block parameters that |
| 113 | +are never used: these are never allocated. |
| 114 | + |
| 115 | +2. A virtual register that outlives the block it was defined in will |
| 116 | +be in its dedicated spillslot by the end of the block. |
| 117 | + |
| 118 | +3. At the end of a block, before edits are inserted to move values |
| 119 | +from branch arguments to block parameter spillslots, all branch |
| 120 | +arguments will be in their dedicated spillslots. |
| 121 | + |
| 122 | +4. At the beginning of a block, all block parameters and livein |
| 123 | +virtual registers will be in their dedicated spillslots. |
| 124 | + |
| 125 | +# Instruction Allocation |
| 126 | + |
| 127 | +To allocate a single instruction, the first step is to reset the |
| 128 | +`available_pregs` sets to all allocatable PRegs. |
| 129 | + |
| 130 | +Next, the selection phase is carried out for all operands with |
| 131 | +fixed register constraints: the registers they are constrained to use are |
| 132 | +marked as unavailable in the `available_pregs` set, depending on the |
| 133 | +phase that they are valid in. If the operand is an early use or late |
| 134 | +def operand, then the register will be marked as unavailable in the |
| 135 | +early set or late set, respectively. Otherwise, the PReg is marked |
| 136 | +as unavailable in both the early and late sets, because a PReg |
| 137 | +assigned to an early def or late use operand cannot be reused by another |
| 138 | +operand in the same instruction. |
| 139 | + |
| 140 | +After selection for fixed register operands, the eviction phase is |
| 141 | +carried out for fixed register operands. Any VReg in their selected |
| 142 | +registers, indicated by `vreg_in_preg`, is evicted: a dedicated |
| 143 | +spillslot is allocated for the VReg (if it doesn't have one already), |
| 144 | +an edit is inserted to move from the slot to the PReg, which is where |
| 145 | +the VReg is expected to be after the instruction, and its current |
| 146 | +allocation in `vreg_allocs` is set to the spillslot. |
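The eviction step described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the `State`, `Alloc`, and `Move` types, the `next_slot` counter, and the `edits_after` list are hypothetical simplifications, and clearing the PReg's `vreg_in_preg` entry on eviction is an assumption of this sketch.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Alloc {
    None,
    Reg(usize),
    Stack(usize),
}

/// Hypothetical edit: a move inserted after the current instruction.
#[derive(Debug, PartialEq, Eq)]
struct Move {
    from: Alloc,
    to: Alloc,
}

/// Hypothetical, simplified allocator state.
struct State {
    vreg_allocs: Vec<Alloc>,              // indexed by VReg
    vreg_spillslots: Vec<Option<usize>>,  // indexed by VReg
    vreg_in_preg: Vec<Option<usize>>,     // indexed by PReg
    next_slot: usize,                     // assumed bump allocator for slots
    edits_after: Vec<Move>,
}

impl State {
    /// Evicts whatever VReg currently lives in `preg`, if any.
    fn evict(&mut self, preg: usize) {
        let vreg = match self.vreg_in_preg[preg].take() {
            Some(v) => v,
            None => return, // nothing to evict
        };
        // Allocate a dedicated spillslot only if the VReg has none yet.
        let slot = match self.vreg_spillslots[vreg] {
            Some(s) => s,
            None => {
                let s = self.next_slot;
                self.next_slot += 1;
                self.vreg_spillslots[vreg] = Some(s);
                s
            }
        };
        // After the instruction, the VReg is expected back in `preg`,
        // so the edit moves from the slot to the PReg.
        self.edits_after.push(Move {
            from: Alloc::Stack(slot),
            to: Alloc::Reg(preg),
        });
        self.vreg_allocs[vreg] = Alloc::Stack(slot);
    }
}
```

Note that the move goes from the slot to the PReg because allocation runs in reverse: instructions that come later in program order have already been allocated with the VReg in that PReg.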
| 147 | + |
| 148 | +Next, all clobbers are removed from the early and late `available_pregs` |
| 149 | +sets to avoid allocating a clobber to a def. |
| 150 | + |
| 151 | +Next, the selection, assignment, eviction, and edit insertion phases are |
| 152 | +carried out for all def operands. When each def operand's allocation is |
| 153 | +complete, the def operand is immediately freed, marking the end of the |
| 154 | +VReg's liverange. It is removed from the `live_vregs` set, its allocation |
| 155 | +in `vreg_allocs` is set to none, and if it was in a PReg, that PReg's |
| 156 | +entry in `vreg_in_preg` is set to none. The selection and eviction phases |
| 157 | +are omitted if the operand has a fixed constraint, as those phases have |
| 158 | +already been carried out. |
| 159 | + |
| 160 | +Next, the selection, assignment, and eviction phases are carried out for all |
| 161 | +use operands. As with def operands, the selection and eviction phases are |
| 162 | +omitted if the operand has a fixed constraint, as those phases have already |
| 163 | +been carried out. |
| 164 | + |
| 165 | +Then the edit insertion phase is carried out for all use operands. |
| 166 | + |
| 167 | +Lastly, if the instruction being processed is a branch instruction, the |
| 168 | +parallel move resolver is used to insert edits before the instruction |
| 169 | +to move from the branch argument spillslots to the block parameter |
| 170 | +spillslots. |
| 171 | + |
| 172 | +## Operand Allocation |
| 173 | + |
| 174 | +During the allocation of an operand, a check is first made to |
| 175 | +see if the VReg's current allocation as indicated in |
| 176 | +`vreg_allocs` is within the operand constraints. |
| 177 | + |
| 178 | +If it is, the assignment phase is carried out, setting the final |
| 179 | +allocation output's entry for that operand to the allocation. |
| 180 | +The selection phase is carried out, marking the PReg |
| 181 | +(if the allocation is a PReg) as unavailable in the respective |
| 182 | +early/late sets. The state of the LRUs is also updated to reflect |
| 183 | +the new most recently used PReg. |
| 184 | +No eviction needs to be done since the VReg is already in the |
| 185 | +allocation and no edit insertion needs to be done either. |
| 186 | + |
| 187 | +On the other hand, if the VReg's current allocation is not within |
| 188 | +constraints, the selection and eviction phases are carried out for |
| 189 | +non-fixed operands. First, a set of PRegs that can be drawn from is |
| 190 | +created from `available_pregs`. For early uses and late defs, |
| 191 | +this draw-from set is the early set or late set respectively. |
| 192 | +For late uses and early defs, the draw-from set is an intersection |
| 193 | +of the available early and late sets (because a PReg used for a late |
| 194 | +use can't be reassigned to another operand in the early phase; |
| 195 | +likewise, a PReg used for an early def can't be reassigned to another |
| 196 | +operand in the late phase). |
| 197 | +The LRU for the VReg's regclass is then traversed from the end to find |
| 198 | +the least-recently used PReg in the draw-from set. Once a PReg is found, |
| 199 | +it is marked as the most recently used in the LRU, unavailable in the |
| 200 | +`available_pregs` sets, and whatever VReg was in it before is evicted. |
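The draw-from rule and the LRU traversal above can be sketched as follows. This is a simplified illustration: `Available`, `Kind`, `Pos`, `draw_from`, and `select` are hypothetical names, a `u64` stands in for `PRegSet` (so PReg indices are assumed to be below 64), and the LRU is passed in as a slice already ordered from least to most recently used.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Kind {
    Use,
    Def,
}

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Pos {
    Early,
    Late,
}

/// Hypothetical stand-in for the `available_pregs` 2-tuple.
#[derive(Clone, Copy, Debug)]
struct Available {
    early: u64,
    late: u64,
}

/// Computes the set of PRegs an operand may draw from.
fn draw_from(avail: Available, kind: Kind, pos: Pos) -> u64 {
    match (kind, pos) {
        (Kind::Use, Pos::Early) => avail.early,
        (Kind::Def, Pos::Late) => avail.late,
        // Late uses and early defs occupy the PReg in both phases.
        _ => avail.early & avail.late,
    }
}

/// Picks the least recently used PReg in the draw-from set and marks it
/// unavailable in the phases the operand occupies.
fn select(avail: &mut Available, lru_to_mru: &[usize], kind: Kind, pos: Pos) -> Option<usize> {
    let set = draw_from(*avail, kind, pos);
    let preg = lru_to_mru
        .iter()
        .copied()
        .find(|&p| set & (1u64 << p) != 0)?;
    let bit = 1u64 << preg;
    match (kind, pos) {
        (Kind::Use, Pos::Early) => avail.early &= !bit,
        (Kind::Def, Pos::Late) => avail.late &= !bit,
        _ => {
            avail.early &= !bit;
            avail.late &= !bit;
        }
    }
    Some(preg)
}
```

Updating the LRU and evicting the previous occupant are left out here to keep the sketch focused on the set logic.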
| 201 | + |
| 202 | +The assignment phase is carried out next: the final allocation for the |
| 203 | +operand is set to the selected register. |
| 204 | + |
| 205 | +If the newly allocated operand has not been allocated before, that is, |
| 206 | +this is the first use/def of the VReg encountered, the VReg is |
| 207 | +inserted into `live_vregs` and marked as the value in the allocated |
| 208 | +PReg in `vreg_in_preg`. |
| 209 | + |
| 210 | +Otherwise, if the VReg has been allocated before, then an edit will need |
| 211 | +to be inserted to ensure that the dataflow remains correct. |
| 212 | +The edit insertion phase is now carried out if the operand is a def |
| 213 | +operand: an edit is inserted after the instruction to move from the |
| 214 | +new allocation to the allocation it's expected to be in after the |
| 215 | +instruction. |
| 216 | + |
| 217 | +The edit insertion phase for use operands is done after all operands |
| 218 | +have been processed. Edits are inserted to move from the current |
| 219 | +allocations in `vreg_allocs` to the final allocated position before |
| 220 | +the instruction. This is to account for the possibility of multiple |
| 221 | +uses of the same operand in the instruction. |
| 222 | + |
| 223 | +## Reuse Operands |
| 224 | + |
| 225 | +Reuse def operands are handled by creating a new operand identical to the |
| 226 | +reuse def, except that its constraints are the constraints of the |
| 227 | +reused input and allocating that in its place. |
| 228 | + |
| 229 | +Reused inputs are handled by creating a new operand with a fixed register |
| 230 | +constraint to use whatever register was assigned to the reuse def. |
| 231 | + |
| 232 | +Because of the way reuse operands and reused inputs are handled, when |
| 233 | +selecting a register for an early use operand with a fixed constraint, |
| 234 | +the PReg is also marked as unavailable in the `available_pregs` late |
| 235 | +set if the operand is a reused input. And when selecting a register |
| 236 | +for reuse def operands, the selected register is marked as unavailable |
| 237 | +in the `available_pregs` early set. |
| 238 | + |
| 239 | +## VReg Spillslots |
| 240 | + |
| 241 | +Whenever a VReg needs a spillslot, a suitable one is allocated and |
| 242 | +marked as the VReg's dedicated spillslot in `vreg_spillslots`. |
| 243 | +If a VReg never needs a spillslot, none is allocated for it. |
| 244 | +To ensure that a VReg will always be in its spillslot when expected, |
| 245 | +during the processing of a def operand, before it's deallocated, |
| 246 | +an edit is inserted to move from its current allocation as indicated |
| 247 | +in `vreg_allocs` to its dedicated spillslot, if one is present in |
| 248 | +`vreg_spillslots`. |
| 249 | + |
| 250 | +## Branch Instructions |
| 251 | + |
| 252 | +As an invariant, all branch arguments will be in their dedicated |
| 253 | +spillslots at the end of the block before edits are inserted to |
| 254 | +move from those spillslots to the block parameter spillslots |
| 255 | +of the successor blocks. |
| 256 | + |
| 257 | +If a branch argument is already in an allocation that isn't |
| 258 | +its spillslot (this could happen if the branch argument is used |
| 259 | +as an operand in the same instruction, because all normal |
| 260 | +instruction processing is completed before branch-specific |
| 261 | +processing), then an edit is inserted |
| 262 | +to move from the spillslot to that allocation and its current |
| 263 | +allocation in `vreg_allocs` is set to the spillslot. |
| 264 | + |
| 265 | +Only after these edits have been inserted is the parallel move |
| 266 | +resolver used to generate and insert edits to move from |
| 267 | +those spillslots to the spillslots of the block parameters. |
| 268 | + |
| 269 | +# Across Blocks |
| 270 | + |
| 271 | +When a block completes processing, some VRegs will still be live. |
| 272 | +These VRegs are either block parameters or livein VRegs. |
| 273 | +As an invariant, prior to the first instruction in a block, all |
| 274 | +block parameters and livein VRegs will be in their dedicated spillslots. |
| 275 | + |
| 276 | +To maintain this invariant, after a block completes processing, edits |
| 277 | +are inserted at the beginning of the block to move from the block |
| 278 | +parameter and livein spillslots to the allocation they are expected |
| 279 | +to be in from the first instruction. |
| 280 | +All block parameters are freed, just like defs, and liveins' current |
| 281 | +allocations in `vreg_allocs` are set to their spillslots. |
| 282 | + |
| 283 | +# Edits Order |
| 284 | + |
| 285 | +`regalloc2`'s outward interface guarantees that edits are in |
| 286 | +sorted order. Since allocation proceeds in reverse, all edits |
| 287 | +are also added in reverse. After all blocks have completed |
| 288 | +processing, the edits are simply reversed to put them in the |
| 289 | +correct order. |
| 290 | + |
| 291 | +One of the reasons why the allocation order proceeds the way it |
| 292 | +does is because of this edit-order constraint. All edits that |
| 293 | +occur after the instruction must be inserted before all edits |
| 294 | +that occur before the instruction. |
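A small sketch of why this works, assuming one "after" edit and one "before" edit per instruction. `EditPos`, `EditPoint`, and `collect_edits` are hypothetical names used only for illustration.

```rust
/// Hypothetical edit position; `Before` sorts ahead of `After`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum EditPos {
    Before,
    After,
}

/// Hypothetical sort key for an edit: instruction index, then position.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct EditPoint {
    inst: usize,
    pos: EditPos,
}

/// Walks instructions in reverse, pushing each instruction's "after" edit
/// ahead of its "before" edit, then reverses the whole list once.
fn collect_edits(num_insts: usize) -> Vec<EditPoint> {
    let mut edits = Vec::new();
    for inst in (0..num_insts).rev() {
        edits.push(EditPoint { inst, pos: EditPos::After });
        edits.push(EditPoint { inst, pos: EditPos::Before });
    }
    // A single reversal yields ascending order with no sorting needed.
    edits.reverse();
    edits
}
```

If the "before" edits were pushed first, the reversed list would place an instruction's "after" edits ahead of its "before" edits, and a sort would be needed to repair the order.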
| 295 | + |
| 296 | +# Debug Info |
| 297 | + |
| 298 | +After all blocks have completed processing, the debug locations |
| 299 | +vector is built. |
| 300 | +The information it's built from is assembled from liverange info |
| 301 | +that is tracked throughout the allocation. |
| 302 | +Whenever a VReg is allocated for the first time, its liverange end |
| 303 | +is saved in the VReg's slot in the `vreg_to_live_inst_range` |
| 304 | +vector. Whenever a VReg's definition is encountered, its liverange |
| 305 | +beginning is saved, too. And the allocation it will be in |
| 306 | +throughout that range is also saved alongside. |
| 307 | + |
| 308 | +To determine the allocation the VReg will be in throughout the |
| 309 | +liverange, the first invariant is used: the first time a VReg |
| 310 | +is allocated, its current allocation in `vreg_allocs` doesn't |
| 311 | +change unless it's evicted or spilled across block boundaries. |
| 312 | +Using this info, if by the time the def of a VReg is allocated, |
| 313 | +that VReg has no dedicated spillslot, |
| 314 | +that implies that the VReg was never evicted or spilled, so whatever |
| 315 | +value its `vreg_allocs` entry says is the location it will be in |
| 316 | +throughout its liverange. Otherwise, if it has a spillslot |
| 317 | +allocated to it, that implies that the VReg was either evicted |
| 318 | +at some point or it was a livein of a predecessor or a block parameter. |
| 319 | +Either way, since all spillslots are dedicated to their respective VRegs, |
| 320 | +it is safe to record the spillslot as the allocation for the |
| 321 | +`vreg_to_live_inst_range` info. |
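The decision described above reduces to a small function. The sketch below is illustrative only: `Alloc` and `liverange_location` are hypothetical names, and spillslots are reduced to plain indices.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Alloc {
    Reg(usize),
    Stack(usize),
}

/// Chooses the location recorded for a VReg's liverange when its def is
/// reached. `current` is the VReg's `vreg_allocs` entry and `spillslot`
/// is its `vreg_spillslots` entry, if one was ever allocated.
fn liverange_location(current: Alloc, spillslot: Option<usize>) -> Alloc {
    match spillslot {
        // A spillslot exists: the VReg was evicted or crossed a block
        // boundary, and the dedicated slot is safe to record.
        Some(slot) => Alloc::Stack(slot),
        // No spillslot: the allocation never changed after the first
        // allocation, so the current entry holds for the whole range.
        None => current,
    }
}
```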