Commit 3c3d440
Introducing local register allocation for the tier-1 JIT compiler (#341)
Local register allocation (RA) effectively reuses the host register
value within a basic block scope, thereby reducing the number of load
and store instructions.
Take consecutive addi instructions as an example:
addi t0, t0, 1
addi t0, t0, 1
addi t0, t0, 1
The generated machine code without register allocation:
load t0, t0_addr
add t0, 1
sw t0, t0_addr
load t0, t0_addr
add t0, 1
sw t0, t0_addr
load t0, t0_addr
add t0, 1
sw t0, t0_addr
The generated machine code with register allocation:
load t0, t0_addr
add t0, 1
add t0, 1
add t0, 1
sw t0, t0_addr
As shown in the above example, register allocation reuses the host
register and reduces the number of load and store instructions.
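
The example above can be reproduced with a small model of the code
generator. This is a minimal sketch, not the T1C implementation: the
function names `emit_without_ra` and `emit_with_ra` are hypothetical, a
block is modeled as a list of `(rd, rs, imm)` tuples standing for
`addi rd, rs, imm`, host registers are named after the guest registers
they hold, and the number of host registers is assumed to be unlimited.

```python
def emit_without_ra(block):
    """Emit pseudo machine code where every instruction loads and stores."""
    code = []
    for rd, rs, imm in block:  # each tuple models `addi rd, rs, imm`
        code.append(f"load {rs}, {rs}_addr")
        if rd != rs:
            code.append(f"mov {rd}, {rs}")
        code.append(f"add {rd}, {imm}")
        code.append(f"sw {rd}, {rd}_addr")
    return code


def emit_with_ra(block):
    """Emit pseudo machine code that keeps values in registers per block."""
    code = []
    cached = set()  # guest registers currently held in a host register
    dirty = []      # modified registers, in order of first write
    for rd, rs, imm in block:
        if rs not in cached:
            # First use inside the block: the value must come from memory.
            code.append(f"load {rs}, {rs}_addr")
            cached.add(rs)
        if rd != rs:
            code.append(f"mov {rd}, {rs}")
            cached.add(rd)
        code.append(f"add {rd}, {imm}")
        if rd not in dirty:
            dirty.append(rd)
    # End of the basic block: write every modified register back once.
    for reg in dirty:
        code.append(f"sw {reg}, {reg}_addr")
    return code


if __name__ == "__main__":
    block = [("t0", "t0", 1)] * 3
    print("\n".join(emit_without_ra(block)))
    print("---")
    print("\n".join(emit_with_ra(block)))
```

For the three addi instructions, the first function emits nine pseudo
instructions and the second emits five, matching the two listings above.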
* x86-64 (i7-11700)
| Benchmark | w/o RA | w/ RA | SpeedUp |
|-----------+----------+----------+---------|
| dhrystone | 0.342 s | 0.328 s | +4.27% |
| miniz | 1.243 s | 1.185 s | +4.89% |
| primes | 1.716 s | 1.689 s | +1.60% |
| sha512 | 2.063 s | 1.880 s | +9.73% |
| stream | 11.619 s | 11.419 s | +1.75% |
* Aarch64 (eMag)
| Benchmark | w/o RA | w/ RA | SpeedUp |
|-----------+----------+----------+---------|
| dhrystone | 1.935 s | 1.301 s | +48.73% |
| miniz | 7.706 s | 4.362 s | +76.66% |
| primes | 10.513 s | 9.633 s | +9.14% |
| sha512 | 6.508 s | 6.119 s | +6.36% |
| stream | 45.174 s | 38.037 s | +18.76% |
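
The SpeedUp figures in both tables are consistent with the ratio of the
two run times minus one, expressed as a percentage. The sketch below
shows that arithmetic; the helper name `speedup_percent` is
hypothetical, and the formula is inferred from the numbers rather than
stated in the commit.

```python
def speedup_percent(t_without_ra, t_with_ra):
    """Relative speedup of the RA build over the non-RA build, in percent."""
    return (t_without_ra / t_with_ra - 1.0) * 100.0


if __name__ == "__main__":
    # dhrystone on x86-64 and miniz on Aarch64, times taken from the tables.
    print(f"{speedup_percent(0.342, 0.328):+.2f}%")
    print(f"{speedup_percent(7.706, 4.362):+.2f}%")
```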
As demonstrated in the performance analysis, register allocation
improves the overall performance of the machine code generated by T1C.
Without RA, the generated machine code needs to store the register
value back at the end of each instruction. With RA, the register value
only needs to be stored back at the end of a basic block or when the
host registers are fully occupied. The performance enhancement is
particularly pronounced on Aarch64 because it has more registers
available, which allows more VM registers to be mapped to host
registers.
1 parent 1c9da01 commit 3c3d440
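
The effect of the host register count on write-backs can be sketched
with a small counting model. This is an illustration under stated
assumptions, not the T1C allocator: the name `count_memory_ops` is
hypothetical, a block is a list of `(rd, rs, imm)` tuples standing for
`addi rd, rs, imm`, at least two host registers are assumed, and the
least recently used register is evicted when all host registers are
occupied (the commit does not state which eviction policy is used).

```python
from collections import OrderedDict


def count_memory_ops(block, num_host_regs):
    """Return (loads, stores) needed to run one basic block."""
    held = OrderedDict()  # guest register -> dirty flag, oldest first
    loads = 0
    stores = 0

    def ensure(reg, needs_load):
        nonlocal loads, stores
        if reg in held:
            held.move_to_end(reg)  # mark as most recently used
            return
        if len(held) == num_host_regs:
            # Host registers fully occupied: evict the oldest mapping
            # and write it back only if it was modified.
            _, dirty = held.popitem(last=False)
            if dirty:
                stores += 1
        held[reg] = False
        if needs_load:
            loads += 1

    for rd, rs, _imm in block:
        ensure(rs, needs_load=True)   # source value must be in a register
        ensure(rd, needs_load=False)  # destination is overwritten
        held[rd] = True               # destination now differs from memory

    # End of the basic block: store back every modified register.
    stores += sum(held.values())
    return loads, stores


if __name__ == "__main__":
    block = [("t0", "t0", 1), ("t1", "t1", 1),
             ("t2", "t2", 1), ("t0", "t0", 1)]
    for n in (2, 3):
        print(n, count_memory_ops(block, n))
```

In this model, a block that touches three guest registers needs fewer
loads and stores with three host registers than with two, which matches
the direction of the explanation above.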
4 files changed, +670 -479 lines changed
- src
- tools