
Commit 96bdae0

Range extension thunks
We are considering three use cases here.

1. A true large code model needs to support more than 2 GiB of text; data accesses are out of scope for this change but jumps and calls across a range of more than 2 GiB are needed. Most users of a large model will have more than 2 GiB of data but small text, or text with a highly local call pattern, so we want most calls to be able to use the auipc+jalr sequence. This would normally call for relaxation, but relaxation requires object files to contain the longest possible sequence, of which several are possible. Instead, keep the sequences the same and allow thunk insertion.

2. For executables and shared objects in a Unix environment, most of the code size benefits of relaxation come from call->jal relaxation, not data or TLS relaxation. If the compiler is modified to generate jal instructions instead of call instructions, the code size benefits can be achieved without relaxation at all, but this requires JAL_THUNK to avoid relocation errors at a 1 MiB limit.

3. If a function has many static call sites in a large binary but is known to be dynamically cold, due to a function attribute or PGO, the call sites can be replaced with jal instructions, sharing a single thunk between all call sites within a 2 MiB text region. This saves code size at small runtime cost.

Restricting the register usage of the thunks is an intentional feature copied from the Go 1.15 toolchain, where every non-leaf function requires a conditional call to runtime.morestack in the prologue; since ra cannot be saved before the stack frame is allocated, the call is performed using t0 as the return register.
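
To make the ranges in these use cases concrete, here is a minimal Python sketch (not taken from any linker) of how a linker might decide whether a JAL_THUNK or CALL_THUNK site reaches its target directly, and which of the suggested thunk forms could cover a given thunk-to-target displacement. The function names are hypothetical, and the auipc+jalr bounds assume RV64, where the high part is a sign-extended 20-bit immediate shifted left by 12 and the low part is a signed 12-bit immediate.

```python
# Illustrative only: names are hypothetical, not from the psABI or any linker.

# jal: even signed 21-bit offset, -1 MiB to +1 MiB - 2.
JAL_MIN = -(1 << 20)
JAL_MAX = (1 << 20) - 2

# auipc+jalr on RV64 (assumption): hi20 << 12 plus a signed 12-bit low part.
CALL_MIN = -(1 << 31) - (1 << 11)
CALL_MAX = (1 << 31) - (1 << 11) - 1


def reaches_directly(reloc: str, displacement: int) -> bool:
    """Return True if the call site can reach the target without a thunk."""
    if reloc == "JAL_THUNK":
        return displacement % 2 == 0 and JAL_MIN <= displacement <= JAL_MAX
    if reloc == "CALL_THUNK":
        return CALL_MIN <= displacement <= CALL_MAX
    raise ValueError(f"unsupported relocation: {reloc}")


def suggested_thunk_form(displacement: int) -> str:
    """Pick the shortest suggested thunk form for a thunk-to-target displacement."""
    if displacement % 2 == 0 and JAL_MIN <= displacement <= JAL_MAX:
        return "20-bit"   # jal zero, <offset>
    if CALL_MIN <= displacement <= CALL_MAX:
        return "32-bit"   # auipc t2 + jalr zero, t2
    return "64-bit"       # load the target (or its offset) from a literal
```

A call site whose own relocation does not reach the target would be pointed at a thunk that is within reach, and the thunk's form would then be chosen from its own displacement to the target.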
1 parent 5ffe5b5 commit 96bdae0

riscv-elf.adoc

Lines changed: 82 additions & 27 deletions
@@ -503,7 +503,11 @@ Description:: Additional information about the relocation
 <| S - P
 .2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only
 <|
-.2+| 66-191 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use
+.2+| 66 .2+| JAL_THUNK .2+| Static | _J-Type_ .2+| 20-bit PC-relative jump, allowed to use a range extension thunk
+<| S + A - P
+.2+| 67 .2+| CALL_THUNK .2+| Static | _U+I-Type_ .2+| 32-bit PC-relative function call, allowed to use a range extension thunk
+<| S + A - P
+.2+| 68-191 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use
 <|
 .2+| 192-255 .2+| *Reserved* .2+| - | .2+| Reserved for nonstandard ABI extensions
 <|
@@ -688,16 +692,17 @@ and fills in the GOT entry for subsequent calls to the function:
 
 ==== Procedure Calls
 
-`R_RISCV_CALL` and `R_RISCV_CALL_PLT` relocations are associated with
-pairs of instructions (`AUIPC+JALR`) generated by the `CALL` or `TAIL`
-pseudoinstructions. Originally, these relocations had slightly different
-behavior, but that has turned out to be unnecessary, and they are now
-interchangeable, `R_RISCV_CALL` is deprecated, suggest using `R_RISCV_CALL_PLT`
-instead.
+`R_RISCV_CALL`, `R_RISCV_CALL_PLT`, and `R_RISCV_CALL_THUNK` relocations are
+associated with pairs of instructions (`AUIPC+JALR`) generated by the `CALL` or
+`TAIL` pseudoinstructions. Originally, these relocations had slightly
+different behavior, but that has turned out to be unnecessary, and they are now
+interchangeable, `R_RISCV_CALL` is deprecated, suggest using
+`R_RISCV_CALL_PLT` instead.
 
-With linker relaxation enabled, the `AUIPC` instruction in the `AUIPC+JALR` pair has
-both a `R_RISCV_CALL` or `R_RISCV_CALL_PLT` relocation and an `R_RISCV_RELAX`
-relocation indicating the instruction sequence can be relaxed during linking.
+With linker relaxation enabled, the `AUIPC` instruction in the `AUIPC+JALR`
+pair has both a `R_RISCV_CALL`, `R_RISCV_CALL_PLT`, or `R_RISCV_CALL_THUNK`
+relocation and an `R_RISCV_RELAX` relocation indicating the instruction
+sequence can be relaxed during linking.
 
 Procedure call linker relaxation allows the `AUIPC+JALR` pair to be relaxed
 to the `JAL` instruction when the procedure or PLT entry is within (-1MiB to
@@ -735,6 +740,55 @@ that can represent an even signed 21-bit offset (-1MiB to +1MiB-2).
 Branch (SB-Type) instructions have a `R_RISCV_BRANCH` relocation that
 can represent an even signed 13-bit offset (-4096 to +4094).
 
+==== Range Extension Thunks
+
+`R_RISCV_JAL_THUNK` and `R_RISCV_CALL_THUNK` relocations may be resolved by the
+linker to point to a range extension thunk instead of the target symbol. Range
+extension thunks will eventually transfer control to the target symbol, and
+preserve the contents of memory and all registers except for `t1` and `t2`.
+
+[NOTE]
+.Suggested forms of range extension thunks
+====
+20-bit range:
+
+[,asm]
+----
+jal zero, <offset to target>
+----
+
+32-bit range:
+
+[,asm]
+----
+auipc t2, <high offset to target>
+jalr zero, t2, <low offset to target>
+----
+
+64-bit range, position dependent:
+
+[,asm]
+----
+auipc t2, <high offset to literal>
+ld t2, <low offset to literal>(t2)
+jalr zero, t2, 0 OR c.jr t2
+...
+.quad 0
+----
+
+64-bit range, position independent:
+
+[,asm]
+----
+auipc t1, <high offset to literal>
+ld t2, <low offset to literal>(t1)
+add t2, t2, t1 OR c.add t2, t1
+jalr zero, t2, 0 OR c.jr t2
+...
+.quad <offset to target from auipc result>
+----
+====
+
 ==== PC-Relative Symbol Addresses
 
 32-bit PC-relative relocations for symbol addresses on sequences of
@@ -1454,17 +1508,17 @@ which made the load instruction reference to an unspecified address.
 
 ==== Function Call Relaxation
 
-Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT.
+Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_CALL_THUNK.
 
 Description:: This relaxation type can relax `AUIPC+JALR` into `JAL`.
 
 Condition:: The offset between the location of relocation and target symbol or
 the PLT stub of the target symbol is within +-1MiB.
 
 Relaxation::
-- Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
-can be rewritten to a single JAL instruction with the offset between the
-location of relocation and target symbol.
+- Instruction sequence associated with `R_RISCV_CALL`, `R_RISCV_CALL_PLT`,
+or `R_RISCV_CALL_THUNK` can be rewritten to a single JAL instruction with
+the offset between the location of relocation and target symbol.
 
 Example::
 +
@@ -1490,7 +1544,7 @@ symbol.
 [[compress-func-call-relax]]
 ==== Compressed Function Call Relaxation
 
-Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT.
+Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_CALL_THUNK.
 
 Description:: This relaxation type can relax `AUIPC+JALR` into `C.JAL`
 instruction sequence.
@@ -1500,9 +1554,9 @@ symbol.
 instruction in the instruction sequence is `X1`/`RA` and if it is RV32.
 
 Relaxation::
-- Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
-can be rewritten to a single `C.JAL` instruction with the offset between the
-location of relocation and target symbol.
+- Instruction sequence associated with `R_RISCV_CALL`, `R_RISCV_CALL_PLT`,
+or `R_RISCV_CALL_THUNK` can be rewritten to a single `C.JAL` instruction with
+the offset between the location of relocation and target symbol.
 
 Example::
 +
@@ -1524,7 +1578,7 @@ Relaxation result:
 [[compress-tailcall-relax]]
 ==== Compressed Tail Call Relaxation
 
-Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT.
+Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_CALL_THUNK
 
 Description:: This relaxation type can relax `AUIPC+JALR` into `C.J`
 instruction sequence.
@@ -1534,9 +1588,9 @@ Relaxation result:
 instruction in the instruction sequence is `X0`.
 
 Relaxation::
-- Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
-can be rewritten to a single `C.J` instruction with the offset between the
-location of relocation and target symbol.
+- Instruction sequence associated with `R_RISCV_CALL`, `R_RISCV_CALL_PLT`,
+or `R_RISCV_CALL_THUNK` can be rewritten to a single `C.J` instruction with
+the offset between the location of relocation and target symbol.
 
 Example::
 +
@@ -1912,7 +1966,8 @@ Relaxation result (short form):
 
 ==== Table Jump Relaxation
 
-Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_JAL.
+Target Relocation::: R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_CALL_THUNK,
+R_RISCV_JAL, R_RISCV_JAL_THUNK.
 
 Description:: This relaxation type can relax a function call or jump
 instruction into a single table jump instruction with the index of the target
@@ -1933,10 +1988,10 @@ Relaxation result (short form):
 is `X0` or `RA`.
 
 Relaxation::
-- Instruction sequence associated with `R_RISCV_CALL` or `R_RISCV_CALL_PLT`
-can be rewritten to a table jump instruction.
-- Instruction associated with `R_RISCV_JAL` can be rewritten to a table
-jump instruction.
+- Instruction sequence associated with `R_RISCV_CALL`, `R_RISCV_CALL_PLT`,
+or `R_RISCV_CALL_THUNK` can be rewritten to a table jump instruction.
+- Instruction associated with `R_RISCV_JAL` or `R_RISCV_JAL_THUNK` can be
+rewritten to a table jump instruction.
 Example::
 +
 --
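
The suggested thunk forms in the Range Extension Thunks hunk leave the `<high offset ...>` and `<low offset ...>` operands, and the position-independent literal, as placeholders. The Python sketch below shows one way those values could be computed, assuming the usual RISC-V convention that the high part is rounded so that a sign-extended 12-bit low part completes the offset. The function names and the example addresses are hypothetical and do not come from the commit.

```python
# Illustrative only: helper names and addresses are made up for this sketch.

def split_hi_lo(offset: int) -> tuple:
    """Split a PC-relative offset into the auipc hi20 and signed lo12 parts."""
    hi20 = (offset + 0x800) >> 12      # round so that lo12 fits in [-2048, 2047]
    lo12 = offset - (hi20 << 12)
    return hi20, lo12


def pic_thunk_literal(thunk_pc: int, literal_addr: int, target: int) -> int:
    """Value for `.quad <offset to target from auipc result>`.

    In the position-independent form, t1 holds the auipc result, the literal
    is loaded into t2, and `add t2, t2, t1` must produce the target address.
    """
    hi20, _lo12 = split_hi_lo(literal_addr - thunk_pc)
    auipc_result = thunk_pc + (hi20 << 12)
    return target - auipc_result


# Example: a thunk at 0x10000 with its literal 16 bytes later, target at 8 GiB.
hi20, lo12 = split_hi_lo(0x10010 - 0x10000)
literal = pic_thunk_literal(0x10000, 0x10010, 0x2_0000_0000)
```

Storing an offset from the `auipc` result, and not an absolute address, is what lets the thunk run at any load address without a dynamic relocation on the literal.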
