@@ -480,6 +480,10 @@ The following nomenclature is used in the descriptions of relocation operations:
480480 the second entry holds a platform-specific offset or pointer. The pair of
481481 pointer-sized entries will be relocated with ``R_MORELLO_TLSDESC(S+A) ``.
482482
483+ - ``GTGOTTLSDESC(S+A)`` represents a consecutive pair of pointer-sized entries
484+ as the indirect TLS version of ``GTLSDESC(S+A)``. The pair of pointer-sized
485+ entries will be relocated with ``R_MORELLO_TGOT_TLSDESC(S+A)``.
486+
483487- ``Delta(S) `` if ``S `` is a normal symbol, resolves to the difference between the
484488 static link address of ``S `` and the execution address of ``S ``. If ``S `` is the
485489 null symbol (ELF symbol index 0), resolves to the difference between the static
@@ -489,13 +493,22 @@ The following nomenclature is used in the descriptions of relocation operations:
489493 contains the offset in the static TLS block of the thread-local symbol ``S ``.
490494 The second value contains the size of the symbol ``S ``
491495
496+ - ``TGOTREL(S) `` resolves to the offset of the thread-local symbol ``S `` in the
497+ static TLS TGOT.
498+
492499- ``GTPREL(S) `` represents an entry in the GOT containing a pair of two 64-bit
493500 values. The first value contains the offset in the static TLS block of the
494501 symbol ``S ``. The second value contains the size of the symbol ``S ``.
495502
503+ - ``GTGOTREL(S) `` represents an entry in the GOT containing the offset of the
504+ thread-local symbol ``S `` in the static TLS TGOT.
505+
496506- ``TLSDESC(S+A) `` resolves to a contiguous pair of pointer-sized values, as
497507 created by GTLSDESC(S+A).
498508
509+ - ``TGOTTLSDESC(S+A) `` resolves to a contiguous pair of pointer-sized values,
510+ as created by GTGOTTLSDESC(S+A).
511+
499512- ``CAP_INIT `` generates a capability with all required information. When used on
500513 its own represents the operations needs to be done for handling ``R_MORELLO_CAPINIT ``.
501514
@@ -648,8 +661,8 @@ Interworking between ABIs are not supported yet.
648661 | | | | Also see `Static linking with Morello `_. |
649662 +-------+---------------------------------+------------------------+-----------------------------------------------------------+
650663
651- Relocations for thread-local storage
652- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
664+ Relocations for thread-local storage (direct)
665+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
653666
654667Morello only defines the relocations needed to implement the descriptor based
655668thread-local storage (TLS) models in a SysV-type environment. The details of
@@ -662,7 +675,7 @@ Relocations needed to define the traditional TLS models are undefined.
662675
663676.. class :: aaelf64-morello-tls-descriptor-relocations
664677
665- .. table :: TLS descriptor relocations
678+ .. table :: TLS descriptor relocations (direct)
666679
667680 +-------+-----------------------------------------+----------------------------+-----------------------------------------------------------+
668681 | ELF64 | Name | Operation | Comment |
@@ -688,6 +701,66 @@ Relocations needed to define the traditional TLS models are undefined.
688701 | | | | |
689702 +-------+-----------------------------------------+----------------------------+-----------------------------------------------------------+
690703
704+ Relocations for thread-local storage (indirect)
705+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
706+
707+ As with direct thread-local storage, Morello uses TLS descriptors. The details
708+ of indirect TLS are beyond the scope of this specification; they are part of
709+ [CHERI_ELF_].
710+
711+ .. class :: aaelf64-morello-indirect-tls-descriptor-relocations
712+ .. table :: TLS descriptor relocations (indirect)
713+
714+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
715+ | ELF64 | Name | Operation | Comment |
716+ | Code | | | |
717+ +=======+==========================================+================================+=========================================================================+
718+ | 57616 | ``R_MORELLO_TLSIE_ADR_GOTTGOT_PAGE20 `` | ``Page(G(GTGOTREL(S+A))) `` | Set the immediate value of an ADRP to bits [31:12] of X. |
719+ | | | ``- Page(P) `` | Check that -2\ :sup: `31` <= X < 2\ :sup: `31`. |
720+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
721+ | 57617 | ``R_MORELLO_TLSIE_LD64_GOTTGOT_LO12_NC `` | ``G(GTGOTREL(S+A)) `` | Set the LD/ST immediate field to bits [11:3] of X. |
722+ | | | | No overflow check. Check that X&7 = 0. |
723+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
724+ | 57618 | ``R_MORELLO_TLSLE_MOVW_TGOT_G1 `` | ``TGOTREL(S+A) `` | Set the MOV[NZ] immediate field to bits [31:16] of X (see notes below). |
725+ | | | | Check that -2\ :sup: `32` <= X < 2\ :sup: `32`. |
726+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
727+ | 57619 | ``R_MORELLO_TLSLE_MOVW_TGOT_G0 `` | ``TGOTREL(S+A) `` | Set the MOV[NZ] immediate field to bits [15:0] of X (see notes below). |
728+ | | | | Check that -2\ :sup: `16` <= X < 2\ :sup: `16`. |
729+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
730+ | 57620 | ``R_MORELLO_TLSLE_MOVW_TGOT_G0_NC `` | ``TGOTREL(S+A) `` | Set the MOVK immediate field to bits [15:0] of X. |
731+ | | | | No overflow check. |
732+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
733+ | 57621 | ``R_MORELLO_TLSLE_ADD_TGOT_HI12 `` | ``TGOTREL(S+A) `` | Set the ADD immediate field to bits [23:12] of X. |
734+ | | | | Check 0 <= X < 2\ :sup: `24`. |
735+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
736+ | 57622 | ``R_MORELLO_TLSLE_LD128_TGOT_LO12 `` | ``TGOTREL(S+A) `` | Set the LD/ST immediate field to bits [11:4] of X. |
737+ | | | | Check that 0 <= X < 2\ :sup: `12`. X&15 = 0. |
738+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
739+ | 57623 | ``R_MORELLO_TLSLE_LD128_TGOT_LO12_NC `` | ``TGOTREL(S+A) `` | Set the LD/ST immediate field to bits [11:4] of X. |
740+ | | | | No overflow check. Check that X&15 = 0. |
741+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
742+ | 57624 | ``R_MORELLO_TGOT_TLSDESC_ADR_PAGE20 `` | ``Page(G(GTGOTTLSDESC(S+A))) `` | Set the immediate value of an ADRP to bits [31:12] of X. |
743+ | | | ``- Page(P) `` | Check that -2\ :sup: `31` <= X < 2\ :sup: `31`. |
744+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
745+ | 57625 | ``R_MORELLO_TGOT_TLSDESC_LD128_LO12 `` | ``G(GTGOTTLSDESC(S+A)) `` | Set the LD/ST immediate field to bits [11:4] of X. |
746+ | | | | Check that 0 <= X < 2\ :sup: `12`. X&15 = 0. |
747+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
748+ | 57626 | ``R_MORELLO_TGOT_TLSDESC_ADD_LO12 `` | ``G(GTGOTTLSDESC(S+A)) `` | Set the ADD immediate field to bits [11:0] of X. |
749+ | | | | No overflow check. |
750+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
751+ | 57627 | ``R_MORELLO_TGOT_TLSDESC_CALL `` | None | For relaxation only. |
752+ | | | | Must be used to identify a ``BLR `` instruction which performs an |
753+ | | | | indirect call to the TLS descriptor function for ``S + A ``. |
754+ +-------+------------------------------------------+--------------------------------+-------------------------------------------------------------------------+
755+
756+ .. note ::
757+
758+ These checking forms relocate ``MOVN `` or ``MOVZ ``.
759+
760+ X >= 0: Set the instruction to ``MOVZ `` and its immediate field to the selected bits of X.
761+
762+ X < 0: Set the instruction to ``MOVN `` and its immediate field to NOT (selected bits of X).
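The MOVZ/MOVN selection in the note above can be sketched in Python. This is a minimal illustration, not linker code: the function name ``encode_mov_nz`` and its ``shift`` and ``bits`` parameters are hypothetical, chosen so that G0 maps to ``shift=0, bits=16`` and G1 maps to ``shift=16, bits=32``, matching the overflow checks in the table.

```python
def encode_mov_nz(x, shift, bits):
    """Pick MOVZ or MOVN for a checking MOVW relocation (illustrative only).

    x     -- relocation result X, as a signed Python integer
    shift -- low bit of the selected field: 0 for G0, 16 for G1
    bits  -- exponent of the overflow bound: 16 for G0, 32 for G1
    """
    if not -(1 << bits) <= x < (1 << bits):
        raise OverflowError("X is out of range for this relocation")
    if x >= 0:
        # MOVZ carries the selected bits of X directly.
        return "MOVZ", (x >> shift) & 0xFFFF
    # MOVN carries NOT(selected bits of X); the instruction inverts them back.
    return "MOVN", (~x >> shift) & 0xFFFF
```

For a negative X such as -65536 relocated with a G0 form, the sketch selects ``MOVN`` with an immediate of ``0xFFFF``, which the instruction inverts back to the original value.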
763+
691764Dynamic Morello relocations
692765^^^^^^^^^^^^^^^^^^^^^^^^^^^
693766
@@ -731,6 +804,16 @@ Dynamic Morello relocations
731804 | 59401 | ``R_AARCH64_FUNC_RELATIVE `` | ``Delta(S) + A `` | See note below. |
732805 | | | | |
733806 +-------+-----------------------------+-----------------------------------------+------------------------------------------+
807+ | 59402 | ``R_MORELLO_TLS_TGOT_SLOT `` | ``CAP_INIT(S, A, CAP_SIZE, CAP_PERM) `` | See note below |
808+ | | | | |
809+ +-------+-----------------------------+-----------------------------------------+------------------------------------------+
810+ | 59403 | ``R_MORELLO_TLS_TGOTREL64 `` | ``TGOTREL(S) `` | See note below |
811+ | | | | |
812+ +-------+-----------------------------+-----------------------------------------+------------------------------------------+
813+ | 59404 | ``R_MORELLO_TGOT_TLSDESC `` | ``TGOTTLSDESC(S+A) `` | Identifies an indirect TLS descriptor to |
814+ | | | | be filled. |
815+ +-------+-----------------------------+-----------------------------------------+------------------------------------------+
816+
734817
735818.. note ::
736819
@@ -792,6 +875,20 @@ Dynamic Morello relocations
792875 relocated is a function pointer as opposed to a data pointer or code pointer.
793876 This distinction is needed for library-based compartmentalization (c18n).
794877
878+ ``R_MORELLO_TLS_TGOT_SLOT`` is similar to ``R_MORELLO_CAPINIT`` but should
879+ not be applied to the TGOT template itself. Instead, it represents
880+ initialisation to be performed per TGOT. If ``S`` is the null symbol (ELF
881+ symbol index 0), it contains a fragment in the same format as
882+ ``R_MORELLO_RELATIVE ``.
883+
884+ ``R_MORELLO_TLS_TGOTREL64 `` instructs the dynamic loader to create a 64-bit
885+ integer containing the offset of ``S `` in the static TLS TGOT.
886+
887+ ``R_MORELLO_TGOT_TLSDESC `` refers to the TGOT entry which, like a GOT entry,
888+ is always local to the object. That is, like a relative relocation, ``S ``
889+ will be the null symbol (ELF symbol index 0) and the addend will be the
890+ offset within this object's TGOT.
891+
795892Static linking with Morello
796893^^^^^^^^^^^^^^^^^^^^^^^^^^^
797894
@@ -972,8 +1069,8 @@ veneer is used. The BX changes the execution state from A64 to C64:
9721069 add c16, c16, :lo12:sym
9731070 br c16
9741071
975- TLS for the pure capability ABI
976- -------------------------------
1072+ TLS for the pure capability ABI (direct)
1073+ ----------------------------------------
9771074
9781075The design is based on TLSDESC, with the purpose of minimizing the performance
9791076differences between A64 and C64, while providing strict bounds when resolving
@@ -1128,3 +1225,156 @@ Initial Exec.
11281225 Both the offset in the TLS block and the size of ``S `` are known by the
11291226 static linker and can be emitted as constants with no associated
11301227 ``R_MORELLO_TPREL128 `` relocation.
1228+
1229+ TLS for the pure capability ABI (indirect)
1230+ ------------------------------------------
1231+
1232+ The design is based on TLSDESC, using a TGOT (as described in [CHERI_ELF_]) for
1233+ indirection to minimise privilege and enable compartmentalisation.
1234+
1235+ TLS static block
1236+ ^^^^^^^^^^^^^^^^
1237+
1238+ The static block layout is the same as for direct TLS, except it stores the
1239+ TGOT rather than the TLS data. The location of the TLS data itself is
1240+ unspecified.
1241+
1242+ Thread pointer
1243+ ^^^^^^^^^^^^^^
1244+
1245+ The thread pointer has the same requirements as for direct TLS.
1246+
1247+ Resolver functions
1248+ ^^^^^^^^^^^^^^^^^^
1249+
1250+ A resolver function takes arguments in c0 (address of the TGOT TLS GOT slot),
1251+ and c1 (a copy of the thread pointer) and returns a pointer to the TLS global
1252+ in c0. The resolver has a custom calling convention that must preserve all
1253+ registers except c0.
1254+
1255+ Static TLS block resolver
1256+ ~~~~~~~~~~~~~~~~~~~~~~~~~
1257+
1258+ If the TGOT entry is in the static block, while resolving the
1259+ ``R_MORELLO_TGOT_TLSDESC `` relocation, the dynamic linker will place in the two
1260+ GOT slots associated with this variable:
1261+
1262+ - A capability to the static TLS block resolver function at offset 0.
1263+
1264+ - The offset of the TGOT entry in the static TLS block, stored at offset 16.
1265+
1266+ An implementation of the static block resolver could be the following:
1267+
1268+ .. code-block :: text
1269+
1270+ ldr x0, [c0, #16]
1271+ ldr c0, [c1, x0]
1272+ ret
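The data flow of the resolver above can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions: memory is a plain dictionary from address to value, capabilities are opaque Python values, and bounds and permissions are not modelled. The function name ``static_tgot_resolver`` and all addresses in the usage note are hypothetical.

```python
def static_tgot_resolver(memory, c0, c1):
    """Model the two loads of the static TLS block resolver.

    memory -- dict mapping an address to the value stored there (assumption)
    c0     -- address of the descriptor's pair of GOT slots
    c1     -- thread pointer
    """
    # ldr x0, [c0, #16]: offset of the TGOT entry, read from the second slot.
    x0 = memory[c0 + 16]
    # ldr c0, [c1, x0]: capability held in the TGOT entry for this variable.
    return memory[c1 + x0]
```

With a descriptor at ``0x1000`` whose second slot holds ``0x40`` and a thread pointer of ``0x8000``, the model returns whatever is stored at ``0x8040``.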
1273+
1274+ Local Exec
1275+ ^^^^^^^^^^
1276+
1277+ The capability to the TLS variable is loaded from ``CTPIDR_EL0 ``. There are no
1278+ requirements on how this is performed or the registers used, except that the
1279+ sequence doesn't produce a dynamic relocation. A possible instruction sequence
1280+ could be:
1281+
1282+ .. code-block :: text
1283+
1284+ mrs c0, CTPIDR_EL0
1285+ add c0, c0, :tgot:local_exec_var, lsl #12
1286+ ldr c0, [c0, :tgot_lo12_nc:local_exec_var]
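As a rough illustration, and assuming the ``add`` and ``ldr`` in the sequence above carry ``R_MORELLO_TLSLE_ADD_TGOT_HI12`` and ``R_MORELLO_TLSLE_LD128_TGOT_LO12_NC`` respectively, the split of the TGOT offset X across the two immediates can be sketched in Python. The function name ``split_tgot_offset`` is hypothetical.

```python
def split_tgot_offset(x):
    """Split a TGOT offset X into the ADD and LDR immediate fields."""
    if not 0 <= x < (1 << 24):
        raise OverflowError("X does not fit the HI12 relocation")
    if x & 15:
        raise ValueError("X must be 16-byte aligned for a capability load")
    add_imm = (x >> 12) & 0xFFF  # bits [23:12], applied with lsl #12
    ldr_imm = (x >> 4) & 0xFF    # bits [11:4], scaled by 16 by the load
    return add_imm, ldr_imm
```

The two fields recombine as ``(add_imm << 12) + (ldr_imm << 4)``, which equals X whenever both checks pass.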
1287+
1288+ Initial Exec
1289+ ^^^^^^^^^^^^
1290+
1291+ The capability to the TLS variable is loaded from ``CTPIDR_EL0 ``. The offset of
1292+ the TGOT entry is stored in a GOT slot. This GOT slot is initialized by a
1293+ ``R_MORELLO_TLS_TGOTREL64 `` dynamic relocation. The access must use the
1294+ ``R_MORELLO_TLSIE_ADR_GOTTGOT_PAGE20 `` and
1295+ ``R_MORELLO_TLSIE_LD64_GOTTGOT_LO12_NC `` relocations. There are no other
1296+ requirements on how this is performed or the registers used. A possible
1297+ instruction sequence could be:
1298+
1299+ .. code-block :: text
1300+
1301+ adrp c0, :gottgot:initial_exec_var
1302+ ldr x0, [c0, :gottgot_lo12:initial_exec_var]
1303+ mrs c1, CTPIDR_EL0
1304+ ldr c0, [c1, x0]
1305+
1306+ Initial Exec to Local Exec relaxation
1307+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1308+
1309+ ``R_MORELLO_TLSIE_ADR_GOTTGOT_PAGE20`` will be rewritten to a MOV[NZ] LSL #16
1310+ with an ``R_MORELLO_TLSLE_MOVW_TGOT_G1``.
1311+
1312+ ``R_MORELLO_TLSIE_LD64_GOTTGOT_LO12_NC `` will be rewritten to a MOVK with an
1313+ ``R_MORELLO_TLSLE_MOVW_TGOT_G0_NC ``.
1314+
1315+ When applied to the example Initial Exec sequence, this yields:
1316+
1317+ .. code-block :: text
1318+
1319+ movz x0, #:tgot_g1:initial_exec_var, lsl #16
1320+ movk x0, #:tgot_g0_nc:initial_exec_var
1321+ mrs c1, CTPIDR_EL0
1322+ ldr c0, [c1, x0]
1323+
1324+ General Dynamic
1325+ ^^^^^^^^^^^^^^^
1326+
1327+ The instruction sequence used for the General Dynamic access model is similar
1328+ to that of direct TLS. However, due to the shorter Initial Exec and Local Exec
1329+ instruction sequences, no additional NOP is present, and the thread pointer is
1330+ passed in c1 (not c2), which is therefore no longer a call-clobbered register.
1331+
1332+ The General Dynamic access sequence must be output in the following form to
1333+ allow correct linker relaxation:
1334+
1335+ .. code-block :: text
1336+
1337+ adrp c0, :tgot_tlsdesc:sym
1338+ ldr c2, [c0, :tgot_tlsdesc_lo12:sym]
1339+ add c0, c0, :tgot_tlsdesc_lo12:sym
1340+ .tgot_tlsdesccall sym
1341+ blr c2
1342+
1343+ General Dynamic to Initial Exec relaxation
1344+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1345+
1346+ The relaxed sequence is:
1347+
1348+ .. code-block :: text
1349+
1350+ adrp c0, :gottgot:sym
1351+ ldr x0, [c0, :gottgot_lo12:sym]
1352+ ldr c0, [c1, x0]
1353+ nop
1354+
1355+ General Dynamic to Local Exec relaxation
1356+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1357+
1358+ The relaxed sequence is:
1359+
1360+ .. code-block :: text
1361+
1362+ movz x0, #:tgot_g1:sym, lsl #16
1363+ movk x0, #:tgot_g0_nc:sym
1364+ ldr c0, [c1, x0]
1365+ nop
1366+
1367+ TLS for the pure capability ABI (mixed / compat)
1368+ ------------------------------------------------
1369+
1370+ In order to support transitioning from direct to indirect TLS, both can be used
1371+ at once. In this mixed model, direct TLS is entirely as defined above. However,
1372+ for indirect TLS, a "compat" model is used, where the location of the static
1373+ TGOT within the static TLS block is unspecified. This has the following
1374+ implications:
1375+
1376+ - Local Exec is forbidden (including for relaxation); the most optimised access
1377+ model available is Initial Exec
1378+
1379+ - ``R_MORELLO_TLS_TGOTREL64 `` is never statically resolvable to a constant at
1380+ link time