@@ -660,42 +660,122 @@ Non-Integral Pointer Type
660660Note: non-integral pointer types are a work in progress, and they should be
661661considered experimental at this time.
662662
663- LLVM IR optionally allows the frontend to denote pointers in certain address
664- spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
665- Non-integral pointer types represent pointers that have an *unspecified* bitwise
666- representation; that is, the integral representation may be target dependent or
667- unstable (not backed by a fixed integer).
663+ For most targets, the pointer representation is a direct mapping from the
664+ bitwise representation to the address of the underlying memory allocation.
665+ Such pointers are considered "integral", and any pointers where the
666+ representation is not just an integer address are called "non-integral".
667+
668+ In most cases, pointers with a non-integral representation behave exactly the
669+ same as integral pointers. The only difference is that it is not possible to
670+ create a pointer just from an address unless all the non-address bits are
671+ also recreated correctly in a target-specific way.
672+ Since the address width of a non-integral pointer is not equal to the width of
673+ its bitwise representation, extracting the address requires truncating to the
674+ index width of the pointer.
675+ An example of such a non-integral pointer representation is the AMDGPU buffer
676+ descriptor, which consists of a 128-bit fat pointer and a 32-bit offset.
677+
678+ Additionally, LLVM IR optionally allows the frontend to denote pointers in
679+ certain address spaces as "unstable" or having "external state"
680+ (or combinations of these) via the :ref:`datalayout string<langref_datalayout>`.
681+
682+ The exact implications of these properties are target-specific, but the
683+ following IR semantics and restrictions on optimization passes apply:
684+
685+ Unstable pointer representation
686+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
687+
688+ Pointers in this address space have an *unspecified* bitwise representation
689+ (i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
690+ allowed to change in a target-specific way. For example, this could be a pointer
691+ type used with copying garbage collection where the garbage collector could
692+ update the pointer at any time in the collection sweep.
668693
669694``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
670695integral (i.e., normal) pointers in that they convert integers to and from
671- corresponding pointer types, but there are additional implications to be
672- aware of. Because the bit-representation of a non-integral pointer may
673- not be stable, two identical casts of the same operand may or may not
696+ corresponding pointer types, but there are additional implications to be aware
697+ of.
698+
699+ For "unstable" pointer representations, the bit-representation of the pointer
700+ may not be stable, so two identical casts of the same operand may or may not
674701return the same value. Said differently, the conversion to or from the
675- non-integral type depends on environmental state in an implementation
702+ "unstable" pointer type depends on environmental state in an implementation
676703defined manner.
677-
678704If the frontend wishes to observe a *particular* value following a cast, the
679705generated IR must fence with the underlying environment in an implementation
680706defined manner. (In practice, this tends to require ``noinline`` routines for
681707such operations.)
682708
683709From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
684- non-integral types are analogous to ones on integral types with one
710+ "unstable" pointer types are analogous to ones on integral types with one
685711key exception: the optimizer may not, in general, insert new dynamic
686712occurrences of such casts. If a new cast is inserted, the optimizer would
687713need to either ensure that a) all possible values are valid, or b)
688714appropriate fencing is inserted. Since the appropriate fencing is
689715implementation defined, the optimizer can't do the latter. The former is
690716challenging as many commonly expected properties, such as
691- ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
717+ ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
692718Similar restrictions apply to intrinsics that might examine the pointer bits,
693719such as :ref:`llvm.ptrmask<int_ptrmask>`.
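+
+ As a minimal sketch of why such properties cannot be assumed (the function name
+ is hypothetical, and address space 1 is assumed to have been marked as
+ "unstable"), consider two identical casts of the same operand:
+
+ .. code-block:: llvm
+
+     ; Assumption: addrspace(1) is an "unstable" pointer address space.
+     define i64 @cast_twice(ptr addrspace(1) %v) {
+       %a = ptrtoint ptr addrspace(1) %v to i64
+       %b = ptrtoint ptr addrspace(1) %v to i64
+       ; %a and %b may or may not be equal, so %d is not guaranteed to be 0.
+       %d = sub i64 %a, %b
+       ret i64 %d
+     }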
694720
695- The alignment information provided by the frontend for a non-integral pointer
721+ The alignment information provided by the frontend for an "unstable" pointer
696722(typically using attributes or metadata) must be valid for every possible
697723representation of the pointer.
698724
725+ Non-integral pointers with external state
726+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
727+
728+ A further special case is non-integral pointers that include external
729+ state (such as bounds information or a type tag) with a target-defined size.
730+ An example of such a type is a CHERI capability, where there is an additional
731+ validity bit that is part of all pointer-typed registers, but is located in
732+ memory at an implementation-defined address separate from the pointer itself.
733+ Another example would be a fat-pointer scheme where pointers remain plain
734+ integers, but the associated bounds are stored in an out-of-band table.
735+
736+ Unless also marked as "unstable", the bitwise representation of pointers with
737+ external state is stable and ``ptrtoint(x)`` always yields a deterministic
738+ value. This means transformation passes are still permitted to insert new
739+ ``ptrtoint`` instructions.
740+
741+ The following restrictions apply to IR-level optimization passes:
742+
743+ The ``inttoptr`` instruction does not recreate the external state and therefore
744+ it is target dependent whether it can be used to create a dereferenceable
745+ pointer. In general, passes should assume that the result of such an
746+ ``inttoptr`` is not dereferenceable. For example, on CHERI targets an
747+ ``inttoptr`` will yield a capability with the external state (the validity tag
748+ bit) set to zero, which will cause any dereference to trap.
749+ The ``ptrtoint`` instruction also only returns the "in-band" state and omits
750+ all external state.
751+ These two properties mean that ``inttoptr(ptrtoint(x))`` cannot be folded to
752+ ``x`` since the ``ptrtoint`` operation does not include the external state
753+ needed to reconstruct the original pointer and ``inttoptr`` cannot set it.
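+
+ For illustration, a sketch of the round trip described above (hypothetical
+ function name; address space 2 is assumed to have external state and a 64-bit
+ in-band representation):
+
+ .. code-block:: llvm
+
+     define ptr addrspace(2) @roundtrip(ptr addrspace(2) %x) {
+       ; Only the in-band bits of %x are returned.
+       %i = ptrtoint ptr addrspace(2) %x to i64
+       ; The external state is not recreated, so %y must not be replaced by %x.
+       %y = inttoptr i64 %i to ptr addrspace(2)
+       ret ptr addrspace(2) %y
+     }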
754+
755+ When a ``store ptr addrspace(N) %p, ptr @dst`` of such a non-integral pointer
756+ is performed, the external metadata is also stored to an implementation-defined
757+ location. Similarly, a ``%val = load ptr addrspace(N), ptr @dst`` will fetch the
758+ external metadata and make it available for all uses of ``%val``.
759+ Likewise, the ``llvm.memcpy`` and ``llvm.memmove`` intrinsics transfer the
760+ external state. This is essential to allow frontends to efficiently emit copies
761+ of structures containing such pointers, since expanding all these copies as
762+ individual loads and stores would affect compilation speed and inhibit
763+ optimizations.
764+
765+ Notionally, these external bits are part of the pointer, but since
766+ ``inttoptr`` / ``ptrtoint`` only operate on the "in-band" bits of the pointer
767+ and the external bits are not explicitly exposed, they are not included in the
768+ size specified in the :ref:`datalayout string<langref_datalayout>`.
769+
770+ When a pointer type has external state, all roundtrips via memory must
771+ be performed as loads and stores of the correct type since stores of other
772+ types may not propagate the external data.
773+ Therefore it is not legal to convert an existing load/store (or a
774+ ``llvm.memcpy`` / ``llvm.memmove`` intrinsic) of pointer types with external
775+ state to a load/store of an integer type with the same bit width, as that may
776+ drop the external state.
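+
+ A sketch of this restriction, under the assumption that address space 2 has
+ external state and a 64-bit in-band representation (both function names are
+ hypothetical):
+
+ .. code-block:: llvm
+
+     ; Pointer-typed load and store: the external state travels with %p.
+     define void @copy_ptr(ptr %dst, ptr %src) {
+       %p = load ptr addrspace(2), ptr %src
+       store ptr addrspace(2) %p, ptr %dst
+       ret void
+     }
+
+     ; NOT a valid rewrite of @copy_ptr: the integer-typed access moves the
+     ; same in-band bits but may drop the external state.
+     define void @copy_int(ptr %dst, ptr %src) {
+       %i = load i64, ptr %src
+       store i64 %i, ptr %dst
+       ret void
+     }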
777+
778+
699779.. _globalvars:
700780
701781Global Variables
@@ -3179,8 +3259,8 @@ as follows:
31793259``A<address space>``
31803260 Specifies the address space of objects created by '``alloca``'.
31813261 Defaults to the default address space of 0.
3182- ``p[n ]:<size>:<abi>[:<pref>[:<idx>]]``
3183- This specifies the properties of a pointer in address space ``n ``.
3262+ ``p[<flags>][<as>]:<size>:<abi>[:<pref>[:<idx>]]``
3263+ This specifies the properties of a pointer in address space ``as``.
31843264 The ``<size>`` parameter specifies the size of the bitwise representation.
31853265 For :ref:`non-integral pointers <nointptrtype>` the representation size may
31863266 be larger than the address width of the underlying address space (e.g. to
@@ -3193,9 +3273,14 @@ as follows:
31933273 default index size is equal to the pointer size.
31943274 The index size also specifies the width of addresses in this address space.
31953275 All sizes are in bits.
3196- The address space, ``n``, is optional, and if not specified,
3197- denotes the default address space 0. The value of ``n`` must be
3198- in the range [1,2^24).
3276+ The address space, ``<as>``, is optional, and if not specified, denotes the
3277+ default address space 0. The value of ``<as>`` must be in the range [1,2^24).
3278+ The optional ``<flags>`` are used to specify properties of pointers in this
3279+ address space: the character ``u`` marks pointers as having an unstable
3280+ representation, ``n`` marks pointers as non-integral (i.e. having
3281+ additional metadata), ``e`` marks pointers having external state
3282+ (``n`` must also be set). See :ref:`Non-Integral Pointer Types <nointptrtype>`.
3283+
31993284``i<size>:<abi>[:<pref>]``
32003285 This specifies the alignment for an integer type of a given bit
32013286 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
@@ -3248,9 +3333,11 @@ as follows:
32483333 this set are considered to support most general arithmetic operations
32493334 efficiently.
32503335``ni:<address space0>:<address space1>:<address space2>...``
3251- This specifies pointer types with the specified address spaces
3252- as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
3253- address space cannot be specified as non-integral.
3336+ This marks pointer types with the specified address spaces
3337+ as :ref:`non-integral and unstable <nointptrtype>`.
3338+ The ``0`` address space cannot be specified as non-integral.
3339+ This specifier is only supported for backwards compatibility; the flags of the
3340+ ``p`` specifier should be used instead for new code.
32543341
32553342``<abi>`` is a lower bound on what is required for a type to be considered
32563343aligned. This is used in various places, such as:
@@ -31402,4 +31489,3 @@ Semantics:
3140231489
3140331490The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
3140431491as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
31405-