Examples:
"""""""""

.. code-block:: llvm

      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)

.. _int_loop_dependence_war_mask:

'``llvm.loop.dependence.war.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i1> @llvm.loop.dependence.war.mask.v4i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <8 x i1> @llvm.loop.dependence.war.mask.v8i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <16 x i1> @llvm.loop.dependence.war.mask.v16i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <vscale x 16 x i1> @llvm.loop.dependence.war.mask.nxv16i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)


Overview:
"""""""""

Given a vector load from %ptrA followed by a vector store to %ptrB, this
intrinsic generates a mask where an active lane indicates that the
write-after-read sequence can be performed safely for that lane, without the
danger of a write-after-read hazard occurring.

A write-after-read hazard occurs when a write-after-read sequence for a given
lane in a vector ends up being executed as a read-after-write sequence due to
the aliasing of pointers.

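For illustration only (not part of the definition), the effect of such a hazard can be reproduced with a small Python model that treats memory as a list of elements and compares a scalar read-then-write loop against a naive whole-vector rewrite; all names here are hypothetical:

```python
def scalar_loop(mem, src, dst, n):
    # Scalar order: each iteration reads its source element and then
    # writes its destination element (a write-after-read per lane).
    for i in range(n):
        mem[dst + i] = mem[src + i]

def naive_vector_loop(mem, src, dst, n):
    # Vectorized order: every element is read before any is written.
    loaded = [mem[src + i] for i in range(n)]
    for i in range(n):
        mem[dst + i] = loaded[i]

a = list(range(8))
b = list(range(8))
scalar_loop(a, 0, 2, 4)        # dst overlaps src, two elements ahead
naive_vector_loop(b, 0, 2, 4)
print(a)  # [0, 1, 0, 1, 0, 1, 6, 7]
print(b)  # [0, 1, 0, 1, 2, 3, 6, 7] -- later lanes read their inputs
          # before earlier lanes' stores landed, unlike the scalar loop
```

When the accesses do not overlap in this direction, the two versions agree, which is what an all-active mask asserts.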
Arguments:
""""""""""

The first two arguments are pointers and the last argument is an immediate.
The result is a vector with the i1 element type.

Semantics:
""""""""""

``%elementSize`` is the size of the accessed elements in bytes.
The intrinsic returns ``poison`` if the distance between ``%ptrA`` and ``%ptrB``
is smaller than ``VF * %elementSize`` and either ``%ptrA + VF * %elementSize``
or ``%ptrB + VF * %elementSize`` wraps.
An element of the result mask is active when loading from ``%ptrA`` then storing
to ``%ptrB`` is safe and doesn't result in a write-after-read hazard, meaning
that:

* (ptrB - ptrA) <= 0 (guarantees that all lanes are loaded before any stores), or
* (ptrB - ptrA) >= elementSize * lane (guarantees that this lane is loaded
  before the store to the same address)

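As a reading aid, the two bullet conditions can be transcribed into a short Python sketch (purely illustrative, not normative; ``war_mask``, its integer-address arguments, and ``vf`` are hypothetical names, and the real intrinsic operates on pointers, not integers):

```python
def war_mask(ptr_a: int, ptr_b: int, element_size: int, vf: int) -> list:
    # Model of the documented lane conditions for
    # llvm.loop.dependence.war.mask (illustrative, not normative).
    diff = ptr_b - ptr_a
    return [diff <= 0 or diff >= element_size * lane for lane in range(vf)]

# 4-byte elements, VF = 4, ptrB 8 bytes past ptrA: lane 3 is deactivated.
print(war_mask(0, 8, 4, 4))    # [True, True, True, False]
# ptrB at or before ptrA: all lanes active.
print(war_mask(64, 0, 4, 4))   # [True, True, True, True]
```
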
Examples:
"""""""""

.. code-block:: llvm

      %loop.dependence.mask = call <4 x i1> @llvm.loop.dependence.war.mask.v4i1(ptr %ptrA, ptr %ptrB, i64 4)
      %vecA = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(ptr %ptrA, i32 4, <4 x i1> %loop.dependence.mask, <4 x i32> poison)
      [...]
      call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> %vecA, ptr %ptrB, i32 4, <4 x i1> %loop.dependence.mask)

.. _int_loop_dependence_raw_mask:

'``llvm.loop.dependence.raw.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i1> @llvm.loop.dependence.raw.mask.v4i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <8 x i1> @llvm.loop.dependence.raw.mask.v8i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <16 x i1> @llvm.loop.dependence.raw.mask.v16i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)
      declare <vscale x 16 x i1> @llvm.loop.dependence.raw.mask.nxv16i1(ptr %ptrA, ptr %ptrB, i64 immarg %elementSize)


Overview:
"""""""""

Given a vector store to %ptrA followed by a vector load from %ptrB, this
intrinsic generates a mask where an active lane indicates that the
read-after-write sequence can be performed safely for that lane, without a
read-after-write hazard or a store-to-load forwarding hazard being introduced.

A read-after-write hazard occurs when a read-after-write sequence for a given
lane in a vector ends up being executed as a write-after-read sequence due to
the aliasing of pointers.

A store-to-load forwarding hazard occurs when a vector store writes to an
address that partially overlaps with the address of a subsequent vector load,
meaning that the vector load can't be performed until the vector store is
complete.

Arguments:
""""""""""

The first two arguments are pointers and the last argument is an immediate.
The result is a vector with the i1 element type.

Semantics:
""""""""""

``%elementSize`` is the size of the accessed elements in bytes.
The intrinsic returns ``poison`` if the distance between ``%ptrA`` and ``%ptrB``
is smaller than ``VF * %elementSize`` and either ``%ptrA + VF * %elementSize``
or ``%ptrB + VF * %elementSize`` wraps.
An element of the result mask is active when storing to ``%ptrA`` then loading
from ``%ptrB`` is safe and doesn't result in aliasing, meaning that:

* abs(ptrB - ptrA) >= elementSize * lane (guarantees that the store of this lane
  occurs before loading from this address), or
* ptrA == ptrB (doesn't introduce any new hazards that weren't in the scalar
  code)

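These two bullet conditions can also be transcribed into a short Python sketch (purely illustrative, not normative; ``raw_mask``, its integer-address arguments, and ``vf`` are hypothetical names, and the real intrinsic operates on pointers, not integers):

```python
def raw_mask(ptr_a: int, ptr_b: int, element_size: int, vf: int) -> list:
    # Model of the documented lane conditions for
    # llvm.loop.dependence.raw.mask (illustrative, not normative).
    diff = ptr_b - ptr_a
    return [ptr_a == ptr_b or abs(diff) >= element_size * lane
            for lane in range(vf)]

# 4-byte elements, VF = 4, ptrB 6 bytes past ptrA: the partial overlap
# deactivates lanes 2 and 3.
print(raw_mask(0, 6, 4, 4))    # [True, True, False, False]
# Identical pointers: every lane stays active.
print(raw_mask(16, 16, 4, 4))  # [True, True, True, True]
```

Note that, unlike the war variant, the distance is taken as an absolute value here, since a store followed by an overlapping load is penalized in either direction by store-to-load forwarding stalls.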
Examples:
"""""""""

.. code-block:: llvm

      %loop.dependence.mask = call <4 x i1> @llvm.loop.dependence.raw.mask.v4i1(ptr %ptrA, ptr %ptrB, i64 4)
      call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> %vecA, ptr %ptrA, i32 4, <4 x i1> %loop.dependence.mask)
      [...]
      %vecB = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(ptr %ptrB, i32 4, <4 x i1> %loop.dependence.mask, <4 x i32> poison)

.. _int_experimental_vp_splice:

'``llvm.experimental.vp.splice``' Intrinsic