Commit 76128ce

Merge from suggestion llvm/docs/NVPTXUsage.rst
Co-authored-by: gonzalobg <[email protected]>
1 parent 40bf9da commit 76128ce

1 file changed: +17 -18 lines changed

llvm/docs/NVPTXUsage.rst

Lines changed: 17 additions & 18 deletions
@@ -685,25 +685,24 @@ Syntax:
 Overview:
 """""""""
 
-The '``@llvm.nvvm.discard.*``' semantically behaves like a weak write of an *unstable indeterminate value*:
-reads of memory locations with *unstable indeterminate values* may return different
-bit patterns each time until the memory is overwritten.
-This operation *hints* to the implementation that data in the specified cache ``.level``
-can be destructively discarded without writing it back to memory. The operand ``size`` is an
-integer constant that specifies the length in bytes of the address range ``[a, a + size)`` to write
-*unstable indeterminate values* into. The only supported value for the ``size`` operand is ``128``.
-If no state space is specified then `generic-addressing` is used. If the specified address does
-not fall within the address window of ``.global`` state space then the behavior is undefined.
-
-LLVM does not define anywhere what an *unstable indeterminate value* is, and the closest concept
-LLVM has breaks the example below:
+The *effects* of the ``@llvm.nvvm.discard.L2*`` intrinsics are those of a non-atomic non-volatile ``llvm.memset`` that writes ``undef`` to the destination address range ``[%ptr, %ptr + immarg)``.
+Subsequent reads from the address range may read ``undef`` until the memory is overwritten with a different value.
+These operations *hint* to the implementation that data in the L2 cache can be destructively discarded without writing it back to memory.
+The operand ``immarg`` is an integer constant that specifies the length in bytes of the address range ``[%ptr, %ptr + immarg)`` to write ``undef`` into.
+The only supported value for the ``immarg`` operand is ``128``.
+If generic addressing is used and the specified address does not fall within the address window of global memory (``addrspace(1)``) the behavior is undefined.
 
-.. code-block:: text
-
-  discard.global.L2 [ptr], 128;
-  ld.weak.u32 r0, [ptr];
-  ld.weak.u32 r1, [ptr];
-  // The values in r0 and r1 may differ!
+.. code-block:: llvm
+
+  call void @llvm.nvvm.discard.L2(ptr %p, i64 128) ;; writes `undef` to [p, p+128)
+  %a = load i64, ptr %p ;; loads 8 bytes containing undef
+  %b = load i64, ptr %p ;; loads 8 bytes containing undef
+  ;; comparing %a and %b compares `undef` values!
+  %fa = freeze i64 %a ;; freezes undef to stable bit-pattern
+  %fb = freeze i64 %b ;; freezes undef to stable bit-pattern
+  ;; %fa may compare different to %fb!
+
+For more information, refer to the `CUDA C++ discard documentation <https://nvidia.github.io/cccl/libcudacxx/extended_api/memory_access_properties/discard_memory.html>`__ and the `PTX ISA discard documentation <https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-discard>`__.
 
 For more information, refer to the PTX ISA
 `<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-discard>`_.
