Commit 68540fd

refine docs
1 parent 8b8c135 commit 68540fd

1 file changed: +16 -8 lines changed

llvm/docs/NVPTXUsage.rst

Lines changed: 16 additions & 8 deletions
@@ -685,14 +685,22 @@ Syntax:
 Overview:
 """""""""
 
-The '``@llvm.nvvm.discard.*``' invalidates the data at the address range [a .. a + (size - 1)]
-
-in the cache level specified by the .level qualifier without writing back the data
-in the cache to the memory. The operand size is an integer constant that specifies the amount of data,
-in bytes, in the cache level specified by the .level qualifier to be discarded. The only supported value
-for the size operand is 128. If no state space is specified then Generic Addressing is used.
-If the specified address does not fall within the address window of .global state space then
-the behavior is undefined.
+The '``@llvm.nvvm.discard.*``' semantically behaves like a weak write of an *unstable indeterminate value*:
+reads of memory locations with *unstable indeterminate values* may return different
+bit patterns each time until the memory is overwritten.
+This operation *hints* to the implementation that data in the specified cache ``.level``
+can be destructively discarded without writing it back to memory. The operand ``size`` is an
+integer constant that specifies the length in bytes of the address range ``[a, a + size)`` to write
+*unstable indeterminate values* into. The only supported value for the ``size`` operand is ``128``.
+If no state space is specified then `generic-addressing` is used. If the specified address does
+not fall within the address window of ``.global`` state space then the behavior is undefined.
+
+.. code-block:: text
+
+  discard.global.L2 [ptr], 128;
+  ld.weak.u32 r0, [ptr];
+  ld.weak.u32 r1, [ptr];
+  // The values in r0 and r1 may differ!
 
 For more information, refer to the PTX ISA
 `<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-discard>`_.
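
The semantics described in the added paragraph can be illustrated with a small toy model in Python. This is only a sketch: `ToyMemory` and its methods are hypothetical names invented for this illustration and are not part of LLVM, PTX, or CUDA. Returning fresh random bits on every read is just one behavior the wording permits; since the operation is a hint, a real implementation may equally well keep returning the old data.

```python
import random

LINE_SIZE = 128  # the only value the documented `size` operand accepts


class ToyMemory:
    """Hypothetical byte-addressable memory, used only to illustrate the wording."""

    def __init__(self, num_bytes, seed=None):
        self._bytes = bytearray(num_bytes)
        self._indeterminate = set()  # addresses holding unstable indeterminate values
        self._rng = random.Random(seed)

    def _check(self, addr, length):
        if addr < 0 or addr + length > len(self._bytes):
            raise IndexError("access outside the modeled memory")

    def discard(self, addr, size):
        # Mirrors "the only supported value for the size operand is 128".
        if size != LINE_SIZE:
            raise ValueError("size must be 128")
        self._check(addr, size)
        # Weak write of unstable indeterminate values into [addr, addr + size).
        self._indeterminate.update(range(addr, addr + size))

    def store_u32(self, addr, value):
        self._check(addr, 4)
        self._bytes[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")
        # Overwriting a location makes its value stable again.
        self._indeterminate.difference_update(range(addr, addr + 4))

    def load_u32(self, addr):
        self._check(addr, 4)
        raw = bytearray(self._bytes[addr:addr + 4])
        for i in range(4):
            if addr + i in self._indeterminate:
                # Any bit pattern is allowed, and it may change on every read.
                raw[i] = self._rng.randrange(256)
        return int.from_bytes(raw, "little")


mem = ToyMemory(256)
mem.store_u32(0, 0xDEADBEEF)
mem.discard(0, LINE_SIZE)
r0 = mem.load_u32(0)
r1 = mem.load_u32(0)  # r0 and r1 may differ, as in the PTX snippet in the diff
mem.store_u32(0, 42)
stable = mem.load_u32(0)  # reads are stable again after the overwrite
```

The model tracks indeterminacy per byte, so a 4-byte store after a discard stabilizes only those 4 bytes; the remaining bytes of the 128-byte range stay indeterminate until they are overwritten too.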
