Commit 18137d4

Move docs to correct location.
Signed-off-by: JackAKirk <[email protected]>
1 parent 0d2064a commit 18137d4

1 file changed: +36 -36 lines changed

sycl/doc/syclcompat/README.md

Lines changed: 36 additions & 36 deletions
@@ -213,42 +213,6 @@ These translate any kernel dimensions from one convention to the other. An
 example of an equivalent SYCL call for a 3D kernel using `compat` is
 `syclcompat::global_id::x() == get_global_id(2)`.
 
-### ptr_to_int
-
-The following cuda backend specific function is introduced in order
-to translate from the local memory pointers introduced above to `uint32_t` or
-`size_t` variables that contain a byte address to the local
-(local refers to`.shared` in nvptx) memory state space.
-
-``` c++
-namespace syclcompat {
-template <typename T>
-__syclcompat_inline__
-std::enable_if_t<std::is_same_v<T, uint32_t> || std::is_same_v<T, size_t>,
-T>
-ptr_to_int(void *ptr)
-} // syclcompat
-```
-
-These variables can be used in inline PTX instructions that take address
-operands. Such inline PTX instructions are commonly used in optimized libraries.
-A simplified example usage of the above functions is as follows:
-
-``` c++
-half *data = syclcompat::local_mem<half[NUM_ELEMENTS]>();
-// ...
-// ...
-T addr =
-syclcompat::ptr_to_int<T>(reinterpret_cast<char *>(data) + (id % 8) * 16);
-
-uint32_t fragment;
-#if defined(__NVPTX__)
-asm volatile("ldmatrix.sync.aligned.m8n8.x1.shared.b16 {%0}, [%1];\n"
-: "=r"(fragment)
-: "r"(addr));
-#endif
-```
-
 ### launch<function>
 
 SYCLcompat provides a kernel `launch` interface which accepts a function that
@@ -983,6 +947,42 @@ public:
 };
 ```
 
+### ptr_to_int
+
+The following cuda backend specific function is introduced in order to
+translate from local memory pointers to `uint32_t` or `size_t` variables that
+contain a byte address to the local (local refers to`.shared` in nvptx) memory
+state space.
+
+``` c++
+namespace syclcompat {
+template <typename T>
+__syclcompat_inline__
+std::enable_if_t<std::is_same_v<T, uint32_t> || std::is_same_v<T, size_t>,
+T>
+ptr_to_int(void *ptr)
+} // syclcompat
+```
+
+These variables can be used in inline PTX instructions that take address
+operands. Such inline PTX instructions are commonly used in optimized
+libraries. A simplified example usage of the above functions is as follows:
+
+``` c++
+half *data = syclcompat::local_mem<half[NUM_ELEMENTS]>();
+// ...
+// ...
+T addr =
+syclcompat::ptr_to_int<T>(reinterpret_cast<char *>(data) + (id % 8) * 16);
+
+uint32_t fragment;
+#if defined(__NVPTX__)
+asm volatile("ldmatrix.sync.aligned.m8n8.x1.shared.b16 {%0}, [%1];\n"
+: "=r"(fragment)
+: "r"(addr));
+#endif
+```
+
 ### Device Management
 
 Multiple SYCL functionalities are exposed through utility functions to manage
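
Note on the moved `ptr_to_int` section: the declaration restricts `T` to exactly `uint32_t` or `size_t` via `std::enable_if_t`, and the usage example computes a byte offset of `(id % 8) * 16` from the base pointer. The following is a minimal host-only sketch of those two points, not the SYCLcompat implementation. The `sketch` namespace, the `accepts` trait, and `row_address` are hypothetical names introduced here for illustration, and the stand-in `ptr_to_int` only returns the pointer's integer value; it performs no conversion to the `.shared` state space.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

namespace sketch { // hypothetical namespace, not part of SYCLcompat

// Host-only stand-in with the same constraint as the declaration in the diff:
// T must be exactly uint32_t or size_t. Assumption: unlike the real function,
// this only returns the pointer's integer value (truncated for uint32_t).
template <typename T>
inline std::enable_if_t<
    std::is_same_v<T, std::uint32_t> || std::is_same_v<T, std::size_t>, T>
ptr_to_int(void *ptr) {
  return static_cast<T>(reinterpret_cast<std::uintptr_t>(ptr));
}

// Detects whether ptr_to_int<T> takes part in overload resolution.
template <typename T, typename = void>
struct accepts : std::false_type {};
template <typename T>
struct accepts<T, std::void_t<decltype(ptr_to_int<T>(nullptr))>>
    : std::true_type {};

// Same byte-offset arithmetic as the documented example: one 16-byte step
// per value of id % 8.
template <typename T>
T row_address(void *base, unsigned id) {
  return ptr_to_int<T>(static_cast<char *>(base) + (id % 8) * 16);
}

} // namespace sketch

// The constraint admits only the two documented integer types.
static_assert(sketch::accepts<std::uint32_t>::value);
static_assert(sketch::accepts<std::size_t>::value);
static_assert(!sketch::accepts<int>::value);
```

Because the constraint is expressed through the return type, a call such as `ptr_to_int<int>(p)` is rejected at compile time rather than silently narrowing the address.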
