Commit 1fba3c9
[SYCL][ext] Define and Implement sycl_ext_tensor_map
This is a fairly mechanical implementation of the basic infrastructure
required to access CUDA TMA descriptors from within SYCL kernels, while
initializing them on the host. The new feature exposes two new classes
and associated support structure in
`sycl::ext::codeplay::experimental::cuda`.
There's some ugliness involved to make this work on account of the way
NVIDIA implemented this basic feature, but it's all in the name of
{legitimate-field-of-endeavour}.

1 parent a1355e8, commit 1fba3c9
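
To illustrate the host-initialized, kernel-consumed pattern the commit message describes, here is a minimal plain C++ sketch. It is not the extension's API: `opaque_tensor_map`, `host_tensor_map`, and `element_count_in_kernel` are hypothetical names, and the 128-byte, 64-byte-aligned layout is only assumed to resemble the opaque descriptor in the CUDA driver API. A real implementation would presumably ask the driver to encode the descriptor (for example via `cuTensorMapEncodeTiled`) rather than writing fields by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for an opaque TMA descriptor (assumed layout:
// sixteen 64-bit words, 64-byte aligned). Device code treats it as a blob.
struct alignas(64) opaque_tensor_map {
  std::uint64_t words[16];
};

// The descriptor must be trivially copyable so it can be passed to a
// kernel by value as a plain argument.
static_assert(sizeof(opaque_tensor_map) == 128, "unexpected descriptor size");
static_assert(alignof(opaque_tensor_map) == 64, "unexpected alignment");
static_assert(std::is_trivially_copyable<opaque_tensor_map>::value,
              "descriptor must be trivially copyable");

// Hypothetical host-side wrapper: builds the descriptor once on the host.
class host_tensor_map {
public:
  host_tensor_map(const void *base, std::uint64_t rows, std::uint64_t cols) {
    assert(base != nullptr && "tensor base address must be valid");
    std::memset(&desc_, 0, sizeof desc_);
    // Assumption: this sketch simply records its arguments; the real
    // encoding is opaque and produced by the driver.
    desc_.words[0] = reinterpret_cast<std::uintptr_t>(base);
    desc_.words[1] = rows;
    desc_.words[2] = cols;
  }

  // Read-only access to the blob that would be handed to the kernel.
  const opaque_tensor_map &get() const { return desc_; }

private:
  opaque_tensor_map desc_;
};

// Stand-in for device code: receives its own copy of the descriptor.
std::uint64_t element_count_in_kernel(opaque_tensor_map desc) {
  return desc.words[1] * desc.words[2];
}
```

The point of the design is that all encoding happens on the host, so the kernel side only needs a correctly sized and aligned blob.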
File tree
11 files changed (+740, -3)
- llvm/include/llvm/SYCLLowerIR
- sycl
  - doc/extensions/experimental
  - include/sycl
    - ext/codeplay/experimental
    - info
  - source
    - detail
  - test-e2e/Basic
  - test
    - abi
    - extensions/cuda_tensor_map