Commit bfb2bba

Address feedback
Signed-off-by: Julian Oppermann <[email protected]>
1 parent 8c91e7d commit bfb2bba

1 file changed: +11 -6 lines changed

sycl/doc/design/SYCL-RTC.md

Lines changed: 11 additions & 6 deletions
@@ -254,7 +254,11 @@ functionality, such as an extended set of math functions and support for
 `bfloat16` arithmetic, and are available as Bitcode files inside the DPC++
 installation or the vendor toolchain, so we just use LLVM utilities to load them
 into memory and link them to the module representing the runtime-compiled
-kernels.
+kernels. The main challenge here is that the logic to select the device
+libraries is currently not reusable from its implementation in the driver, so
+our implementation is a simplified copy of the
+[`SYCL::getDeviceLibraries(...)`](https://github.com/intel/llvm/blob/cc966df07d29db75d07f969f044c0491819bd930/clang/lib/Driver/ToolChains/SYCL.cpp#L553)
+method, which needs to be kept in sync with the driver code.
 
 For the SYCL-specific post-processing, implemented in
 [`jit_compiler::performPostLink(...)`](https://github.com/intel/llvm/blob/cc966df07d29db75d07f969f044c0491819bd930/sycl-jit/jit-compiler/lib/rtc/DeviceCompilation.cpp#L750),
@@ -263,7 +267,12 @@ we can reuse modular analysis and transformation passes in the
 component. The main tasks for the post-processing passes is to split the device
 code module into smaller units (either as requested by the user, or required by
 the ESIMD mode), and to compute the properties that need to be passed to the
-SYCL runtime when the device images are loaded.
+SYCL runtime when the device images are loaded. The logic to orchestrate the
+`SYCLLowerIR` passes is adapted from the `sycl-post-link` tool's
+[`processInputModule(...)`](https://github.com/intel/llvm/blob/cc966df07d29db75d07f969f044c0491819bd930/llvm/tools/sycl-post-link/sycl-post-link.cpp#L606)
+function. This duplicated code should be removed as well once a suitable
+reusable implementation becomes available.
+
 
 ## Translation to the target format
 
@@ -298,10 +307,6 @@ A list of values that can be set as the target CPU can be found in the
 option](https://intel.github.io/llvm/UsersManual.html#generic-options) (leave
 out the `amd_gpu_` and `nvidia_gpu_` prefixes).
 
-At the moment, the support is available in [daily
-builds](https://github.com/intel/llvm/releases) of the open-source version of
-DPC++.
-
 ## Further reading
 
 - Technical presentation at IWOCL 2025: *Fast In-Memory Runtime Compilation of
