Skip to content

Commit a024380

Browse files
authored
[SYCL] Add esimd device descriptor for 2d load/store/prefetch (#15905)
Add esimd device descriptor to check if 2d block operations are supported by the device. UR counterpart: oneapi-src/unified-runtime#2261
1 parent 3053147 commit a024380

35 files changed

+176
-36
lines changed

sycl/cmake/modules/FetchUnifiedRuntime.cmake

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -116,7 +116,7 @@ if(SYCL_UR_USE_FETCH_CONTENT)
116116
CACHE PATH "Path to external '${name}' adapter source dir" FORCE)
117117
endfunction()
118118

119-
set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
119+
set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime")
120120
include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules/UnifiedRuntimeTag.cmake)
121121

122122
set(UMF_BUILD_EXAMPLES OFF CACHE INTERNAL "EXAMPLES")
Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
1-
# commit 0a90db9b2c36960c9b28ce18557ca15760724c4d
2-
# Merge: c4d9fdb4 6e0bdeb9
1+
# commit db83117e830406b0d9950e24892dba868acba354
2+
# Merge: 0a90db9b c79df596
33
# Author: Callum Fare <[email protected]>
4-
# Date: Wed Nov 27 12:16:44 2024 +0000
5-
# Merge pull request #2369 from Bensuo/ben/kernel-binary-update-l0
6-
# [CMDBUF] Implement kernel binary update for L0 adapter
7-
set(UNIFIED_RUNTIME_TAG 0a90db9b2c36960c9b28ce18557ca15760724c4d)
4+
# Date: Wed Nov 27 16:04:19 2024 +0000
5+
# Merge pull request #2261 from againull/againull/2d_block_exp
6+
# Add new device descriptor to query 2D block array capabilities of the Intel GPU
7+
set(UNIFIED_RUNTIME_TAG db83117e830406b0d9950e24892dba868acba354)

sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@
3131
- [__regcall Calling convention](#__regcall-calling-convention)
3232
- [Inline assembly](#inline-assembly)
3333
- [Device aspect](#device-aspect)
34+
- [Device Information Descriptors](#device-information-descriptors)
3435
- [Device queries and conditional dispatching of the code](#device-queries-and-conditional-dispatching-of-the-code)
3536
- [Implementation restrictions](#implementation-restrictions)
3637
- [Features not supported with the ESIMD extension](#features-not-supported-with-the-esimd-extension)
@@ -1018,6 +1019,11 @@ The new aspect has the following behavior when queried via `device::has()`:
10181019
|--------|-------------|
10191020
|`aspect::ext_intel_esimd` | Indicates that the device supports the `sycl_ext_intel_esimd` extension as defined in this document. |
10201021

1022+
## Device Information Descriptors
1023+
| Device Descriptors | Return Type | Description |
1024+
| ------------------ | ----------- | ----------- |
1025+
| `ext::intel::esimd::info::device::has_2d_block_io_support` | bool | Returns a boolean indicating whether 2D load/store/prefetch instructions are supported by the device. |
1026+
10211027
## Examples
10221028
### Vector addition (USM)
10231029
```cpp

sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -552,7 +552,7 @@ Loads and returns a vector `simd<T, N>` where `N` is `BlockWidth * BlockHeight *
552552
`props` - The optional compile-time properties. Only cache hint properties are used.
553553
554554
### Restrictions
555-
* This function is available only for Intel® Data Center GPU Max Series (aka PVC).
555+
* This function is available only for devices with the `ext::intel::esimd::info::device::has_2d_block_io_support` information descriptor value equal to `true`.
556556
* `Cache-hint` properties, if passed, must follow the [rules](#valid-combinations-of-l1-and-l2-cache-hints-for-load-functions) for `load` functions.
557557
* `Transformed` and `Transposed` cannot be set to true at the same time.
558558
* `BlockWidth` * `BlockHeight` * `NBlocks` * sizeof(`T`) must not exceed 2048.
@@ -598,7 +598,7 @@ Prefetches elements from a memory block of the size `BlockWidth * BlockHeight *
598598
`props` - The compile-time properties, which must specify cache-hints.
599599

600600
### Restrictions
601-
* This function is available only for Intel® Data Center GPU Max Series (aka PVC).
601+
* This function is available only for devices with the `ext::intel::esimd::info::device::has_2d_block_io_support` information descriptor value equal to `true`.
602602
* `Cache-hint` properties must follow the [rules](#valid-combinations-of-l1-and-l2-cache-hints-for-prefetch-functions) for `prefetch` functions.
603603
* `BlockWidth` * `BlockHeight` * `NBlocks` * sizeof(`T`) must not exceed 2048.
604604
* `NBlocks` must be {1,2,4} for `bytes` and `words`, {1,2} for `dwords`, 1 for `qwords`.
@@ -630,7 +630,7 @@ Stores the vector `Vals` of the type `simd<T, N>` to 2D memory block where `N` i
630630
`props` - The optional compile-time properties. Only cache hint properties are used.
631631
632632
### Restrictions
633-
* This function is available only for Intel® Data Center GPU Max Series (aka PVC).
633+
* This function is available only for devices with the `ext::intel::esimd::info::device::has_2d_block_io_support` information descriptor value equal to `true`.
634634
* `Cache-hint` properties, if passed, must follow the [rules](#valid-combinations-of-l1-and-l2-cache-hints-for-store-functions) for `store` functions.
635635
* `BlockWidth` * `BlockHeight` * sizeof(`T`) must not exceed 512.
636636
* `BlockHeight` must not exceed 8.

sycl/include/sycl/info/ext_intel_device_traits.def

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ __SYCL_PARAM_TRAITS_SPEC(ext::intel, device, free_memory, uint64_t, UR_DEVICE_IN
1616
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, memory_clock_rate, uint32_t, UR_DEVICE_INFO_MEMORY_CLOCK_RATE)
1717
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, memory_bus_width, uint32_t, UR_DEVICE_INFO_MEMORY_BUS_WIDTH)
1818
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, max_compute_queue_indices, int32_t, UR_DEVICE_INFO_MAX_COMPUTE_QUEUE_INDICES)
19+
__SYCL_PARAM_TRAITS_SPEC(ext::intel::esimd, device, has_2d_block_io_support, bool, UR_DEVICE_INFO_2D_BLOCK_ARRAY_CAPABILITIES_EXP)
1920
#ifdef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
2021
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC
2122
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF

sycl/source/detail/device_info.hpp

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1589,6 +1589,25 @@ get_device_info<ext::intel::info::device::memory_bus_width>(
15891589
return get_device_info_impl<Param::return_type, Param>::get(Dev);
15901590
}
15911591

1592+
template <>
1593+
inline ext::intel::esimd::info::device::has_2d_block_io_support::return_type
1594+
get_device_info<ext::intel::esimd::info::device::has_2d_block_io_support>(
1595+
const DeviceImplPtr &Dev) {
1596+
if (!Dev->has(aspect::ext_intel_esimd))
1597+
return false;
1598+
1599+
ur_exp_device_2d_block_array_capability_flags_t BlockArrayCapabilities;
1600+
Dev->getAdapter()->call<UrApiKind::urDeviceGetInfo>(
1601+
Dev->getHandleRef(),
1602+
UrInfoCode<
1603+
ext::intel::esimd::info::device::has_2d_block_io_support>::value,
1604+
sizeof(BlockArrayCapabilities), &BlockArrayCapabilities, nullptr);
1605+
return (BlockArrayCapabilities &
1606+
UR_EXP_DEVICE_2D_BLOCK_ARRAY_CAPABILITY_FLAG_LOAD) &&
1607+
(BlockArrayCapabilities &
1608+
UR_EXP_DEVICE_2D_BLOCK_ARRAY_CAPABILITY_FLAG_STORE);
1609+
}
1610+
15921611
// Returns the list of all progress guarantees that can be requested for
15931612
// work_groups from the coordination level of root_group when using the device
15941613
// given by Dev. First it calls getProgressGuarantee to get the strongest

sycl/test-e2e/ESIMD/InlineAsm/asm_glb.cpp

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,8 @@ int main(void) {
4343
queue q(esimd_test::ESIMDSelector, esimd_test::createExceptionHandler());
4444

4545
auto dev = q.get_device();
46-
std::cout << "Running on " << dev.get_info<info::device::name>() << "\n";
46+
std::cout << "Running on " << dev.get_info<sycl::info::device::name>()
47+
<< "\n";
4748

4849
auto e = q.submit([&](handler &cgh) {
4950
auto PA = bufa.get_access<access::mode::read>(cgh);

sycl/test-e2e/ESIMD/addc.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -138,7 +138,7 @@ template <int N, bool AIsVector, bool BIsVector> bool test(sycl::queue Q) {
138138
int main() {
139139
queue Q(esimd_test::ESIMDSelector, esimd_test::createExceptionHandler());
140140
auto D = Q.get_device();
141-
std::cout << "Running on " << D.get_info<info::device::name>() << "\n";
141+
std::cout << "Running on " << D.get_info<sycl::info::device::name>() << "\n";
142142

143143
constexpr bool AIsVector = true;
144144
constexpr bool BIsVector = true;

sycl/test-e2e/ESIMD/api/functional/operators/operator_assignment_glb.cpp

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,8 @@ int main(void) {
4646
queue q(esimd_test::ESIMDSelector, esimd_test::createExceptionHandler());
4747

4848
auto dev = q.get_device();
49-
std::cout << "Running on " << dev.get_info<info::device::name>() << "\n";
49+
std::cout << "Running on " << dev.get_info<sycl::info::device::name>()
50+
<< "\n";
5051

5152
auto e = q.submit([&](handler &cgh) {
5253
auto PA = bufa.get_access<access::mode::read>(cgh);

sycl/test-e2e/ESIMD/api/functional/operators/operator_assignment_glb_mask.cpp

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,8 @@ int main(void) {
4444
queue q(esimd_test::ESIMDSelector, esimd_test::createExceptionHandler());
4545

4646
auto dev = q.get_device();
47-
std::cout << "Running on " << dev.get_info<info::device::name>() << "\n";
47+
std::cout << "Running on " << dev.get_info<sycl::info::device::name>()
48+
<< "\n";
4849

4950
auto e = q.submit([&](handler &cgh) {
5051
auto PA = bufa.get_access<access::mode::read>(cgh);

0 commit comments

Comments
 (0)