Skip to content

Commit 0de8e83

Browse files
[SYCL][UR][Bindless][Doc] Fix copy docs and implementation. (#19093)
This patch fixes the implementation of bindless image copies. Previously, source and destination pitch values were not being set correctly. This patch also updates the wording around the requirements for `ext_oneapi_copy`. A missing requirement was added to the specification. Namely that the `CopyExtent` parameter in the `ext_oneapi_copy` functions that take it, must not have `0` values in any of the three dimensions, they must be greater than or equal to `1`. The requirements for `ext_oneapi_copy` have also been re-written to prescribe what the functions expect, instead of providing a list of cases in which the function may fail. This should hopefully make it clearer and more prescriptive, rather than saying the copy function may fail if some condition is not met, we now say that the functions require that certain conditions be met. The coverage for sub-region copy testing has also been extended to prevent future regressions.
1 parent 8f54710 commit 0de8e83

File tree

8 files changed

+514
-179
lines changed

8 files changed

+514
-179
lines changed

sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc

Lines changed: 43 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -1335,51 +1335,63 @@ For the forms that take a USM pointer, the image memory must also have been
13351335
allocated within the same context and device of the `queue`. The USM memory
13361336
must be accessible on the queue's device.
13371337

1338-
The `ext_oneapi_copy` function variants that don't take offsets and extents may
1339-
fail in the following scenarios:
1338+
The `ext_oneapi_copy` function variants that do not take offsets and extents
1339+
must ensure that the following conditions are met:
13401340

1341-
1. The `Src` and `Dest` memory was not allocated on the same device and
1342-
context of the queue.
1341+
1. The `Src` and `Dest` memory was allocated on the same device and context.
13431342

1344-
2. The `Src` and `Dest` memory regions, where `Src` or `Dest` can be either
1345-
on the host or device, do not have the same memory capacity, where the capacity
1346-
is calculate from the `width`, `height`, `depth`, `channel_order`, and
1343+
2. The `Src` and `Dest` memory regions, where `Src` or `Dest` can be either
1344+
on the host or device, have the same memory capacity, where the capacity
1345+
is calculated from the `width`, `height`, `depth`, `channel_order`, and
13471346
`channel_type` members of the `image_descriptor` parameter.
13481347

1349-
The `ext_oneapi_copy` function variants that do take offsets and extents may
1350-
fail in the following scenarios:
1348+
The `ext_oneapi_copy` function variants that do take offsets and extents must
1349+
ensure that the following conditions are met. If a condition names a specific
1350+
parameter, it is only applicable to the function variants that take that
1351+
parameter.
13511352

1352-
1. The `Src` and `Dest` memory was not allocated on the same device and
1353-
context of the queue.
1353+
1. The `Src` and `Dest` memory was allocated on the same device and context.
13541354

1355-
2. The image descriptor passed does not match the image descriptor used to
1356-
allocate the image on the device.
1355+
2. The image descriptors passed match the image descriptors used to allocate
1356+
the image's memory on the device.
13571357

1358-
3. the `CopyExtent` describes a memory region larger than that which was
1359-
allocated on either the host or the device.
1358+
3. The `CopyExtent` describes a memory region that is not larger than that which
1359+
was allocated on either the host or the device.
13601360

1361-
4. The `HostExtent` describes a memory region larger than that which was
1362-
allocated on the host.
1361+
4. The `HostExtent` describes a memory region that is not larger than that which
1362+
was allocated on the host.
13631363

1364-
5. The `SrcExtent` describes a memory region larger than that which was
1365-
allocated, where `Src` can be either the host or device.
1364+
5. The `SrcExtent` describes a memory region that is not larger than that which
1365+
was allocated, where `Src` can be either on the host or on the device.
13661366

1367-
6. The `DestExtent` describes a memory region larger than that which was
1368-
allocated, where `Dest` can be either the host or device.
1367+
6. The `DestExtent` describes a memory region that is not larger than that which
1368+
was allocated, where `Dest` can be either on the host or on the device.
13691369

1370-
7. If `SrcOffset + CopyExtent` moves the memory sub-region outside the bounds
1371-
of the memory described by `Src`, irrespective of whether `Src` is on the host
1372-
or the device.
1370+
7. The `DeviceRowPitch` adheres to the alignment requirements outlined in the
1371+
"Pitch alignment restrictions and queries" section.
13731372

1374-
8. If `DestOffset + CopyExtent` moves the memory sub-region outside the bounds
1375-
of the memory described by `Dest`, irrespective of whether `Dest` is on the
1376-
host or the device.
1373+
8. The `DeviceRowPitch` is greater than or equal to the width of the image on
1374+
the device.
13771375

1378-
9. The `DeviceRowPitch` does not adhere to the alignment requirements
1379-
outlined in section "Pitch alignment restrictions and queries"
1376+
9. For the relevant dimensions, `SrcOffset + CopyExtent` does not move the
1377+
memory sub-region outside the bounds of the memory described by `Src`,
1378+
irrespective of whether `Src` is on the host or the device. The relevant
1379+
dimensions are `x` for 1D images; `x` and `y` for 2D images; and `x`, `y`, and
1380+
`z` for 3D images. `x`, `y`, and `z` correspond to indices `0`, `1`, and `2` of
1381+
the `SrcOffset` and `CopyExtent` parameters, respectively.
1382+
1383+
10. For the relevant dimensions, `DestOffset + CopyExtent` does not move the
1384+
memory sub-region outside the bounds of the memory described by `Dest`,
1385+
irrespective of whether `Dest` is on the host or the device. The relevant
1386+
dimensions are `x` for 1D images, `x` and `y` for 2D images, and `x`, `y`, and
1387+
`z` for 3D images. `x`, `y`, and `z` correspond to indices `0`, `1`, and `2` of
1388+
the `SrcOffset` and `CopyExtent` parameters, respectively.
1389+
1390+
11. The `CopyExtent`'s' `x`, `y`, and `z` dimensions must not be `0`. They must
1391+
be greater than or equal to `1`. Even if the image is 1D or 2D, the remaining
1392+
non-relevant dimension's values must be set to `1` in the `CopyExtent`
1393+
parameter.
13801394

1381-
10. The value of `DeviceRowPitch` is smaller than the width of the image on
1382-
the device.
13831395

13841396
If copying of an image fails, `ext_oneapi_copy` will throw a `sycl::exception`
13851397
with error code `sycl::errc::invalid`, and relay an error message back to the

sycl/source/handler.cpp

Lines changed: 59 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -98,6 +98,33 @@ void *getValueFromDynamicParameter(
9898

9999
// Bindless image helpers
100100

101+
constexpr size_t get_channel_size(
102+
const sycl::ext::oneapi::experimental::image_descriptor &Desc) {
103+
switch (Desc.channel_type) {
104+
case sycl::image_channel_type::fp16:
105+
return sizeof(sycl::half);
106+
case sycl::image_channel_type::fp32:
107+
return sizeof(float);
108+
case sycl::image_channel_type::snorm_int8:
109+
case sycl::image_channel_type::unorm_int8:
110+
case sycl::image_channel_type::signed_int8:
111+
case sycl::image_channel_type::unsigned_int8:
112+
return sizeof(uint8_t);
113+
case sycl::image_channel_type::snorm_int16:
114+
case sycl::image_channel_type::unorm_int16:
115+
case sycl::image_channel_type::signed_int16:
116+
case sycl::image_channel_type::unsigned_int16:
117+
return sizeof(uint16_t);
118+
case sycl::image_channel_type::signed_int32:
119+
case sycl::image_channel_type::unsigned_int32:
120+
return sizeof(uint32_t);
121+
default:
122+
throw sycl::exception(make_error_code(errc::invalid),
123+
"Unsupported channel type");
124+
return 0;
125+
}
126+
}
127+
101128
// Fill image type and return depth or array_size
102129
static unsigned int
103130
fill_image_type(const ext::oneapi::experimental::image_descriptor &Desc,
@@ -257,16 +284,8 @@ fill_copy_args(detail::handler_impl *impl,
257284
impl->MDstImageDesc.depth = DestExtent[2];
258285
}
259286

260-
if (impl->MImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_HOST_TO_DEVICE) {
261-
impl->MSrcImageDesc.rowPitch = 0;
262-
impl->MDstImageDesc.rowPitch = DestPitch;
263-
} else if (impl->MImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_DEVICE_TO_HOST) {
264-
impl->MSrcImageDesc.rowPitch = SrcPitch;
265-
impl->MDstImageDesc.rowPitch = 0;
266-
} else {
267-
impl->MSrcImageDesc.rowPitch = SrcPitch;
268-
impl->MDstImageDesc.rowPitch = DestPitch;
269-
}
287+
impl->MSrcImageDesc.rowPitch = SrcPitch;
288+
impl->MDstImageDesc.rowPitch = DestPitch;
270289
}
271290

272291
static void
@@ -279,9 +298,11 @@ fill_copy_args(detail::handler_impl *impl,
279298
sycl::range<3> DestExtent = {0, 0, 0},
280299
sycl::range<3> CopyExtent = {0, 0, 0}) {
281300

282-
fill_copy_args(impl, Desc, Desc, ImageCopyFlags, 0 /*SrcPitch*/,
283-
0 /*DestPitch*/, SrcOffset, SrcExtent, DestOffset, DestExtent,
284-
CopyExtent);
301+
size_t SrcPitch = SrcExtent[0] * Desc.num_channels * get_channel_size(Desc);
302+
size_t DestPitch = DestExtent[0] * Desc.num_channels * get_channel_size(Desc);
303+
304+
fill_copy_args(impl, Desc, Desc, ImageCopyFlags, SrcPitch, DestPitch,
305+
SrcOffset, SrcExtent, DestOffset, DestExtent, CopyExtent);
285306
}
286307

287308
static void
@@ -309,8 +330,13 @@ fill_copy_args(detail::handler_impl *impl,
309330
sycl::range<3> DestExtent = {0, 0, 0},
310331
sycl::range<3> CopyExtent = {0, 0, 0}) {
311332

312-
fill_copy_args(impl, SrcImgDesc, DestImgDesc, ImageCopyFlags, 0 /*SrcPitch*/,
313-
0 /*DestPitch*/, SrcOffset, SrcExtent, DestOffset, DestExtent,
333+
size_t SrcPitch =
334+
SrcExtent[0] * SrcImgDesc.num_channels * get_channel_size(SrcImgDesc);
335+
size_t DestPitch =
336+
DestExtent[0] * DestImgDesc.num_channels * get_channel_size(DestImgDesc);
337+
338+
fill_copy_args(impl, SrcImgDesc, DestImgDesc, ImageCopyFlags, SrcPitch,
339+
DestPitch, SrcOffset, SrcExtent, DestOffset, DestExtent,
314340
CopyExtent);
315341
}
316342

@@ -1618,10 +1644,17 @@ void handler::ext_oneapi_copy(
16181644
get_pointer_type(Dest,
16191645
createSyclObjFromImpl<context>(impl->get_context())));
16201646

1621-
if (ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_HOST_TO_DEVICE ||
1622-
ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_DEVICE_TO_HOST) {
1623-
detail::fill_copy_args(get_impl(), Desc, ImageCopyFlags, DeviceRowPitch,
1647+
// Calculate host pitch, where host memory is always assumed to be tightly
1648+
// packed.
1649+
size_t HostRowPitch =
1650+
Desc.width * Desc.num_channels * detail::get_channel_size(Desc);
1651+
1652+
if (ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_HOST_TO_DEVICE) {
1653+
detail::fill_copy_args(get_impl(), Desc, ImageCopyFlags, HostRowPitch,
16241654
DeviceRowPitch);
1655+
} else if (ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_DEVICE_TO_HOST) {
1656+
detail::fill_copy_args(get_impl(), Desc, ImageCopyFlags, DeviceRowPitch,
1657+
HostRowPitch);
16251658
} else {
16261659
throw sycl::exception(make_error_code(errc::invalid),
16271660
"Copy Error: This copy function only performs host "
@@ -1650,14 +1683,19 @@ void handler::ext_oneapi_copy(
16501683
get_pointer_type(Dest,
16511684
createSyclObjFromImpl<context>(impl->get_context())));
16521685

1686+
// Calculate host pitch, where host memory is always assumed to be tightly
1687+
// packed.
1688+
size_t HostRowPitch = HostExtent[0] * DeviceImgDesc.num_channels *
1689+
detail::get_channel_size(DeviceImgDesc);
1690+
16531691
// Fill the host extent based on the type of copy.
16541692
if (ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_HOST_TO_DEVICE) {
16551693
detail::fill_copy_args(get_impl(), DeviceImgDesc, ImageCopyFlags,
1656-
DeviceRowPitch, DeviceRowPitch, SrcOffset,
1657-
HostExtent, DestOffset, {0, 0, 0}, CopyExtent);
1694+
HostRowPitch, DeviceRowPitch, SrcOffset, HostExtent,
1695+
DestOffset, {0, 0, 0}, CopyExtent);
16581696
} else if (ImageCopyFlags == UR_EXP_IMAGE_COPY_FLAG_DEVICE_TO_HOST) {
16591697
detail::fill_copy_args(get_impl(), DeviceImgDesc, ImageCopyFlags,
1660-
DeviceRowPitch, DeviceRowPitch, SrcOffset, {0, 0, 0},
1698+
DeviceRowPitch, HostRowPitch, SrcOffset, {0, 0, 0},
16611699
DestOffset, HostExtent, CopyExtent);
16621700
} else {
16631701
throw sycl::exception(make_error_code(errc::invalid),

0 commit comments

Comments
 (0)