bratpiorka
Contributor

Do not set a zeroed GlobalOffset in the kernel parameters; the Unified Runtime layer handles an unset offset correctly.
Note that there is already a similar optimization in https://github.com/intel/llvm/blob/sycl/sycl/source/detail/scheduler/commands.cpp#L2477
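
As a rough illustration of the idea in this description, the sketch below returns a pointer to the offset only when some component is non-zero, and nullptr otherwise. `NDRangeSketch` and `offsetOrNull` are hypothetical names introduced for this example; they are not part of the SYCL runtime or of Unified Runtime.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the runtime's ND-range descriptor.
struct NDRangeSketch {
  std::array<std::size_t, 3> GlobalOffset{0, 0, 0};
  std::size_t Dims = 1;
};

// Returns a pointer to the offset array when any component is non-zero,
// and nullptr when the offset is all zeros (assumption: the callee treats
// nullptr as "no offset").
inline const std::size_t *offsetOrNull(const NDRangeSketch &Desc) {
  const bool HasOffset = Desc.GlobalOffset[0] != 0 ||
                         Desc.GlobalOffset[1] != 0 ||
                         Desc.GlobalOffset[2] != 0;
  return HasOffset ? Desc.GlobalOffset.data() : nullptr;
}

// Small self-check of the helper above.
inline void checkOffsetOrNull() {
  NDRangeSketch Zero;
  assert(offsetOrNull(Zero) == nullptr);
}
```

A caller would pass the result of `offsetOrNull(Desc)` wherever the launch call expects the global offset pointer.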

@bratpiorka
Contributor Author

The "SYCL :: Basic/submit_time.cpp" failure is tracked in issue #20248.

@intel/sycl-graphs-reviewers @intel/llvm-reviewers-runtime please review

@bratpiorka bratpiorka marked this pull request as ready for review October 2, 2025 10:12
@bratpiorka bratpiorka requested review from a team as code owners October 2, 2025 10:12
Comment on lines 2613 to 2615
const bool HasOffset = NDRDesc.GlobalOffset[0] != 0 ||
NDRDesc.GlobalOffset[1] != 0 ||
NDRDesc.GlobalOffset[2] != 0;
Contributor

Wouldn't this depend on the dimensionality? I.e., say I have a 2-dimensional launch: wouldn't the offset be zero if either of the first 2 dimensions is 0? Also, in a case like that, it shouldn't matter what the 3rd dimension is, right?

What I am suggesting is basically:

Suggested change
- const bool HasOffset = NDRDesc.GlobalOffset[0] != 0 ||
-                        NDRDesc.GlobalOffset[1] != 0 ||
-                        NDRDesc.GlobalOffset[2] != 0;
+ const bool HasOffset = NDRDesc.GlobalOffset[0] != 0 &&
+                        (NDRDesc.Dims < 2 || NDRDesc.GlobalOffset[1] != 0) &&
+                        (NDRDesc.Dims < 3 || NDRDesc.GlobalOffset[2] != 0);

Contributor Author

Right. Applied in both places where "HasOffset" is calculated, and additionally to the "EnforcedLocalSize" flag, which uses the same code pattern.
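
To make the two forms of the check in this thread easier to compare, here is a small sketch that evaluates both on the same inputs. The function names `hasOffsetAnyComponent` and `hasOffsetPerDims` are hypothetical and exist only for this example; the expressions mirror the original and the suggested lines.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Offset3 = std::array<std::size_t, 3>;

// Original form: true when any of the three components is non-zero,
// regardless of how many dimensions the launch uses.
inline bool hasOffsetAnyComponent(const Offset3 &G) {
  return G[0] != 0 || G[1] != 0 || G[2] != 0;
}

// Suggested form: every component within the first Dims dimensions must be
// non-zero; components beyond Dims are ignored.
inline bool hasOffsetPerDims(const Offset3 &G, std::size_t Dims) {
  return G[0] != 0 && (Dims < 2 || G[1] != 0) && (Dims < 3 || G[2] != 0);
}

// Small self-check on an input where both forms agree.
inline void checkAllZero() {
  const Offset3 Zero{0, 0, 0};
  assert(!hasOffsetAnyComponent(Zero));
  assert(!hasOffsetPerDims(Zero, 3));
}
```

Both forms agree on an all-zero offset and on a 1-D launch with offset {3, 0, 0}. They differ for a 2-D launch with offset {0, 5, 0}: the original form reports an offset, while the suggested form does not.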

Contributor

@steffenlarsen steffenlarsen left a comment

LGTM!

  Adapter.call_nocheck<UrApiKind::urCommandBufferAppendKernelLaunchExp>(
-     CommandBuffer, UrKernel, NDRDesc.Dims, &NDRDesc.GlobalOffset[0],
+     CommandBuffer, UrKernel, NDRDesc.Dims,
+     HasOffset ? &NDRDesc.GlobalOffset[0] : nullptr,
Contributor

The specification of urCommandBufferAppendKernelLaunchExp seems to suggest that passing nullptr for the global offset should return UR_RESULT_ERROR_INVALID_NULL_POINTER. Does the UR implementation just not align with the specification here?

Contributor Author

You're right. I've updated the specification so that the global offset is now optional. This is consistent with EnqueueKernelLaunch, where this parameter is also optional.
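
For readers unfamiliar with optional pointer parameters, the sketch below shows one way a callee could interpret an optional global offset: nullptr is read as a zero offset in every dimension. `resolveOffsetSketch` is a hypothetical helper written for this example and is not the Unified Runtime implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical callee-side helper: turns an optional offset pointer into
// concrete per-dimension values. A null pointer means "no offset".
inline std::array<std::size_t, 3>
resolveOffsetSketch(std::size_t Dims, const std::size_t *GlobalOffset) {
  std::array<std::size_t, 3> Resolved{0, 0, 0};
  if (GlobalOffset == nullptr)
    return Resolved; // optional parameter omitted: all zeros
  // Copy only the components that belong to the launch's dimensions.
  for (std::size_t I = 0; I < Dims && I < 3; ++I)
    Resolved[I] = GlobalOffset[I];
  return Resolved;
}

// Small self-check of the null case.
inline void checkNullOffset() {
  assert((resolveOffsetSketch(3, nullptr) ==
          std::array<std::size_t, 3>{0, 0, 0}));
}
```

Under this reading, passing nullptr and passing a pointer to an all-zero array produce the same launch, which is why the caller can skip the array in the zero case.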

@bratpiorka bratpiorka requested a review from mmichel11 October 6, 2025 09:09
Contributor

@mmichel11 mmichel11 left a comment

LGTM

@againull againull merged commit 9bd3fbc into intel:sycl Oct 6, 2025
71 of 74 checks passed