[SYCL][DOC] Add new extension sycl_ext_oneapi_clock #19842
Conversation
LGTM!
@intel/llvm-gatekeepers please consider merging
> between separate executions of the program, and be affected by conditions outside the control of the programmer. The value returned by this instruction corresponds to the least significant bits of the clock counter at the time of execution. Consequently, the sampled value may wrap around zero.
Can we provide a guarantee that the clock counter starts at zero when the kernel starts executing? If so, that might help with the wrap-around problem.
My 2 cents:
- Hardware counter wrapping is a common theme on CPUs.
- The prospective use case I have in mind requires comparing counters between two kernels; resetting the device-scope counter for each kernel would break that.
- A 64-bit nanosecond counter would need a few centuries to overflow, so wrap-around is more of a theoretical possibility. But the spec seems to assume that the counters can have fewer bits. A 32-bit nanosecond counter will overflow after ~4 seconds, so an overflow can occur within a kernel's lifetime.
(the above is not based on any hardware details; just trying to make some estimates)
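To make those estimates concrete, here is a minimal sketch of the arithmetic, assuming 1 tick = 1 ns (an assumption, as noted, not a spec guarantee):

```c++
#include <cstdio>

int main() {
  // Assumption: 1 tick == 1 ns (not guaranteed by the spec).
  constexpr double ns_per_year = 1e9 * 60 * 60 * 24 * 365;
  constexpr double wrap32_s = 4294967296.0 / 1e9;                    // 2^32 ns
  constexpr double wrap64_y = 18446744073709551616.0 / ns_per_year; // 2^64 ns
  std::printf("32-bit counter wraps after %.2f s\n", wrap32_s);     // ~4.29 s
  std::printf("64-bit counter wraps after %.0f years\n", wrap64_y); // ~585 years
}
```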
The spec does not provide enough guarantees to do what you want:

- You seem to be assuming that the clocks start at zero when the process first starts (or maybe when the process submits its first kernel). You are also assuming that the clocks are not reset between kernel submissions. However, none of this is guaranteed by the spec.
- You are asserting that the clocks count in units of nanoseconds, but this is also not guaranteed by the spec.
To be clear, I don't think SYCL can provide any of these guarantees. The SYCL spec layers on top of SPV_KHR_shader_clock, which provides very few guarantees.
@bashbaug do you have any insight here? The SPIR-V spec (and the OpenCL C cl_khr_kernel_clock extension) don't seem to provide enough guarantees to do any useful timing operations. Do vendors provide some additional guarantees in some separate specification? Do applications just make assumptions about the clock that are not guaranteed by the spec, and the code just happens to work?
> You seem to be assuming that the clocks start at zero when the process first starts (or maybe when the process submits its first kernel).
I made an assumption of this kind when estimating the upper limit on the wrap-around time. I don't see how this assumption is required for computing time difference (provided wrap-around is handled, but, unlike clock reset, it can be handled).
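For illustration, a minimal sketch of such wrap-around handling, assuming at most one wrap between the two samples and assuming the number of valid counter bits is known (which the extension itself does not expose):

```c++
#include <cstdint>

// Modular subtraction absorbs a single wrap of a `bits`-wide counter:
// even if `end` sampled after the wrap is numerically smaller than
// `start`, the masked difference is the true elapsed tick count.
uint64_t elapsed_ticks(uint64_t start, uint64_t end, unsigned bits) {
  const uint64_t mask =
      bits >= 64 ? ~uint64_t{0} : (uint64_t{1} << bits) - 1;
  return (end - start) & mask;
}
```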
For `clock_scope::device`, the spec says: "`clock()` gets values shared by all work-items executing on the device." To me, it reads like the clocks are not reset for the lifetime of the `device` object, and also that two kernels executing simultaneously use the same clock.
> You are asserting that the clocks count in units of nanoseconds, but this is also not guaranteed by the spec.
Yes, I pointed this out earlier in the discussion (along with the fact that the units are not even guaranteed to be uniform). For the wrap-around estimation, 1 tick = 1 nanosecond was indeed an assumption, but it was based on the conversion constant used in some PTI-GPU samples. For practical use, there is an issue recorded in the spec below, so hopefully it will be addressed.
> For `clock_scope::device`, the spec says: "`clock()` gets values shared by all work-items executing on the device." To me, it reads like the clocks are not reset for the lifetime of the `device` object, and also that two kernels executing simultaneously use the same clock.
Yes, good point.
So, is the only remaining open question the one about the units of the counter?
> So, is the only remaining open question the one about the units of the counter?
I think it can be split into two parts, where the second one can be done without the first one:
- Defining the counter units: whether the ticks can be converted into seconds (and how).
- Guaranteeing a "steady" clock: a weaker guarantee that the clock is at least steady (each tick represents a consistent, though unknown, duration). That would still allow, e.g., benchmarking and load balancing. See, e.g., `GL_EXT_shader_realtime_clock`.
EDIT: From the application perspective, having the capability in the API is nice even if some hardware doesn't support it. At the very least, CPUs can support steady counters with known conversion factors.
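As a concrete example of the benchmarking use case, a sketch of in-kernel region timing using the `clock()` signature from the spec excerpt below (`out` is assumed here to be a USM allocation). The result is in ticks, not seconds, and comparing tick counts across runs is only meaningful with the "steady" guarantee discussed above:

```c++
#include <sycl/sycl.hpp>

namespace syclex = sycl::ext::oneapi::experimental;

void time_region(sycl::queue &q, uint64_t *out) {
  q.single_task([=] {
    uint64_t t0 = syclex::clock(syclex::clock_scope::device);
    // ... region of interest ...
    uint64_t t1 = syclex::clock(syclex::clock_scope::device);
    *out = t1 - t0;  // elapsed ticks, not seconds
  }).wait();
}
```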
From my perspective, this comment is resolved. I've been convinced that the API is useful even though the values it returns cannot be converted to time units. I have not clicked the "Resolve conversation" button only because @al42and has made some comments here too.
Addressing your comments ... I think we cannot provide the behavior you request on GPU because the hardware just doesn't work that way. We can count cycles (not time), and the frequency may change dynamically as the kernel executes. I agree that we could do better on CPU, but I think the motivation for adding this extension is to support GPU, and I do not think we want to invest the effort right now to implement the feature just for CPU.
You're right about the current Intel GPU hardware limitations. My concern, however, is designing a future-proof API rather than one constrained by the initial target's lowest common denominator.
I believe the API, even an experimental one, should reflect the capabilities of the entire DPC++ ecosystem. We shouldn't permanently limit the API due to the constraints of the first backend to support it, when other backends have broader capabilities.
A more flexible API doesn't require implementing it for every backend immediately. It simply prevents breaking changes later when (if) we do add support for more capable hardware. Many SYCL extensions are rolled out incrementally across backends.
To be clear, this is about the API design, not a demand for an immediate implementation on all platforms :)
It's much easier to loosen constraints in the future than it is to add new constraints. If we have new GPU devices in the future that can provide more guarantees about the clock, then we can update this extension to a new revision and add a new aspect that provides those additional guarantees. Therefore, I don't think the extension's current wording will tie our hands in the future.
I agree that it's easier to loosen constraints than to add them. However, I see this as less about adding constraints and more about designing an API that accurately communicates diverse hardware capabilities from the start.
Deferring this decision creates unnecessary friction for developers:
- Preventing developers from using steady clocks, even when the underlying hardware supports them, will slow down the extension's adoption.
- Experimental extensions are unversioned, so adding a new symbol (e.g., an `ext_oneapi_clock_device_steady` aspect) forces developers to write complex code with build-system checks and `#ifdef`s to manage API differences.
A more flexible API from the beginning would avoid these issues and allow developers to fully use all the hardware supported by DPC++. And this extension already seems to accommodate functionality beyond what is supported by off-the-shelf Intel GPUs (#20131).
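To illustrate that friction, a sketch of the check an application would need; the feature-test macro and aspect names for the steady-clock case are hypothetical, taken from the comment above:

```c++
#include <sycl/sycl.hpp>

bool has_steady_device_clock(const sycl::device &dev) {
#ifdef SYCL_EXT_ONEAPI_CLOCK_DEVICE_STEADY  // hypothetical future symbol
  return dev.has(sycl::aspect::ext_oneapi_clock_device_steady);
#else
  (void)dev;
  return false;  // built against a revision without the new aspect
#endif
}
```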
```c++
namespace sycl::ext::oneapi::experimental {

uint64_t clock(clock_scope scope = clock_scope::sub_group);

} // namespace sycl::ext::oneapi::experimental
```
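For illustration, a minimal call site for this API (a sketch; `out` is assumed to be a USM allocation):

```c++
#include <sycl/sycl.hpp>

namespace syclex = sycl::ext::oneapi::experimental;

void sample_clock(sycl::queue &q, uint64_t *out) {
  q.single_task([=] {
    *out = syclex::clock();  // default scope: clock_scope::sub_group
  }).wait();
}
```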
Is it possible to use clock() to implement the usleep(usecond) function in a kernel? Or just provide a device function usleep()? Thanks.
There is no way to convert these clocks to seconds, so unfortunately the answer is no.
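What could be written, as a sketch, is a tick-based busy-wait; but without a tick-to-seconds conversion (and with no guarantee that ticks are uniform), it cannot provide usleep() semantics. The device function below is hypothetical:

```c++
#include <cstdint>
#include <sycl/sycl.hpp>

namespace syclex = sycl::ext::oneapi::experimental;

// Device function: spin until `ticks` clock ticks have elapsed.
// The wall-clock duration of the delay is unknown and may vary.
void delay_ticks(uint64_t ticks) {
  uint64_t start = syclex::clock();  // default clock_scope::sub_group
  while (syclex::clock() - start < ticks) {
    // busy-wait
  }
}
```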
Updated the overview, added new aspects, fixed some typos.
@intel/llvm-gatekeepers please consider merging
@intel/llvm-gatekeepers yep, I believe this one is ready. Please merge
PR description isn't suitable for upstream PR.
@aelovikov-intel updated