Skip to content

[SYCL][Graph] Allow capturing restricted host tasks in native recording mode#22143

Draft
adamfidel wants to merge 10 commits into
intel:syclfrom
adamfidel:adam/host-task-native-recording
Draft

[SYCL][Graph] Allow capturing restricted host tasks in native recording mode#22143
adamfidel wants to merge 10 commits into
intel:syclfrom
adamfidel:adam/host-task-native-recording

Conversation

@adamfidel
Copy link
Copy Markdown
Member

@adamfidel adamfidel commented May 27, 2026

This PR allows SYCL Graph native recording mode to capture the new restricted host task defined in the sycl_ext_oneapi_enqueue_functions extension.

@adamfidel adamfidel force-pushed the adam/host-task-native-recording branch from 25ba6de to 1e41cd9 Compare May 28, 2026 16:39
adamfidel and others added 3 commits May 28, 2026 12:07
…d commands

Move EnqueueHostTaskData and NativeHostTask from the anonymous namespace in
commands.cpp into host_task.hpp so both the native recording path in handler.cpp
and the scheduler path in commands.cpp use the same type and callback rather
than duplicating the pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@adamfidel adamfidel marked this pull request as ready for review May 28, 2026 19:10
@adamfidel adamfidel requested review from a team as code owners May 28, 2026 19:10
@adamfidel adamfidel requested review from mmichel11 and slawekptak May 28, 2026 19:10
Comment thread sycl/source/handler.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if the enqueue host task result is a success code? If it fails for any reason, then I think this could cause a leak because the host task will never free the allocation.

Comment thread sycl/source/detail/host_task.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the allocation gets destroyed after the user's host task completes. What happens with multiple replays of the graph? Do we need some mechanism to keep HostTaskData alive until the graph is destroyed?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this API work with command buffer partitions?

If we want it to not partition the graph, we need to add urCommandBufferAppendHostTaskExp + SYCL logic, but I'm curious if it gets handled correctly with a graph partition.

Comment thread sycl/source/handler.cpp Outdated
auto CallbackData = std::make_unique<detail::EnqueueHostTaskData>(
detail::HandlerAccess::getHostTaskFunc(*HT->MHostTask));
// Store callback in the graph so it is available during replays
auto GraphImpl = Queue->getNativeRecordingGraph();
Copy link
Copy Markdown
Contributor

@mmichel11 mmichel11 May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to consider the case where the queue is being recorded due to a fork in the graph. In this case, MNativeRecordingGraph should be empty as it was not the one begin_recording was called on.

I think caching the native graph is still viable, but we might need to combine it with UR get graph and the ur - > command_graph map we keep in the context. We would also need some way to clear these graph handles on end_recording for these non primary queues.

@adamfidel adamfidel requested a review from a team as a code owner May 29, 2026 22:40
@adamfidel adamfidel marked this pull request as draft May 29, 2026 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants