[NATIVECPU] Initial implementation of events on Native CPU #2153
CTS changes LGTM
Some potential performance issues (see comments) which can probably be addressed in a subsequent PR. Otherwise LGTM
    using args_index_t = std::vector<void *>;
    args_index_t Indices;
    std::vector<size_t> ParamSizes;
    std::vector<bool> OwnsMem;
We could probably put these into one vector for performance
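One way the single-vector layout could look is sketched below. `ArgEntry`, `ArgList`, and `set` are hypothetical names chosen for illustration and are not taken from the PR; the point is only that each argument's pointer, size, and ownership flag sit next to each other in one allocation instead of three.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record holding everything known about one kernel argument.
// It replaces one slot in each of Indices, ParamSizes and OwnsMem.
struct ArgEntry {
  void *Ptr;        // pointer to the argument value
  std::size_t Size; // size of the argument in bytes
  bool OwnsMem;     // whether the argument list owns the pointed-to memory
};

// Hypothetical container: a single vector instead of three parallel ones.
struct ArgList {
  std::vector<ArgEntry> Entries;

  // Store an argument at Index, growing the vector if needed.
  // Slots created by resize are value-initialized (null, 0, false).
  void set(std::size_t Index, void *Ptr, std::size_t Size, bool Owns) {
    if (Index >= Entries.size())
      Entries.resize(Index + 1);
    Entries[Index] = ArgEntry{Ptr, Size, Owns};
  }
};
```

Besides locality, this avoids the bit-packed `std::vector<bool>` specialization, since the flag becomes an ordinary `bool` member.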
    new ur_event_handle_t_(hQueue, UR_COMMAND_MEM_BUFFER_MAP);
    event->tick_start();
    *ppRetMap = hBuffer->_mem + offset;
    event->tick_end();
This seems to just time fast pointer arithmetic (because it's just enqueue) which is probably less than the overhead of std::lock_guard in tick_end. Just creating the event like in urEnqueueMemUnmap below is probably fine.
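A rough sketch of that suggestion, under stated assumptions: `SketchEvent`, `MapResult`, and `mapBuffer` are hypothetical stand-ins, not the adapter's real types, and the only detail borrowed from the comment is that `tick_end` takes a `std::lock_guard`. The map path creates the event and returns the offset pointer without calling either tick function.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the event type; not the PR's actual class.
class SketchEvent {
public:
  void tick_start() { Start = now(); }

  // Locks before writing, mirroring the std::lock_guard the comment mentions.
  void tick_end() {
    std::lock_guard<std::mutex> Lock(Mutex);
    End = now();
    Timed = true;
  }

  bool timed() const {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Timed;
  }

private:
  static std::uint64_t now() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch())
            .count());
  }

  mutable std::mutex Mutex;
  std::uint64_t Start = 0;
  std::uint64_t End = 0;
  bool Timed = false;
};

// Hypothetical result of a map call: the mapped pointer plus its event.
struct MapResult {
  void *Ptr;
  std::unique_ptr<SketchEvent> Event;
};

// Create the event but skip tick_start/tick_end: the work here is a single
// pointer addition, so there is nothing worth timing or locking for.
MapResult mapBuffer(char *Base, std::size_t Offset) {
  MapResult R;
  R.Event.reset(new SketchEvent());
  R.Ptr = Base + Offset;
  return R;
}
```

The caller still receives a valid event to wait on or query; it simply carries no timestamps for this command.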
Implements ur_event_handle_t_, allowing for recording time stamps for events and asynchronous kernel execution. Fixes bugs in handling kernel arguments that have been exposed by the asynchronous execution.
intel/llvm PR: intel/llvm#15564