Add wait to MKL calls #518
|
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/deps/generate_interfaces.jl b/deps/generate_interfaces.jl
index beb87ea..9e7bd3f 100644
--- a/deps/generate_interfaces.jl
+++ b/deps/generate_interfaces.jl
@@ -447,17 +447,17 @@ function generate_cpp(library::String, filename::Vector{String}, output::String;
write(oneapi_cpp, "extern \"C\" $header {\n")
if template
type = version_types[version]
- !occursin("scratchpad_size", name) && write(oneapi_cpp, " auto status = oneapi::mkl::$library::$variant$name<$type>($parameters, {});\n device_queue->val.wait_and_throw();\n")
- occursin("scratchpad_size", name) && write(oneapi_cpp, " int64_t scratchpad_size = oneapi::mkl::$library::$variant$name<$type>($parameters);\n device_queue->val.wait_and_throw();\n")
- # !occursin("scratchpad_size", name) && write(oneapi_cpp, " auto status = oneapi::mkl::$library::$variant$name<$type>($parameters, {});\n")
- # occursin("scratchpad_size", name) && write(oneapi_cpp, " int64_t scratchpad_size = oneapi::mkl::$library::$variant$name<$type>($parameters);\n")
+ !occursin("scratchpad_size", name) && write(oneapi_cpp, " auto status = oneapi::mkl::$library::$variant$name<$type>($parameters, {});\n device_queue->val.wait_and_throw();\n")
+ occursin("scratchpad_size", name) && write(oneapi_cpp, " int64_t scratchpad_size = oneapi::mkl::$library::$variant$name<$type>($parameters);\n device_queue->val.wait_and_throw();\n")
+ # !occursin("scratchpad_size", name) && write(oneapi_cpp, " auto status = oneapi::mkl::$library::$variant$name<$type>($parameters, {});\n")
+ # occursin("scratchpad_size", name) && write(oneapi_cpp, " int64_t scratchpad_size = oneapi::mkl::$library::$variant$name<$type>($parameters);\n")
else
if !(name ∈ void_output)
write(oneapi_cpp, " auto status = oneapi::mkl::$library::$variant$name($parameters, {});\n")
- occursin("device_queue", parameters) && write(oneapi_cpp, " device_queue->val.wait_and_throw();\n")
+ occursin("device_queue", parameters) && write(oneapi_cpp, " device_queue->val.wait_and_throw();\n")
else
write(oneapi_cpp, " oneapi::mkl::$library::$variant$name($parameters);\n")
- occursin("device_queue", parameters) && write(oneapi_cpp, " device_queue->val.wait_and_throw();\n")
+ occursin("device_queue", parameters) && write(oneapi_cpp, " device_queue->val.wait_and_throw();\n")
end
end
if occursin("scratchpad_size", name) |
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #518 +/- ##
=======================================
Coverage 81.73% 81.73%
=======================================
Files 44 44
Lines 2540 2540
=======================================
Hits 2076 2076
Misses 464 464
|
Can you elaborate? Making these calls synchronous is not something we want; synchronization should already be handled on the Julia side when the destination is accessed. |
|
Oh, OK. We observed wrong values in Krylov.jl when multiple MKL calls are issued in series. Judging from the Intel MKL examples, the MKL calls appear to be asynchronous and not ordered in the SYCL queue, so you need to synchronize on the returned event. However, that event is not returned to Julia, so it can't be passed to the following MKL calls. Do you think we should do that instead? |
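To make the ordering problem concrete, here is a minimal sketch of event-based chaining. It does not use SYCL or oneMKL: `std::shared_future<void>` stands in for the returned event, and `Event`, `async_axpy`, `async_nrm2`, and `chained_norm` are hypothetical names introduced only for this illustration. The assumption being modeled is that each call returns immediately with an event and accepts a list of events it must wait for, which is what the trailing `{}` in the generated calls appears to be.

```cpp
#include <cmath>
#include <cstddef>
#include <future>
#include <vector>

// Stand-in for an event: a completion handle for an asynchronous task.
// (Hypothetical alias; this is not sycl::event.)
using Event = std::shared_future<void>;

// Asynchronous y += alpha * x that starts only after every event in `deps`.
Event async_axpy(double alpha, const std::vector<double>& x,
                 std::vector<double>& y, std::vector<Event> deps = {}) {
    return std::async(std::launch::async, [alpha, &x, &y, deps] {
        for (const Event& e : deps) e.wait();  // honor the dependencies
        for (std::size_t i = 0; i < x.size(); ++i) y[i] += alpha * x[i];
    }).share();
}

// Asynchronous Euclidean norm of x, written to `result` after `deps` complete.
Event async_nrm2(const std::vector<double>& x, double& result,
                 std::vector<Event> deps = {}) {
    return std::async(std::launch::async, [&x, &result, deps] {
        for (const Event& e : deps) e.wait();
        double sum = 0.0;
        for (double v : x) sum += v * v;
        result = std::sqrt(sum);
    }).share();
}

// Chain the two calls: the norm depends on the axpy through its event.
double chained_norm() {
    std::vector<double> x{3.0, 4.0};
    std::vector<double> y{0.0, 0.0};
    double result = 0.0;
    Event axpy_done = async_axpy(1.0, x, y);               // y becomes {3, 4}
    Event norm_done = async_nrm2(y, result, {axpy_done});  // sees the updated y
    norm_done.wait();  // single synchronization point at the end
    return result;
}
```

If `{axpy_done}` were left out, the norm could run before or during the update of `y`, which would match the kind of wrong values described above. Passing the event keeps both calls asynchronous and needs only one wait at the end.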
|
oneAPI.jl/lib/level-zero/cmdqueue.jl Line 13 in 48d1750
flags=0 is out-of-order. Do you think we should change this to in-order? @maleadt
|
It does seem enticing, but the "To be used only when creating immediate command lists" doesn't apply here, so I'm not sure. Maybe we should ping some people at Intel, or open some issue upstream to figure out what's the best way to emulate CUDA's stream-ordered operations without having to use events everywhere. cc @kballeda |
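As a rough illustration of what "stream-ordered without events everywhere" could mean, the sketch below keeps the event handling inside a small queue wrapper, so callers never see it. It uses only the C++ standard library; `OrderedQueue` and `run_in_submission_order` are hypothetical names, not SYCL or Level Zero types, and the sketch says nothing about how an in-order queue is actually implemented by the driver.

```cpp
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical wrapper: every submitted task implicitly depends on the task
// submitted before it, so work runs in submission order.
class OrderedQueue {
public:
    // Returns immediately; the task starts once its predecessor has finished.
    void submit(std::function<void()> task) {
        std::shared_future<void> prev = last_;
        last_ = std::async(std::launch::async,
                           [prev, task = std::move(task)] {
            if (prev.valid()) prev.wait();  // implicit dependency
            task();
        }).share();
    }

    // Block until everything submitted so far has completed.
    void wait() {
        if (last_.valid()) last_.wait();
    }

private:
    std::shared_future<void> last_;  // completion handle of the latest task
};

// Submit four tasks and record the order in which they actually ran.
std::vector<int> run_in_submission_order() {
    std::vector<int> log;
    OrderedQueue q;
    for (int i = 0; i < 4; ++i)
        q.submit([&log, i] { log.push_back(i); });
    q.wait();  // the only explicit synchronization the caller performs
    return log;
}
```

The trade-off in this design is that independent tasks can no longer overlap, since every submission is serialized behind the previous one. That is the cost of not tracking events at each call site.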
We observed norm() = 0 on oneAPI arrays when it shouldn't be. Upon review, it seems a wait is needed after each MKL call. The MKL calls also support event dependencies.
We still have issues with the correctness of axpy in https://github.com/JuliaSmoothOptimizers/KrylovPreconditioners.jl