[SYCL][UR] Log sycl-ls error messages related to adapter loading
#17490
First, needs |
Thank you for noticing this, done.
I think all of them apply still...
@intel/llvm-gatekeepers please merge
@bratpiorka I don't think an adapter missing a dependency is necessarily an error. For example, if you download a SYCL nightly build, it will have adapters for CUDA and HIP, but most users won't have both the CUDA and ROCm drivers installed, or even either of them if they're working on Intel devices. In this case they will get confusing error messages from I think we should either go back to hiding that information behind the environment variable, or only showing it for
In theory, environment variables alone should be sufficient, but in practice our UMF team hit a few issues that required debugging to determine whether the problem was in the hardware or the software, and where exactly it was. Hardly anyone knew about the environment variables that show more detail. My changes make things more user-friendly: if a user expects an adapter to work and it does not, an error message is displayed. Would you like me to revert these changes, or do you have an idea for how to balance being user-friendly against being overly informative? One way to improve this further would be to check which devices are installed on the system and load adapters only for those devices.
That would require either the adapter to be loaded (for platform discovery) or something like libhwloc. I don't know whether that's realistically doable. I think hiding this behind a
However, if a user doesn't expect an adapter to work, they are still faced with errors. That's a much more common use case. Example: I just downloaded nightly-2025-03-25 and the first thing
It could probably be better documented. For what it's worth, the CUDA/HIP release documentation explains exactly this (see here).
It does in specific cases, but I feel that in most cases it's the opposite, as it will spam users with error messages they don't care about. I do agree that it's very confusing when an adapter doesn't load because of a dependency, though.
I think printing additional information in I'd suggest making the change if it's quick and you have time, otherwise we should just revert this for now, and maybe file a ticket for the follow up work.
Would it also make sense for errors to be logged if
@pbalcer @rafbiels @npmiller @RossBrunton please look at #17651
Changes:
With this PR, there will be no changes for the user in the default scenario where some adapters exist and some are missing (a missing adapter is not considered an error). In cases where a dependency is missing, the following example message will be displayed:
Here is the message for missing symbols (a case where a dependency is not compatible with the rest of the libraries):