jopperm commented Mar 13, 2025

jopperm self-assigned this Mar 13, 2025
jopperm requested review from a team as code owners March 13, 2025 10:10
jopperm requested a review from sergey-semenov March 13, 2025 10:10
jopperm requested a review from cperkinsintel March 13, 2025 10:11

jopperm commented Mar 13, 2025

@jinge90 I believe that the new bfloat16 device library image mechanism breaks if device binaries are removed again from the program manager. The problem boils down to m_ExportedSymbolImages containing references to destroyed images the next time an image imports the bfloat16 functions.
I've added a workaround for SYCL-RTC in this PR (always loading and cleaning up the bfloat16 device library images), but it would be great if you could look at the problem in general at some point. CC @KseniyaTikhomirova who also had an interest in ProgramManager::removeImages, IIRC.
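
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is a toy model, not the actual ProgramManager code: `ToyImage`, `ToyRegistry`, and all member names are hypothetical, and the multimap only stands in for m_ExportedSymbolImages. The point it illustrates is that the removal path has to erase the symbol-index entries that point at an image before the image is destroyed; otherwise the next import lookup finds a dangling pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device image (not the real runtime type).
struct ToyImage {
  std::string Name;
  std::vector<std::string> ExportedSymbols;
};

// Hypothetical registry that owns images and indexes them by exported symbol.
class ToyRegistry {
public:
  // Takes ownership of a new image and records a non-owning pointer to it
  // for every symbol it exports.
  const ToyImage *addImage(std::string Name, std::vector<std::string> Exports) {
    auto Img = std::make_unique<ToyImage>(
        ToyImage{std::move(Name), std::move(Exports)});
    const ToyImage *Raw = Img.get();
    for (const std::string &Sym : Raw->ExportedSymbols)
      ExportedSymbolImages.emplace(Sym, Raw);
    Images.push_back(std::move(Img));
    return Raw;
  }

  // Erases every index entry that points at Img, then destroys the image.
  // Skipping the first loop would leave dangling pointers in the index.
  void removeImage(const ToyImage *Img) {
    for (auto It = ExportedSymbolImages.begin();
         It != ExportedSymbolImages.end();) {
      if (It->second == Img)
        It = ExportedSymbolImages.erase(It);
      else
        ++It;
    }
    Images.erase(std::remove_if(Images.begin(), Images.end(),
                                [Img](const std::unique_ptr<ToyImage> &P) {
                                  return P.get() == Img;
                                }),
                 Images.end());
  }

  // Returns an image exporting Sym, or nullptr if none is registered.
  const ToyImage *findExporter(const std::string &Sym) const {
    auto It = ExportedSymbolImages.find(Sym);
    return It == ExportedSymbolImages.end() ? nullptr : It->second;
  }

  std::size_t indexSize() const { return ExportedSymbolImages.size(); }

private:
  std::vector<std::unique_ptr<ToyImage>> Images;
  // Non-owning index; plays the role of m_ExportedSymbolImages in this sketch.
  std::unordered_multimap<std::string, const ToyImage *> ExportedSymbolImages;
};
```

In this model, calling `removeImage` on the image that exports a symbol makes a later `findExporter` for that symbol return `nullptr` instead of a pointer to freed memory. Whether the real fix belongs in the add path or the remove path is a separate question that this sketch does not settle.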

Signed-off-by: Julian Oppermann <[email protected]>

jinge90 commented Mar 14, 2025

> @jinge90 I believe that the new bfloat16 device library image mechanism breaks if device binaries are removed again from the program manager. The problem boils down to m_ExportedSymbolImages containing references to destroyed images the next time an image imports the bfloat16 functions. I've added a workaround for SYCL-RTC in this PR (always loading and cleaning up the bfloat16 device library images), but it would be great if you could look at the problem in general at some point. CC @KseniyaTikhomirova who also had an interest in ProgramManager::removeImages, IIRC.

Hi @jopperm,
I reproduced a crash using dlopen/dlclose code, which triggers removeImages. It looks like a bug in addImage. Here is a quick fix for the crash on my side: #17461. Could you try it on your side to see if the issue is gone?
Thanks very much for pointing this out!

sommerlukas merged commit 77e110d into intel:sycl Mar 14, 2025
23 checks passed