Conversation

@giuseppe (Contributor)

The previous implementation was blocking the GPU for extended periods, causing the i915 driver to reset the context due to the hangcheck protection.

[32628.443070] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in llama-server [194114]
[32628.443091] i915 0000:00:02.0: [drm] llama-server[194114] context reset due to GPU hang
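
For context, here is a minimal sketch of the idea behind the fix, assuming a host-visible allocation on UMA hardware. The names and signatures below are illustrative, not the actual ggml-vulkan code: instead of recording a GPU fill and blocking until it completes, the backend can map the memory and fill it on the CPU, so there is no long-running GPU submission for the i915 hangcheck to kill.

```cpp
// Illustrative sketch only, not the actual ggml-vulkan code.
// On UMA hardware the buffer memory can be host-visible, so the
// fill can run on the CPU through a mapped pointer instead of a
// GPU command that the driver has to wait on.
#include <cstring>
#include <vulkan/vulkan.hpp>

void buffer_memset(vk::Device device, vk::DeviceMemory memory,
                   vk::DeviceSize offset, int value, vk::DeviceSize size,
                   bool host_visible_uma) {
    if (host_visible_uma) {
        // Direct host access: map, fill bytes, unmap. No GPU
        // submission, so nothing for the i915 hangcheck to reset.
        void * ptr = device.mapMemory(memory, offset, size);
        memset(ptr, value, size);
        device.unmapMemory(memory);
        return;
    }
    // Non-UMA path: record vkCmdFillBuffer into a command buffer,
    // submit it and wait for completion (omitted here).
}
```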

@giuseppe requested a review from @0cc4m as a code owner on September 17, 2025 21:54
@github-actions bot added the labels Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) on Sep 17, 2025
@0cc4m (Collaborator) left a comment

I forgot to implement this for the memset function, thank you. If it would help to be able to do this asynchronously as well, we could instead implement a "deferred_memset" (see deferred_memcpy for reference), but this would require more changes.

@giuseppe (Contributor, Author)

> If it would help to be able to do this asynchronously as well, we could instead implement a "deferred_memset" (see deferred_memcpy for reference), but this would require more changes.

I've added a new commit implementing deferred_memset, similar to what is done for deferred_memcpy.
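
For reference, here is a hedged sketch of the deferred pattern, modeled on the deferred_memcpy approach mentioned above. The struct and function names are hypothetical, not the exact ones in ggml-vulkan.cpp: if a staging batch is active, the memset parameters are recorded and replayed when the batch is flushed; otherwise the memset runs immediately.

```cpp
// Hedged sketch of a deferred memset, modeled on deferred_memcpy;
// the names below are hypothetical, not the exact ggml-vulkan ones.
#include <cstdint>
#include <cstring>
#include <vector>

struct staging_memset {
    void *   dst;
    uint32_t val;
    size_t   n;
};

// Record the memset for later if a batch is active, else run it now.
static void deferred_memset(void * dst, uint32_t val, size_t size,
                            std::vector<staging_memset> * memsets) {
    if (memsets != nullptr) {
        memsets->push_back({dst, val, size});
    } else {
        memset(dst, val, size);
    }
}

// Replay the recorded memsets once the staging batch is flushed.
static void flush_memsets(std::vector<staging_memset> & memsets) {
    for (const auto & m : memsets) {
        memset(m.dst, m.val, m.n);
    }
    memsets.clear();
}
```

The nullptr fallback mirrors how a deferred copy can degrade to an immediate operation when no staging batch is active.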

The previous implementation was blocking the GPU for extended periods,
causing the i915 driver to reset the context due to the hangcheck
protection.

[32628.443070] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in llama-server [194114]
[32628.443091] i915 0000:00:02.0: [drm] llama-server[194114] context reset due to GPU hang

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe force-pushed the access-directly-mem-with-uma branch from 34ba782 to 87d3cd0 on September 18, 2025 14:11
@0cc4m (Collaborator) left a comment

Thank you!

@0cc4m merged commit 1eeb523 into ggml-org:master on Sep 21, 2025
54 of 55 checks passed
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
…#16059)

* vulkan: optimize UMA buffer operations and fix driver hangs

The previous implementation was blocking the GPU for extended periods,
causing the i915 driver to reset the context due to the hangcheck
protection.

[32628.443070] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in llama-server [194114]
[32628.443091] i915 0000:00:02.0: [drm] llama-server[194114] context reset due to GPU hang

* vulkan: implement deferred_memset on UMA

---------

Signed-off-by: Giuseppe Scrivano <[email protected]>
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025