0cc4m (Collaborator) commented Apr 2, 2025

I don't know how I (and everyone else) missed this, considering it makes model output completely incoherent when using the new int dot shaders, but here's the fix. The cache buffer for the quant dm values was too small, so it overflowed and produced NaN results.

0cc4m requested a review from jeffbolznv on Apr 2, 2025, 15:29
jeffbolznv (Collaborator) left a comment

LGTM, I didn't try running it.

github-actions bot added the labels Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) on Apr 2, 2025
0cc4m merged commit 92e3006 into master on Apr 2, 2025
44 checks passed
0cc4m deleted the 0cc4m/vulkan-mmq-dp4a-fix branch on Apr 2, 2025, 17:12
0cc4m (Collaborator, Author) commented Apr 2, 2025

For some reason this change removed a large chunk of the shader's performance gain, and I'm not sure how that happened. It added 8 bytes of register use; maybe that crossed some occupancy limit, but that would be very weird. I hope it can be fixed.
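
To illustrate how a small register increase could cross an occupancy threshold, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the `GpuLimits` struct, the register budget of 256 per lane, the cap of 10 resident waves, and the baseline of 64 registers per thread are hypothetical and do not come from this PR or from any measured shader. It also assumes 32-bit registers, so 8 bytes correspond to two registers, and it ignores the allocation granularity that real hardware applies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-SIMD limits; real values depend on the GPU and driver.
struct GpuLimits {
    uint32_t reg_budget_per_lane; // 32-bit registers shared by all resident waves
    uint32_t max_waves;           // hardware cap on resident waves
};

// Convert a register footprint in bytes to 32-bit registers, rounding up.
uint32_t bytes_to_regs(uint32_t bytes) {
    return (bytes + 3) / 4;
}

// Simplified occupancy model: the number of resident waves is limited by
// how many times the per-thread register count fits into the budget,
// and by the hardware cap. Integer division makes this a step function.
uint32_t resident_waves(const GpuLimits& hw, uint32_t regs_per_thread) {
    assert(regs_per_thread > 0);
    return std::min(hw.max_waves, hw.reg_budget_per_lane / regs_per_thread);
}
```

Under these assumed numbers, `resident_waves({256, 10}, 64)` gives 4 waves, while adding `bytes_to_regs(8)` registers to the same shader gives 3. Because the model is a step function, a shader sitting exactly on a boundary can lose a whole wave from a two-register increase, which is one way a change this small might show up as a large performance drop.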
