lhez
Collaborator

@lhez lhez commented Aug 25, 2025

Currently, supports_op simply returns true for rms_norm, while the actual kernel requires ne00 to be a multiple of 4 and the input to be contiguous. This PR adds these conditions to supports_op. This also fixes the crash in test-backend-ops for RMS_NORM_MUL_ADD.
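For readers unfamiliar with the check being described, here is a minimal sketch of what such a guard could look like. It is not the code from this PR: `sketch_tensor`, `sketch_is_contiguous`, and `sketch_supports_rms_norm` are hypothetical names, the struct is a simplified stand-in for a ggml tensor, and the contiguity test assumes a plain densely packed layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tensor descriptor (NOT the real ggml_tensor). */
typedef struct {
    int64_t ne[4];     /* number of elements in each dimension */
    size_t  nb[4];     /* stride in bytes for each dimension */
    size_t  type_size; /* size of one element in bytes */
} sketch_tensor;

/* Assumption: "contiguous" means densely packed, i.e. each stride equals
 * the element size times the product of all lower dimensions. */
static bool sketch_is_contiguous(const sketch_tensor *t) {
    size_t expected = t->type_size;
    for (int i = 0; i < 4; ++i) {
        if (t->nb[i] != expected) {
            return false;
        }
        expected *= (size_t) t->ne[i];
    }
    return true;
}

/* Report support only when the kernel's stated requirements hold:
 * ne00 (first dimension of the source) is a multiple of 4 and the
 * source is contiguous. Otherwise the caller can fall back to
 * another backend instead of launching an unsuitable kernel. */
static bool sketch_supports_rms_norm(const sketch_tensor *src0) {
    return src0->ne[0] % 4 == 0 && sketch_is_contiguous(src0);
}
```

Under these assumptions, a 64 x 8 float tensor with packed strides would be reported as supported, while a 62 x 8 tensor, or a 64 x 8 tensor with padded rows, would not.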

@lhez lhez requested a review from max-krasnyansky August 25, 2025 07:09
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (Issues specific to the OpenCL backend) labels Aug 25, 2025
@lhez lhez marked this pull request as ready for review August 25, 2025 18:22
@max-krasnyansky max-krasnyansky merged commit f7207b0 into ggml-org:master Aug 25, 2025
48 checks passed
Minh141120 pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 26, 2025
Minh141120 pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 27, 2025