@IMbackK commented Feb 12, 2025

My intuition was that if the rocBLAS path is faster on CDNA with MFMA disabled, it's likely also faster on GCN GPUs, as the two architectures are very similar.

After testing and discussion with @cb88 on Vega10 (GFX900) and Vega20 (GFX906), however, this turns out not to be the case.
Further complicating things, the intended change of code path did not take effect until d6d24cd, which at first masked the fact that it is a pessimization.
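
To make the reasoning above concrete, here is a minimal sketch of the two selection rules being compared. It is not the actual ggml code: `GpuFamily`, `prefers_rocblas_intuition` and `prefers_rocblas_measured` are hypothetical names, and treating every other family as not preferring rocBLAS is an assumption made only to keep the example small.

```cpp
// Illustrative only: these names are hypothetical and do not exist in ggml.
enum class GpuFamily { GCN, CDNA, Other };

// Original intuition: GCN is very similar to CDNA (with MFMA disabled),
// so both families are routed to the rocBLAS path.
constexpr bool prefers_rocblas_intuition(GpuFamily family) {
    return family == GpuFamily::CDNA || family == GpuFamily::GCN;
}

// After the Vega10 (GFX900) / Vega20 (GFX906) testing: rocBLAS turned out
// to be a pessimization on GCN, so only CDNA keeps the rocBLAS path.
// Assumption: families not discussed here (Other) do not prefer rocBLAS.
constexpr bool prefers_rocblas_measured(GpuFamily family) {
    return family == GpuFamily::CDNA;
}
```

The only difference between the two rules is the GCN case, which is the one the measurements contradicted.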

@github-actions bot added the "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) labels Feb 12, 2025
@IMbackK merged commit 5c4284d into ggml-org:master Feb 12, 2025
46 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
2 participants